Tuesday, October 25, 2016

The Cult of the Algorithm: Not So Fast, Folks

Sometimes I feel like Emily Litella in the old Saturday Night Live skit, huffing and puffing in offense while everyone wonders what I’m talking about.  That’s particularly true in the case of new uses of the word “algorithm.”  I find this in an NPR interview with Cathy O’Neil, author of “Weapons of Math Destruction”, in which part of the conversation runs “We have these algorithms … we don’t know what they are under the hood … They don’t say, oh, I wonder why this algorithm is excluding women”.  I find this in the Fall 2016 Sloan Management Review, where one commenter says “developer ‘managers’ provide feedback to the workers in the form of tweaks to their programs or algorithms … the algorithms themselves are sometimes the managers of human workers.”
As a once-upon-a-time computer scientist, I object.  I not only object, I assert that this is fuzzy thinking that leads us to ignore the elephant in the living room, namely the problems in how software models work, management, and the like, in order to focus on the gnat on the porch of how developers create that software.
But how can I possibly say that a simple misuse of one computer term can have such large effects?  Well, let’s start by understanding (iirc) what an algorithm is.

The Art of the Algorithm

As I was taught it at Cornell back in the ‘70s (and I majored in Theory of Algorithms and Computing), an algorithm is an abstraction of a particular computing task or function that allows us to identify the best (i.e., usually, the fastest) way of carrying out that task/function, on average, in the generality of cases.  The typical example of an algorithm is one for carrying out a “sort”, whether that means sorting numbers from lowest to highest or sorting words alphabetically (theoretically, they are much the same thing), or any other variant.  In order to create an algorithm, one breaks down the sort into unitary abstract computing operations (e.g., add, multiply, compare), assigns costs to each, and then specifies the steps (do this, then do this).  Usually it turns out that one operation costs more than the others, and so sorting algorithms can be compared by counting the overall number of compares as a function of n, the number of items to be sorted, as n grows from one to infinity.
Now consider a particular algorithm for sorting.  It runs like this:  Suppose I have 100 numbers to sort.  Take the first number in line, compare it to all the others, determine that it is the 23rd lowest.  Do the same for the second, third, … 100th number.  At the end, for any n, I will have an ordered, sorted list of numbers, no matter how jumbled the numbers handed to me are.
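For concreteness, here is a minimal sketch of that sort in Python; it assumes, as the description above does, that all the values are distinct.

    # The "compare each number to all the others" sort just described:
    # each number's rank (how many numbers are smaller) is its final position.
    def rank_sort(items):
        result = [None] * len(items)
        for x in items:
            rank = sum(1 for y in items if y < x)  # x is the (rank + 1)-th lowest
            result[rank] = x                       # assumes all values are distinct
        return result

    print(rank_sort([31, 4, 15, 92, 65, 3]))  # prints [3, 4, 15, 31, 65, 92]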
This is a perfectly valid algorithm.  It is also a bad algorithm.  For 100 numbers, it requires on the order of 100 squared comparisons, and for any n, on the order of n squared comparisons.  We say that this is an “order of n squared”, or O(n**2), algorithm.  But now we know what to look for, so we take a different approach.
Here it is:  We go through the list of 100 numbers from both ends, finding the maximum of the numbers seen from the low end and the minimum of the numbers seen from the high end, and stopping when both scans are dealing with the same number.  We then partition the list into two buckets, one containing the list up to that number (all of whose items are guaranteed to be less than that number), and one containing the list after that number (all of whose items are guaranteed to be greater than that number).  We repeat the process until we have reached buckets containing one number.  On average, there will be O(log n) such splits (where “log” is the logarithm to the base two), and each level of splitting performs O(n) comparisons.  So the average number of comparisons in this sorting algorithm is O(n times log n), called O(n log n) for short, which is way better than O(n**2) and explains why the algorithm is now called Quicksort.
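Again for concreteness, here is a minimal sketch of Quicksort in Python.  It uses a simpler single-pass partition around a chosen pivot rather than the two-ended scan described above, but the average O(n log n) behavior is the same.

    # Quicksort in miniature: split the list into a bucket of smaller items and a
    # bucket of larger items, then sort each bucket the same way.  On average there
    # are O(log n) levels of splitting, each performing O(n) comparisons.
    def quicksort(items):
        if len(items) <= 1:                       # a one-item bucket is already sorted
            return items
        pivot = items[len(items) // 2]
        lower = [x for x in items if x < pivot]   # guaranteed less than the pivot
        equal = [x for x in items if x == pivot]
        higher = [x for x in items if x > pivot]  # guaranteed greater than the pivot
        return quicksort(lower) + equal + quicksort(higher)

    print(quicksort([31, 4, 15, 92, 65, 3]))  # prints [3, 4, 15, 31, 65, 92]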
Notice one thing about this:  finding a good algorithm for a function or task says absolutely nothing about whether that function or task makes sense in the real world.  What does the heavy lifting in creating a useful new program is something more like a “model”, implicit in the mind of the company or person driving development, and sometimes additionally made explicit in the software that actually carries out the model.   An algorithm doesn’t say “do this”; it says, “if you want to do this, here’s the fastest way to do it.”

Algorithms and the Real World

So why do algorithms matter in the real world?  After all, any newbie programmer can write a program using the Quicksort algorithm, and there is a huge mass of algorithms available for public study in computer-science journals and the like.  The answer, I believe, lies in copyright and patent law.  Here, again, I know something of the subject, because my father was a professor of copyright law and I held some conversations with him as he grappled with how copyright law should deal with computer software, and also because in the ‘70s I did a little research into the possibility of getting a patent on one of my ideas (it was later partially realized by Thinking Machines).
To understand how copyright and patent law can make algorithms matter, imagine that you are Google, 15 or so years ago.  You have a potential competitive advantage in the programs that embody your search engine, but what you would really like is to turn that temporary competitive advantage into a more permanent one, by patenting some of the code (not to mention copyrighting it to prevent disgruntled employees from using it in their next job).  However, patent law requires that what you patent be a significant innovation.  Moreover, if someone just looks at what the program does and figures out how to mimic it with another set of search-engine programs (a process called “reverse engineering”), then that does not violate your patent.
But suppose you come up with a new algorithm?  In that case, you have a much stronger case that the program embodying that algorithm is a significant innovation (because your program is faster and therefore can usually handle many more petabytes of data or thousands more users), and the job of reverse engineering the program becomes much harder, because the new algorithm is your “secret sauce”.
That means, if you are Google, that your new algorithm becomes the biggest secret of all, the piece of code you are least likely to share with the outside world – outsiders can’t figure out what is going on.  And all the programs written using the new algorithm likewise become much more “impenetrable”, even to many of the developers writing them.  It’s not just a matter of complexity; it’s a matter of preserving some company’s critical success factor.  Meanwhile, you (Google) are seeing if this new algorithm leads to another new algorithm – and that compounds the advantage and secrecy.
Now, let me pause here to note that I really believe that much of this problem is due to the way patent and copyright law adapted to the advent of software.  In the case of patent law, the assumption used to be that patents were on physical objects, and even if it was the idea that was new, the important thing was that the inventor could offer a physical machine or tool to allow people to use the invention.  However, software is “virtual” or “meta” – it can be used to guide many sorts of machines or tools, in many situations; at its best, it is in fact a sort of “Swiss Army knife”.  Patent law has acted as if each program were physical, and therefore what mattered was the things the program did that hadn’t been done before – whereas if the idea is what matters, as it does in software, then a new algorithm or new model should be what is patentable, not “the luck of tackling a new case”.
Likewise, in copyright law, matters were set up so that composers, writers, and the companies that used them had a right to be paid for any use of material that was original – it’s plagiarism that matters.  In software, it’s extremely easy to write a piece of a program that is effectively identical to what someone else has written, and that’s a Good Thing.  By granting copyright to programs that just happened to be the first time someone had written code in that particular way, and punishing those who (even if they steal code from their employer) could very easily have written that code on their own, copyright law can fail to focus on truly original, creative work, which typically is associated with new algorithms.
[For those who care, I can give an example from my own experience.  At Computer Corp. of America, I wrote a program that incorporated an afaik new algorithm that let me take a page’s worth of form fields and turn it into a good imitation of a character-at-a-time form update.  Was that patentable?  Probably, and it should have been.  Then I wrote a development tool that allowed users to drive development by user-facing screens, program data, or the functions to be coded in the same general way – “have it your way” programming.  Was that patentable? Probably.  Should it have been?  Probably not:  the basic idea was already out there, I just happened to be the first to do it.]

It's About the Model, Folks

Now let’s take another look at the two examples I cited at the beginning of this post.  In the NPR interview, O’Neil is really complaining that she can’t get a sense of what the program actually does.  But why does she need to see inside a program or an “algorithm” to do that?  Why can’t she simply have access to an abstraction of the program that tells her what the program does in particular cases?
In point of fact, there are plenty of such tools.  They are software design tools, and they are perfectly capable of spitting out a data model that includes outputs for any given input.  So why can’t Ms. O’Neil use one of those? 
The answer, I submit, is that companies developing the software she looks at typically don’t use those design tools, explicitly or implicitly, to create programs.  A partial exception to this is the case of agile development.  Really good agile development is based on an ongoing conversation with users leading to ongoing refinement of code – and not just with execs in the developing company and execs in the company you’re selling the software to, but with the ultimate end users.  In O’Neil’s hiring example, one of the things that a good human resources department and the interviewee want to know is exactly what the criteria for hiring are, and why they are valid.  In other words, they want a model of the program that tells them what they want to know, not dense thickets of code or even of code abstractions (including algorithms).
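To make this concrete, here is a sketch of the kind of model I mean; the input classes, outputs, and reasons below are invented purely for illustration, and nothing about any real product’s code or algorithms appears in it.

    # A hypothetical, illustrative "model of the program" for a hiring screen:
    # each class of applicant input maps to the documented output and the stated
    # reason for it, with none of the underlying code or algorithms exposed.
    HIRING_SCREEN_MODEL = {
        "missing required certification":     ("reject",  "the posted role requires the certification"),
        "meets posted requirements":          ("advance", "application matches the posted job criteria"),
        "meets requirements, employment gap": ("advance", "employment gaps are not a screening criterion"),
    }

    def explain(input_class):
        """Return the documented output and reason for a given class of input."""
        return HIRING_SCREEN_MODEL.get(
            input_class, ("undocumented", "no documented behavior for this class of input"))

    print(explain("meets requirements, employment gap"))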
My other citation seems to go to the opposite extreme:  to assume that automating part of the management task using algorithms reflects best management practices automagically, as old hacker jargon would put it.  But we need to verify this, and the best way, again, is to offer a design model, in this case of the business process involved.  Why doesn’t the author realize this?  My guess is that he or she assumes that the developer will somehow look at the program or algorithm and figure this out.  And my guess is that he or she would be wrong, because often the program involves code written by another programmer, about which this programmer knows only the correct inputs to supply, and the algorithms are also often Deep Dark Secrets.
Notice how a probably wrong conception of what an algorithm is has led to attaching great importance to the algorithm involved, and little to the model embodied by the program in which the algorithm occurs.  As a result, O’Neil appears to be pointing the finger of blame at some ongoing complexity that has grown like Topsy, rather than at the company supplying the software for failing to practice good agile development.  Likewise, the other commenter’s belief in the magical power of the algorithm has led him or her to ignore the need to focus on the management-process model in order to verify the assumed benefits.  As I said in the beginning, they are focusing on the gnat of the algorithm and ignoring the elephant of the model embodied in the software.

Action Items

So here’s how such misapprehensions play out for vendors on a grand scale (the quote is from an Izabella Kaminska article excerpted in Prof. DeLong’s blog, delong.typepad.com):  “You will have heard the narrative.... Automation, algorithms and robotics... means developed countries will soon be able to reshore all production, leading to a productivity boom which leads to only one major downside: the associated loss of millions of middle class jobs as algos and robots displace not just blue collar workers but the middle management and intellectual jobs as well. Except... there’s no quantifiable evidence anything like that is happening yet.”  And why should there be?  In the real world, a new algorithm usually automates nothing (it’s the program using it that does the heavy lifting), and the average algorithm does little except give one software vendor a competitive advantage over others.
Vendors of, and customers for, this type of new software product therefore have an extra burden:  ensuring that these products deliver, as far as possible, only verifiable benefits for the ultimate end user.  This is especially true of products that can have a major negative impact on these end users, such as hiring/firing software and self-driving cars.  In these cases, it appears that there may be legal risks, as well:  A vendor defense of “It’s too complex to explain” may very well not fly when there are some relatively low-cost ways of providing the needed information to the customer or end user, and corporate IT customers are likewise probably not shielded from end user lawsuits by a “We didn’t ask” defense.
Here are some action items that have in the past shown some usefulness in similar cases:
·         Research software design tools, and if possible use them to implement a corporate IT standard of providing documentation at the “user API” level, specifying in comprehensible terms the outputs for each class of input and the reasons for them.
·         Adopt agile development practices that include consideration and documentation of the interests of the ultimate end users.
·         Create an “open user-facing API” for the top level of end-user-critical programs that allows outside developers, acting as intermediaries, to understand what’s going on, and, as a side benefit, to propose and vet extensions to these programs (a minimal sketch of such an API follows this list).  Note that in the case of business-critical algorithms, this trades a slight increase in the risk of reverse engineering for a probably larger increase in customer satisfaction and innovation speedup.
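Here is a minimal sketch of what that last item might look like; every name in it is hypothetical, and the internals are reduced to placeholder logic, because the point is what the top level exposes to intermediaries, not how the vendor implements it.

    # A hypothetical "open user-facing API" for an end-user-critical program: the
    # top-level call returns not just an outcome but the inputs it actually used
    # and documented, plain-language reasons, while the vendor's code and
    # algorithms underneath stay private.
    from dataclasses import dataclass
    from typing import Dict, List

    @dataclass
    class Decision:
        outcome: str             # e.g. "advance" or "reject"
        factors: Dict[str, str]  # the inputs the program actually considered
        reasons: List[str]       # documented reasons an outside developer can check

    def screen_applicant(application: Dict[str, str]) -> Decision:
        # Placeholder logic standing in for the vendor's real, unpublished program.
        if not application.get("certification"):
            return Decision("reject", {"certification": "missing"},
                            ["the posted role requires the certification"])
        return Decision("advance", {"certification": application["certification"]},
                        ["application matches the posted job criteria"])

    print(screen_applicant({"certification": "valid"}))

Note that nothing in this sketch discloses the secret-sauce algorithm; it only commits the vendor to stating, for each decision, what was considered and why.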
Above all, stop deifying and misusing the word “algorithm.”  It’s a good word for understanding a part of the software development process, when properly used.  When improperly used – well, you’ve seen what I think the consequences are and will be.