Machine learning – why it’s important

For most of the information age, if you wanted to program a computer to solve a problem, you had only two approaches available – the lookup method and the heuristic method. Now the rise of accessible machine learning algorithms opens up a powerful new alternative, one we can use to solve problems we were previously unable to tackle.

To illustrate the power of machine learning, we will use this simple – but deceptively difficult – example: count the total number of syllables in a sentence.

One way of solving problems is to simply look up the answer from a pre-existing list. It seems we could find the number of syllables in the sentence by looking up each word in a dictionary to find its syllable count, then summing the counts. This intuitive approach breaks down quickly in practice. Real sentences contain a surprising number of words that do not appear in common dictionaries, such as foreign words, names, misspellings, pop culture references, and slang. Dictionaries often do not contain all variations of a word, and do not contain syllable counts for all words. If even one word in the sentence is missing from the dictionary, we cannot answer the question using the lookup method. For that reason, this method is very limited and cannot be applied to many problems.
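As a minimal sketch of the lookup method in Python, assume a tiny hand-made table of syllable counts. The table `SYLLABLE_TABLE` and the function name are hypothetical, invented here for illustration. Note how a single unknown word forces the function to give up:

```python
import re

# Hypothetical, tiny syllable dictionary (an assumption for this sketch).
# A real dictionary would hold many thousands of entries and still have gaps.
SYLLABLE_TABLE = {
    "the": 1, "cat": 1, "sat": 1, "on": 1,
    "a": 1, "table": 2, "computer": 3,
}

def count_syllables_lookup(sentence, table=SYLLABLE_TABLE):
    """Sum syllable counts from the table; return None if any word is missing."""
    words = re.findall(r"[a-z']+", sentence.lower())
    total = 0
    for word in words:
        if word not in table:
            return None  # one unknown word and the whole lookup fails
        total += table[word]
    return total
```

With this table, `count_syllables_lookup("The cat sat on a table")` returns 7, while `count_syllables_lookup("The cat googled a table")` returns `None` because "googled" is not in the table.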

Instead, most software uses heuristics, or “clever algorithms.” This method amounts to writing down, in a programming language, an “algorithm” or a set of instructions for the computer to follow. Heuristics are powerful and give great results – as long as you understand the problem well enough to write the algorithm. Unfortunately, they are brittle – the computer will do exactly what the instructions say. If your problem has many special cases, you must keep adding special logic to the algorithm to handle them. As the problem gets more subtle – such as examining natural language – heuristics break down and cannot be improved beyond a certain threshold. Past that point, the heuristic becomes snarled in a mess of contradictory special cases, and fixing one special case breaks others.

A syllable counting heuristic using common pronunciation rules for English spellings will get the syllable count correct for most sentences, but not all. Unlike the lookup method, it will at least always return an answer – but not always the right answer.
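To make this concrete, here is a rough sketch of such a heuristic in Python. The rules are assumptions chosen for illustration (count groups of vowels, then subtract a trailing silent "e"), not a complete set of English pronunciation rules, and the function names are hypothetical:

```python
import re

def count_syllables_heuristic(word):
    """Estimate syllables: one per vowel group, minus a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Special case: a final "e" is usually silent ("make"),
    # except in a final "le" ("table"), where it forms a syllable.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)  # always return an answer of at least one syllable

def count_sentence_heuristic(sentence):
    """Sum the per-word estimates over every word in the sentence."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(count_syllables_heuristic(w) for w in words)
```

This sketch handles "make" (1), "table" (2), and "computer" (3) correctly, but it returns 2 for "recipe" (which has three syllables) and 2 for "whale" (which has one). Patching either word requires another special case, which is exactly the brittleness described above.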

Machine learning takes a completely different approach. Instead of writing an algorithm, the programmer chooses a machine learning model and presents it with a set of training data. The model adjusts itself to obtain the desired results, based on a feedback mechanism. A good model can pick up the implicit rules in the data – even if they are complex, and even if we do not understand the problem well enough to explicitly write down the rules. As long as the feedback mechanism is solid, the model can keep adjusting itself, learning new variations of the data that did not exist when the system was originally trained. If we built a machine learning model that performed reasonably well at counting syllables in real sentences, and we continued to periodically give it feedback, we would expect it to slowly get better over time – even as new words entered the language from different sources.
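The sketch below illustrates the idea with one of the simplest possible models: a linear model trained with the least-mean-squares update rule. Everything in it is an assumption made for illustration, including the three hand-picked features, the learning rate, the number of passes, and the ten training words. A realistic system would use far more data and a richer model. The point is only that the weights are never written by hand; they are learned from examples through feedback.

```python
import re

def features(word):
    """Describe a word as numbers: bias, vowel groups, likely silent final 'e'."""
    word = word.lower()
    vowel_groups = len(re.findall(r"[aeiouy]+", word))
    silent_e = word.endswith("e") and not word.endswith("le") and vowel_groups > 1
    return [1.0, float(vowel_groups), 1.0 if silent_e else 0.0]

def predict(weights, word):
    """Predicted syllable count: weighted sum of the word's features."""
    return sum(w * x for w, x in zip(weights, features(word)))

def learn_from_feedback(weights, word, true_count, rate=0.05):
    """Feedback step: nudge each weight in proportion to the prediction error."""
    error = true_count - predict(weights, word)
    return [w + rate * error * x for w, x in zip(weights, features(word))]

def train(examples, epochs=500):
    """Start from all-zero weights and repeatedly apply feedback on the examples."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for word, count in examples:
            weights = learn_from_feedback(weights, word, count)
    return weights

# Hypothetical training data: (word, correct syllable count).
TRAINING_DATA = [
    ("make", 1), ("table", 2), ("computer", 3), ("cat", 1), ("the", 1),
    ("banana", 3), ("like", 1), ("home", 1), ("little", 2), ("window", 2),
]

weights = train(TRAINING_DATA)
print(round(predict(weights, "pancake")))  # a word the model never saw
```

Nothing in the code states that a final "e" is usually silent; the model infers a negative weight for that feature from the examples. Improving it later means calling `learn_from_feedback` with new words and their correct counts, not rewriting the logic.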

Heuristics and machine learning, then, have opposite characteristics. A heuristic quickly provides very good results – but as the data volume increases, odd special cases are uncovered and its quality often plateaus. Improving the heuristic requires modifying its code, which gets progressively more complicated and expensive. Past a certain point, improvement may not be possible. A machine learning model, by comparison, may require more initial effort to create and train, but it can improve its own performance by learning. The improvement is driven by seeing more data, not by redesigning the algorithm. Eventually the machine learning model can learn to handle rules so subtle that we cannot articulate them explicitly. This is a fundamentally different capability than we had available with lookups or heuristics, and it allows us to produce new kinds of software solutions that we previously could not get to work.

Machine learning is not a new approach, but its use is beginning to explode. This explosion is driven by a confluence of factors – improved training methods, increasing computing power, huge “big data” sets that expose the flaws in our heuristics, and improved usability of machine learning toolkits.

The true value of machine learning is that it opens up an entirely new technique for solving problems that have proven intractable to lookup or heuristic-based approaches. We can now re-examine challenges that previously seemed out of reach. Expect to see the rise of a new generation of products that take advantage of this powerful new approach.
