Machine learning is a family of techniques that use data and experience to build an interpretation of patterns in the world, and then use that interpretation to answer questions about the world.
Machine learning applications start as simple questions that can be phrased in terms of a probability, like “What are the chances that this image contains a dog?”, “Which move gives me the best chance of winning this game of Go?”, or “What does a random new celebrity’s face look like?” These questions can be expressed mathematically using the language of statistics. These statements about probability are combined with data to generate a model, which is a mathematical interpretation of the patterns in the data. To answer the question of whether an image contains a dog, a machine learning algorithm might look at a million pictures of dogs and a million pictures without dogs and give back a model that can estimate the probability that an image contains a dog.
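To make the idea of a model that estimates a probability concrete, here is a minimal sketch in Python. It assumes each image has already been reduced to two made-up numeric features, and it uses logistic regression trained by gradient descent as one simple example of a learning algorithm. The feature values, function names, and training settings are illustrative assumptions, not a description of how any real image classifier works.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit weights and a bias with batch gradient descent on the log loss."""
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the log loss with respect to the score
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_proba(model, x):
    """Estimated probability that the example described by x is a dog."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Hypothetical toy data: two invented features per "image".
dog_examples = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.95], [0.85, 0.7]]
non_dog_examples = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.15], [0.15, 0.3]]
features = dog_examples + non_dog_examples
labels = [1] * len(dog_examples) + [0] * len(non_dog_examples)

model = train_logistic(features, labels)
p_dog = predict_proba(model, [0.8, 0.85])  # resembles the dog examples
p_not = predict_proba(model, [0.2, 0.2])   # resembles the non-dog examples
print(f"P(dog) for a dog-like example: {p_dog:.2f}")
print(f"P(dog) for a non-dog-like example: {p_not:.2f}")
```

The data goes in, and what comes back is a model: a small set of numbers that turns any new example into a probability. A real system would learn from far more examples and from the image pixels themselves rather than two hand-picked features.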
A human being can write a program to control how a website works or how a bank transaction operates. However, a person would find it impossible to write down explicit rules for more complex tasks, like how a car should drive itself. Machine learning enables practitioners to solve problems where explicit rules are unworkable by leveraging data and experience about the problem.
Practitioners in the field of machine learning continuously work on improving the state of the art and finding new applications. In a 2019 article in Nature Medicine, a group of researchers found that a type of machine learning called deep learning could be used to detect lung cancer at a success rate similar to that of human medical professionals, and in some cases with fewer false positives and false negatives. Lung cancer resulted in an estimated 160,000 deaths in the United States in 2018. This automated approach to lung cancer detection could increase the adoption and accuracy of screenings, potentially saving lives.
As with any other tool, machine learning has its limitations. Machine learning algorithms require data, sometimes in large amounts, to be able to operate. Bias in this data may also show up in the results. For example, a machine learning algorithm tasked with predicting whether an image contains a dog might misclassify a picture of a chihuahua as a non-dog if the data used to build the model contained only larger breeds.
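The chihuahua example can be sketched in a few lines of Python. This is an illustration under stated assumptions: each image is reduced to two invented features (relative size and snout length, both scaled to the range 0 to 1), and the classifier is a simple nearest-neighbor rule chosen for brevity. All numbers and names are hypothetical.

```python
import math


def nearest_neighbor_label(training_set, query):
    """Return the label of the training example closest to the query."""
    closest = min(training_set, key=lambda example: math.dist(example[0], query))
    return closest[1]


# Hypothetical features: (relative_size, snout_length).
large_dogs = [((0.9, 0.8), "dog"), ((0.8, 0.7), "dog"), ((0.85, 0.9), "dog")]
non_dogs = [((0.2, 0.2), "not dog"), ((0.25, 0.15), "not dog"), ((0.3, 0.25), "not dog")]
small_dogs = [((0.2, 0.55), "dog"), ((0.15, 0.6), "dog")]

chihuahua = (0.15, 0.5)  # small body, moderately long snout

# Biased training data: the only dogs the model has seen are large breeds.
biased_prediction = nearest_neighbor_label(large_dogs + non_dogs, chihuahua)

# More representative training data: small breeds are included as well.
balanced_prediction = nearest_neighbor_label(large_dogs + small_dogs + non_dogs, chihuahua)

print("Trained on large breeds only:", biased_prediction)
print("Trained with small breeds too:", balanced_prediction)
```

The algorithm is identical in both runs. Only the training data changes, and with it the answer for the chihuahua.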