What is machine learning?
Machine learning is a form of artificial intelligence that is able to learn without explicit programming by a human.
For most of our history, we’ve thought that learning—the ability to adjust our behaviour based on collected information—was something only humans did. The past few decades have changed all that. We now know that animals of all kinds learn from experience, teaching, and even play. But it is not only animals that learn: there’s increasing evidence that plants do, too.
And if you’ve ever unlocked a phone with facial recognition, or interacted with a virtual assistant, you’ve experienced firsthand that machines, too, are capable of learning.
Machine learning is a form of artificial intelligence (AI) that can adapt to a wide range of inputs, including large data sets and human instruction. Some machine learning algorithms are specialised in training themselves to detect patterns; this is called deep learning. The term “machine learning” was first coined in 1959 by computer scientist Arthur Samuel, who defined it as “a computer’s ability to learn without being explicitly programmed.”
It follows, then, that machine learning algorithms are able to detect patterns and learn how to make predictions and recommendations by processing data and experiences, rather than by receiving explicit programming instruction. The algorithms also adapt in response to new data and experiences to improve over time.
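To make the idea concrete, here is a minimal Python sketch, using the scikit-learn library, of a model that infers its own rules from a handful of labelled examples rather than following rules a programmer wrote. The features, values, and labels below are invented purely for illustration.

```python
from sklearn.tree import DecisionTreeClassifier

# Each row is one example the algorithm learns from: [hours_of_use, error_count]
X_train = [[10, 0], [200, 1], [350, 4], [500, 9], [40, 0], [420, 7]]
# Labels supplied by a human: 0 = machine kept working, 1 = machine failed
y_train = [0, 0, 1, 1, 0, 1]

model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)  # the model derives its own decision rules from the data

# The trained model now makes predictions about examples it has never seen,
# without anyone writing an explicit "if error_count > N then fail" rule.
print(model.predict([[300, 5], [25, 0]]))
```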
How did machine learning evolve into generative AI?
Machine learning as a discipline was first introduced in 1959, building on formulas and hypotheses dating back to the 1930s. But it wasn’t until the late 1990s that machine learning truly flowered, as steady advances in digitisation, computing languages capable of greater nuance, and cheaper computing power and memory enabled data scientists to train machine learning models to independently learn from data sets rather than rely on rules written for them.
The broad availability of inexpensive cloud services later accelerated advances in machine learning even further.
Deep learning is a more advanced version of machine learning that is particularly adept at processing a wider range of data resources (text as well as unstructured data including images), requires even less human intervention, and can often produce more accurate results than traditional machine learning.
Deep learning uses neural networks—based on the ways neurons interact in the human brain—to ingest and process data through multiple neuron layers that can recognize increasingly complex features of the data. For example, an early neuron layer might recognize something as being in a specific shape; building on this knowledge, a later layer might be able to identify the shape as a stop sign.
Similar to machine learning, deep learning uses iteration to self-correct and to improve its prediction capabilities. Once it “learns” what a stop sign looks like, it can recognize a stop sign in a new image.
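As a rough illustration, the sketch below builds a tiny multilayer network in PyTorch and trains it with the kind of iterative self-correction described above. The layer sizes and the random stand-in data are placeholder choices, not a real stop-sign detector.

```python
import torch
import torch.nn as nn

model = nn.Sequential(              # data flows through successive neuron layers
    nn.Linear(64, 32), nn.ReLU(),   # early layers pick up simple features
    nn.Linear(32, 16), nn.ReLU(),   # later layers combine them into more complex ones
    nn.Linear(16, 1),               # final layer: a single "is this a stop sign?" score
)

inputs = torch.randn(100, 64)                   # stand-in for 100 encoded images
labels = torch.randint(0, 2, (100, 1)).float()  # stand-in labels: stop sign or not

loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(200):             # iteration: predict, measure the error, self-correct
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), labels)
    loss.backward()                 # work out how each weight contributed to the error
    optimizer.step()                # nudge the weights to reduce it next time
```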
This technological advancement was foundational to the AI tools emerging today. ChatGPT, released in late 2022, made AI visible—and accessible—to the general public for the first time. ChatGPT and other large language models like it are built on deep learning architectures called transformer networks and trained to generate content in response to prompts.
Transformer networks allow generative AI (gen AI) tools to weigh different parts of the input sequence differently when making predictions. Transformer networks, comprising encoder and decoder layers, allow gen AI models to learn relationships and dependencies between words in a more flexible way compared with traditional machine and deep learning models.
That’s because transformer networks are trained on huge swaths of the internet (for example, all traffic footage ever recorded and uploaded) instead of a specific subset of data (certain images of a stop sign, for instance). Foundation models trained on transformer network architecture—like OpenAI’s ChatGPT or Google’s BERT—are able to transfer what they’ve learned from a specific task to a more generalised set of tasks, including generating content.
At this point, you could ask a model to create a video of a car going through a stop sign.
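For readers curious about the mechanics, here is a bare-bones NumPy sketch of the attention calculation that lets a transformer weigh different parts of an input sequence differently. The vector sizes and random “word” vectors are placeholders.

```python
import numpy as np

def self_attention(Q, K, V):
    # How relevant is each position in the sequence to every other position?
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # Softmax turns the scores into weights that sum to 1 across the sequence
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output position is a weighted blend of the whole sequence
    return weights @ V, weights

sequence = np.random.randn(5, 8)   # stand-in for a 5-"word" input, each word an 8-dimensional vector
output, weights = self_attention(sequence, sequence, sequence)
print(weights.round(2))            # row i: how much word i attends to every other word
```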
Foundation models can create content, but they don’t know the difference between right and wrong, or even what is and isn’t socially acceptable. When ChatGPT was first created, it required a great deal of human input to learn.
OpenAI employed a large number of human workers all over the world to help hone the technology, cleaning and labelling data sets and reviewing and labelling toxic content, then flagging it for removal. This human input is a large part of what has made ChatGPT so revolutionary.
What kinds of neural networks are used in deep learning?
- Feed-forward neural network. In this simple neural network, first proposed in 1958, information moves in only one direction: forward from the model’s input layer to its output layer, without ever traveling backward to be reanalysed by the model. That means you can feed, or input, data into the model, then “train” the model to predict something about different data sets.
- As just one example, feed-forward neural networks are used in banking, among other industries, to detect fraudulent financial transactions. Here’s how it works: first, you train a model to predict whether a transaction is fraudulent, based on a data set in which transactions have been manually labelled as fraudulent or not. Then you can use the model to predict whether new, incoming transactions are fraudulent so you can flag them for closer study or block them outright. (A simple code sketch of this workflow appears after this list.)
- Convolutional neural network (CNN). CNNs are a type of feed-forward neural network whose connectivity is inspired by the organisation of the brain’s visual cortex, the part of the brain that processes images.
- As such, CNNs are well suited to perceptual tasks, like identifying bird or plant species from photographs. Business use cases include diagnosing diseases from medical scans or detecting a company logo in social media to manage a brand’s reputation or to identify potential joint marketing opportunities. (A simple code sketch of a CNN appears after this list.)
Here’s how they work:
- First, the CNN receives an image—for example, of the letter “A”—that it processes as a collection of pixels.
- In the hidden layers, the CNN identifies unique features—for example, the individual lines that make up the letter “A.”
- The CNN can then classify a different image as the letter “A” if it finds that the new image has the same unique features previously identified as making up the letter.
- Recurrent neural network (RNN). RNNs are artificial neural networks whose connections include loops, meaning the model both moves data forward and loops it backward to run again through previous layers.
- RNNs are helpful for predicting a sentiment or the ending of a sequence, like a large sample of text, speech, or images. They can do this because each individual input is fed into the model both by itself and in combination with the preceding input. (A simple code sketch appears after this list.)
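Below is a simple Python sketch of the fraud-detection workflow described in the feed-forward bullet above, using scikit-learn’s small feed-forward network (MLPClassifier). The transaction features and labels are invented for illustration.

```python
from sklearn.neural_network import MLPClassifier

# Historical transactions: [amount, hour_of_day, foreign_merchant (0/1)]
X_labelled = [[25, 14, 0], [9000, 3, 1], [40, 10, 0], [7500, 2, 1], [60, 18, 0], [8200, 1, 1]]
y_labelled = [0, 1, 0, 1, 0, 1]   # human-assigned labels: 1 = fraudulent

# A small feed-forward network: data moves from inputs to outputs in one direction only
model = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
model.fit(X_labelled, y_labelled)

# Score new, incoming transactions so suspicious ones can be flagged or blocked
new_transactions = [[7800, 4, 1], [30, 12, 0]]
print(model.predict(new_transactions))
print(model.predict_proba(new_transactions))  # estimated probability of fraud, for closer review
```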
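Next, a minimal PyTorch sketch of the CNN pipeline described above: pixels in, convolutional layers that pick out local features such as line segments, and a final layer that scores each possible letter. The image size and class count are placeholder choices.

```python
import torch
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),    # scan the image for simple local features
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 28x28 -> 14x14
    nn.Conv2d(8, 16, kernel_size=3, padding=1),   # combine simple features into larger shapes
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(16 * 7 * 7, 26),                    # score each of the 26 letters "A" to "Z"
)

image = torch.randn(1, 1, 28, 28)   # stand-in for one 28x28 grayscale image of a letter
scores = cnn(image)
print(scores.argmax(dim=1))         # the letter this (untrained) network finds most likely
```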
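Finally, a minimal PyTorch sketch of a recurrent network, in which each step of a sequence is processed together with a running summary (the hidden state) of everything that came before it. The vector sizes and the random sequence are placeholders.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=16, hidden_size=32, batch_first=True)
classifier = nn.Linear(32, 2)         # e.g. two sentiment classes: negative / positive

sequence = torch.randn(1, 10, 16)     # stand-in for one 10-word text, each word a 16-dimensional vector
outputs, last_hidden = rnn(sequence)  # the hidden state is looped back in at every step
sentiment_scores = classifier(last_hidden.squeeze(0))  # predict from the sequence's running summary
print(sentiment_scores)
```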
Which sectors can benefit from machine learning?
- Predictive maintenance. This use case is crucial for any industry or business that relies on equipment. Rather than waiting until a piece of equipment breaks down, companies can use predictive maintenance to project when maintenance will be needed, thereby reducing downtime and lowering operating costs. Machine learning and deep learning can analyse large amounts of multifaceted data, which can increase the precision of predictive maintenance. For example, AI practitioners can layer in data from new inputs, like audio and image data, which can add nuance to a neural network’s analysis. (A simple code sketch of this idea appears after this list.)
- Logistics optimisation. Using AI to optimise logistics can reduce costs through real-time forecasts and behavioural coaching. For example, AI can optimise the routing of delivery traffic, improving fuel efficiency and reducing delivery times.
- Customer service. AI techniques in call centres can help enable a more seamless experience for customers and more efficient processing. The technology goes beyond understanding a caller’s words: deep learning analysis of audio can assess a customer’s tone. If the automated call service detects that a caller is getting upset, the system can reroute to a human operator or manager.
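As a rough sketch of the predictive-maintenance idea, the Python example below learns from invented historical sensor readings roughly how long equipment tends to run before it needs service, then estimates that figure for machines currently in operation. The sensor features and numbers are made up for illustration.

```python
from sklearn.ensemble import RandomForestRegressor

# Historical sensor readings: [vibration_level, temperature_C, operating_hours]
X_history = [[0.2, 60, 1000], [0.8, 85, 4000], [0.3, 65, 1500],
             [1.1, 90, 5200], [0.5, 70, 2500], [0.9, 88, 4600]]
# Days until each machine actually needed maintenance, taken from maintenance logs
days_to_maintenance = [180, 20, 150, 7, 90, 15]

model = RandomForestRegressor(random_state=0)
model.fit(X_history, days_to_maintenance)

# Estimate when machines currently in operation will need attention,
# instead of waiting for them to break down.
print(model.predict([[0.7, 82, 3800], [0.25, 62, 1200]]))
```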
What are some examples of organisations using machine learning?
- More than a dozen European banks have replaced older statistical-modelling approaches with machine learning techniques. In some cases, they’ve experienced 10 percent increases in sales of new products, 20 percent savings in capital expenditures, 20 percent increases in cash collections, and 20 percent declines in churn.
- Vistra, a large US-based power producer, built and deployed an AI-powered heat rate optimiser based on a neural network model. The model combed through years of data to help Vistra achieve the highest possible thermal efficiency at a specific power plant.
How can mainstream organisations capture the full potential of machine learning?
- Reimagine challenges as machine learning problems. Not all business problems are machine learning problems. But some can be reframed as machine learning problems, which can enable novel approaches to creating solutions. This requires appropriate data sources, as well as clear definitions of ideal outcomes and objectives.
- Put machine learning at the core of enterprise architecture. Organizations can put machine learning at the core of their enterprise tech platforms, not as an auxiliary to systems architectures built around rules-based logic.
- Develop a human-centred talent strategy. To capture these possibilities, enterprises need workforces capable of guiding technological adoption and proactively shaping how employees use new AI tools.