Clearly, both technologies have the potential to be extremely valuable. Even more intriguing is their convergence. Today, AI is still a limited technology that can be said to be in its infancy. There are reasons to hope that blockchain can greatly speed up its development, helping us usher in Industry 4.0.
In this article, we’ll look at why today’s AI is limited, how blockchain can help improve it, and what the far-reaching implications for all of us are.
Three Problems that Limit Modern AI
Today, artificial intelligence relies mostly on machine learning, a technology that allows software systems to learn patterns by analyzing vast amounts of data with the help of complex mathematical models. Artificial neural networks are probably the best-known subset of such models. Today’s advances in machine learning, specifically deep learning, allow for things that were considered sci-fi not so long ago: natural language processing, computer vision, predictive maintenance in manufacturing, and other domains where AI gradually comes closer to human intelligence.
However, machine learning still suffers from three straightforward but hard-to-solve problems that hinder AI development in general.
The Black Box Problem
Deep learning networks are the most powerful but at the same time the least understandable machine learning models. A deep network consists of thousands or even millions of nodes grouped in multiple connected layers: one input layer, one output layer, and several hidden layers. Each connection between nodes may be compared to a dial that can be turned to any position; these positions are called weights. When we say that a network “learns,” we mean that it fine-tunes the weights in a way that makes it produce the right results in its output layer.
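To make the picture of layers and weights more concrete, here is a minimal sketch of how a tiny network turns inputs into an output. The layer sizes, the weight values, and the names `sigmoid` and `forward` are illustrative assumptions, not part of any particular library, and no training happens here.

```python
import math

def sigmoid(x: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, layers):
    """Pass inputs through each layer; a layer is a list of (weights, bias) pairs, one per node."""
    activations = inputs
    for layer in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(weights, activations)) + bias)
            for weights, bias in layer
        ]
    return activations

# Hypothetical network: 2 inputs -> 2 hidden nodes -> 1 output node.
hidden = [([0.5, -0.4], 0.1), ([-0.3, 0.8], 0.0)]
output = [([1.2, -0.7], 0.05)]

prediction = forward([1.0, 0.0], [hidden, output])
print(prediction)  # a single value between 0 and 1
```

Training would consist of nudging the numbers in `hidden` and `output` until the prediction matches the labels; a real network has far too many of these numbers for a person to read meaning into any single one.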
When a data scientist creates and trains a neural network, he or she knows the input (the data that was fed to the network) and the output (the results produced by the net). For example, suppose the network was given a set of labeled pictures of cats and dogs. After training, we can see that the network can tell a dog from a cat in a random image that it has never “seen” before with an accuracy of 98%. But why was it exactly this configuration of weights that allowed the net to achieve this result? And what is the reason for the remaining 2% of failures?
We cannot yet answer these questions with certainty. This is called the black box problem of machine learning and AI.
Image source: https://en.wikipedia.org/wiki/Black_box
Some data scientists argue that there are techniques that allow us to extract some of the logic from neural networks, but they are still far from giving clear answers like “customer X has a credit score of 650 for these specific reasons.”
The black box problem slows down the adoption of AI in most industries where the cost of an error can be immense, such as healthcare or defense.
The Data Reliability Problem
A neural network is only as good as the data that it has learned from. That’s why this data should satisfy multiple quality and quantity requirements, which can be hard to balance in many cases. A network trained on an insufficient amount of data will have low accuracy, while a network trained on unreliable data will be biased.
For example, one AI system that analyzed hundreds of thousands of articles learned to view the words victim and woman as synonyms despite (or, rather, due to) getting plenty of data. The reason was that the data itself was flawed—it was taken from tabloids where women too often appeared as victims of various crimes.
The Overfitting Problem
Overfitting overlaps with the previous problem. When a neural network is too flexible for the data it has, or is trained on that data for too long, it starts finding random patterns that have no significance and lead to wrong conclusions. As a result, the AI system works well on historical data but starts making incorrect predictions on new data. In mathematical terms, an overfitted model bends its curve to pass through every training data point, noise included, instead of following the underlying trend.
Image source: https://numer.ai/whitepaper.pdf
An example of overfitting may be an AI-based credit scoring system that assigns higher scores to those who wear beards or have an SSN starting with 6. A set of similar funny random patterns often cited in articles that describe overfitting can be found on xkcd.
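The gap between historical and new data can be reproduced in a few lines. The sketch below is a toy experiment under stated assumptions: the data is synthetic, 20% of the labels are deliberately flipped to act as meaningless noise, and the two models (`memorizer` and `threshold`) are hypothetical stand-ins for an overfitted and a simple model rather than real neural networks.

```python
import random

rng = random.Random(0)  # fixed seed so the experiment is repeatable

def make_data(n):
    """x is uniform in [0, 1); the true label is x > 0.5, flipped 20% of the time."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = x > 0.5
        if rng.random() < 0.2:
            label = not label  # label noise: a random pattern with no real meaning
        data.append((x, label))
    return data

def accuracy(model, data):
    """Share of examples for which the model's answer matches the label."""
    return sum(model(x) == y for x, y in data) / len(data)

train, test = make_data(200), make_data(2000)

def memorizer(x):
    """Overfitted model: copy the label of the closest training point, noise included."""
    return min(train, key=lambda point: abs(point[0] - x))[1]

def threshold(x):
    """Simple model: ignore the noise and keep only the underlying trend."""
    return x > 0.5

for name, model in [("memorizer", memorizer), ("threshold", threshold)]:
    print(name, "train:", accuracy(model, train), "test:", accuracy(model, test))
```

The memorizer is perfect on the data it has already seen because it has learned the noise by heart, and it pays for that on the unseen test set, where the simple threshold rule should come out ahead.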
The worst part about overfitting is that data scientists sometimes introduce it intentionally. An overfitted model can show impressive results on historical data, which allows it to win machine learning competitions, for example. This way, data scientists are to some extent incentivized to overtrain their models, because model performance isn’t assessed prospectively, that is, on data that did not yet exist when the model was trained.
How Blockchain Technology Can Help Us “Fix” AI
A blockchain is an immutable ledger shared by multiple parties via a distributed network. Put more simply, a blockchain is a way to share records of all past events using a decentralized database.
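The "immutable" part can be illustrated with a hash chain, the data structure at the core of a blockchain. This is a minimal single-machine sketch: the function names and the sample records are assumptions made for illustration, and it leaves out everything that makes a real blockchain distributed (networking, consensus, signatures).

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first block

def block_hash(index, data, prev_hash):
    """Hash the block contents together with the previous block's hash."""
    payload = json.dumps({"index": index, "data": data, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def append_block(chain, data):
    """Add a record that is cryptographically linked to the one before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    index = len(chain)
    chain.append({"index": index, "data": data, "prev": prev_hash,
                  "hash": block_hash(index, data, prev_hash)})

def is_valid(chain):
    """Recompute every hash; editing any past record breaks the check."""
    prev_hash = GENESIS
    for block in chain:
        if block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["data"], block["prev"]):
            return False
        prev_hash = block["hash"]
    return True

ledger = []
append_block(ledger, "Alice pays Bob 5")
append_block(ledger, "Bob pays Carol 2")
print(is_valid(ledger))  # True

ledger[0]["data"] = "Alice pays Bob 500"  # tamper with history
print(is_valid(ledger))  # False
```

Because every party on the network holds a copy of the chain and can rerun a check like `is_valid`, rewriting a past record without being noticed becomes impractical.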
Image source: http://mattturck.com/ai-blockchain/
AI and blockchain are based on polar concepts. One centralizes and analyzes vast amounts of data, the other is inherently decentralized. One suffers from the black box problem, the other is completely transparent. And so on. So how can we successfully combine them to cope with AI challenges?
Blockchain developers from Itransition cannot see an immediate way to integrate the two technologies into one chimeric solution. But we would argue that blockchain may help solve some of AI’s problems. Here’s how.
Blockchain and the Black Box Problem
A blockchain can keep immutable and transparent records of every AI decision: which model version ran, what input it received, and what it returned. Reviewing this trail will allow human operators to pinpoint instances where they need to get involved and highlight areas where AI fails. With a better view of how AI systems behave in practice, engineers will be able to create more reliable models and AI systems based on them.
Blockchain will not solve the black box problem as such, but it will help us cope with the outcomes and make AI systems more predictable.
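What such a decision record might look like is sketched below. Everything in it is an assumption made for illustration: the field names, the `credit-v1` model label, the sample applicants, and the 0.9 confidence cutoff are hypothetical. The digest is the piece that would be written to a ledger so the record cannot be changed later without detection.

```python
import hashlib
import json

def decision_record(model_version, features, prediction, confidence):
    """Build an audit entry; the digest is what would be anchored on a ledger."""
    record = {
        "model_version": model_version,
        "features": features,
        "prediction": prediction,
        "confidence": confidence,
    }
    canonical = json.dumps(record, sort_keys=True)  # stable byte representation
    record["digest"] = hashlib.sha256(canonical.encode()).hexdigest()
    return record

def needs_review(records, min_confidence=0.9):
    """Pick out the decisions a human operator should double-check."""
    return [r for r in records if r["confidence"] < min_confidence]

log = [
    decision_record("credit-v1", {"income": 52000, "age": 31}, "approve", 0.97),
    decision_record("credit-v1", {"income": 18000, "age": 45}, "reject", 0.61),
]

for r in needs_review(log):
    print(r["prediction"], r["confidence"], r["digest"][:12])
```

A record like this shows what was decided, by which model, and on what input. It does not explain why the network's weights produced that answer, which is why the trail helps with the consequences of the black box problem rather than the problem itself.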
Blockchain and Data Reliability
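A blockchain does not check that a record was true when it was written, but once the record is on the chain, it becomes tamper-evident and traceable to its source. Data taken from a blockchain can therefore be trusted not to have been silently altered. Besides, on-chain data is typically pseudonymous, tied to addresses rather than names, which potentially means more data available for analysis. This way, thanks to blockchain, both data quality and quantity can improve, reducing the likelihood of mistakes.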
Blockchain and Overfitting
Blockchain startup Numerai proposed a solution to the overfitting problem. They created the Numeraire token, which should act as an incentive for data scientists to create models that perform well on new data rather than on historical data.
According to the white paper published by Numerai, the idea is that data scientists will compete by staking Numeraire on their models’ predictions. Their reward will depend on how correct these predictions are. This will allow for prospective rather than retrospective analysis of AI systems’ performance, thus eliminating the temptation to overtrain a model to win a competition.
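The incentive can be sketched as a simple settlement rule. Note that this is not Numerai's actual payout formula: the `settle` function, the 50% reward rate, and the "beat a coin flip" cutoff are hypothetical choices made only to show the principle that a stake is judged on outcomes that arrive after the predictions are submitted.

```python
import math

def log_loss(predictions, outcomes):
    """Average negative log-likelihood; lower is better, and ln(2) equals coin-flipping."""
    eps = 1e-12  # keep probabilities away from exactly 0 and 1
    total = 0.0
    for p, y in zip(predictions, outcomes):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(outcomes)

def settle(stake, predictions, live_outcomes, reward_rate=0.5):
    """Hypothetical rule: beat a coin flip on live data to earn a reward, else lose the stake."""
    if log_loss(predictions, live_outcomes) < math.log(2):
        return stake * (1 + reward_rate)
    return 0.0

# Outcomes that only became known after the predictions were staked.
live = [1, 0, 1, 1, 0]

print(settle(10.0, [0.8, 0.3, 0.7, 0.6, 0.2], live))   # 15.0, predictions held up
print(settle(10.0, [0.1, 0.9, 0.2, 0.3, 0.95], live))  # 0.0, stake is lost
```

Under a rule like this, an overtrained model that looks brilliant on historical data is a liability to its author, since the only score that pays is the one earned on data nobody had at training time.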
None of this is to say that blockchain tech is going to come in and solve all of AI’s current problems. But it can certainly break some of the barriers holding AI back today by helping with overfitting, poor data, and the black box problem.
We may not see a (beneficent) SkyNet or an Asimov-style world where robots do most of the work. But even from where we stand today, it’s clear that decentralized, transparent, immutable ledgers powered by thousands or millions of machines certainly raise the ceiling for where AI tech can go in the next decade.