What is ‘Causal Inference’ and Why is it Key to Machine Learning?


Unlike humans, machine learning algorithms are poor at what’s known as ‘causal inference,’ the process of determining the independent, actual effect of a particular phenomenon within a larger system. For example, a human watching a golfer swing a golf club intuitively understands that the golfer’s arms are causing the club to swing, rather than the other way around.

While machine learning algorithms are much better than humans at picking out subtle patterns in massive data sets, they struggle to grasp even simple causal relationships like this one. However, a recent paper from researchers at the Montreal Institute for Learning Algorithms (Mila), the Max Planck Institute for Intelligent Systems, and Google Research proposes some potential solutions.

Why is Machine Learning Bad at Causal Inference?

In the paper, titled “Towards Causal Representation Learning,” the researchers explain that machine learning struggles with causality because it relies heavily on large, predefined data sets, with accuracy generally improving as more examples are added.

“Machine learning often disregards information that animals use heavily: interventions in the world, domain shifts, temporal structure—by and large, we consider these factors a nuisance and try to engineer them away. In accordance with this, the majority of current successes of machine learning boil down to large scale pattern recognition on suitably collected independent and identically distributed (i.i.d.) data,” the researchers write.

Commonly assumed in machine learning, i.i.d. means that the random observations in a data set do not depend on one another and are all drawn from the same probability distribution. For example, when rolling a die, each roll is independent of the previous one, so the probability of every outcome stays constant.
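As a rough illustration of the i.i.d. idea, the short Python sketch below simulates rolls of a fair die and checks two things: that every face turns up about one time in six, and that the chance of a six is about the same right after a one as it is overall. The function names, the seed, and the number of rolls are assumptions made for this example and do not come from the paper.

```python
import random


def roll_die(rng):
    """Return one roll of a fair six-sided die."""
    return rng.randint(1, 6)


def empirical_frequencies(n_rolls, seed=0):
    """Share of rolls that landed on each face."""
    rng = random.Random(seed)
    counts = {face: 0 for face in range(1, 7)}
    for _ in range(n_rolls):
        counts[roll_die(rng)] += 1
    return {face: count / n_rolls for face, count in counts.items()}


def frequency_of_six_after(previous_face, n_rolls, seed=0):
    """Share of sixes among the rolls that directly follow `previous_face`."""
    rng = random.Random(seed)
    rolls = [roll_die(rng) for _ in range(n_rolls)]
    followers = [b for a, b in zip(rolls, rolls[1:]) if a == previous_face]
    return sum(1 for face in followers if face == 6) / len(followers)


# Identically distributed: each face should appear roughly 1/6 of the time.
freqs = empirical_frequencies(60_000)

# Independent: knowing the previous roll was a 1 should not change the
# chance of a 6, so this should also be close to 1/6.
six_after_one = frequency_of_six_after(1, 60_000)

print(freqs)
print(six_after_one)
```

If the rolls were not independent, for instance if a weighted die tended to repeat its last face, the second number would drift away from one in six even when the first set of frequencies looked normal.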

Working under the i.i.d. assumption, machine learning engineers have traditionally trained their models on ever-larger collections of examples. With a large enough sample, the model is assumed to encode the general distribution of the problem in its parameters. In real-life scenarios, however, distributions can change suddenly. For example, convolutional neural networks trained on millions of images can fail to recognize objects under new lighting conditions, from new angles, or against new backgrounds.

“Generalizing well outside the i.i.d. setting requires learning not mere statistical associations between variables, but an underlying causal model,” the researchers write.

Causality can help with machine learning’s generalization problem because causal relationships stay consistent even when a problem’s distribution changes subtly.
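The toy Python sketch below is one way to picture this. It is not taken from the paper: the variables, coefficients, and noise levels are all invented for illustration. A target y is caused by x, while a second feature s merely tracks y in the training data (think of a background that happens to co-occur with an object). One straight-line model is fitted to the cause x and another to the shortcut s, and both are then scored on data where the shortcut no longer holds.

```python
import random


def make_data(n, shifted, seed):
    """Toy data: x causes y; s only tracks y while the data is not shifted."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)               # the cause
        y = 2.0 * x + rng.gauss(0.0, 0.5)     # causal mechanism, never changes
        if shifted:
            s = rng.gauss(0.0, 1.0)           # shortcut broken: s ignores y
        else:
            s = y + rng.gauss(0.0, 0.1)       # shortcut: s closely follows y
        rows.append((x, s, y))
    return rows


def column(rows, index):
    return [row[index] for row in rows]


def fit_line(inputs, targets):
    """Ordinary least squares for a single input; returns (slope, intercept)."""
    n = len(inputs)
    mean_in = sum(inputs) / n
    mean_t = sum(targets) / n
    cov = sum((a - mean_in) * (b - mean_t) for a, b in zip(inputs, targets))
    var = sum((a - mean_in) ** 2 for a in inputs)
    slope = cov / var
    return slope, mean_t - slope * mean_in


def mse(model, inputs, targets):
    """Mean squared error of a fitted line on the given data."""
    slope, intercept = model
    errors = [(slope * a + intercept - b) ** 2 for a, b in zip(inputs, targets)]
    return sum(errors) / len(errors)


train = make_data(5000, shifted=False, seed=1)
test_iid = make_data(5000, shifted=False, seed=2)
test_shift = make_data(5000, shifted=True, seed=3)

causal_model = fit_line(column(train, 0), column(train, 2))    # y from x
shortcut_model = fit_line(column(train, 1), column(train, 2))  # y from s

causal_iid = mse(causal_model, column(test_iid, 0), column(test_iid, 2))
causal_shift = mse(causal_model, column(test_shift, 0), column(test_shift, 2))
shortcut_iid = mse(shortcut_model, column(test_iid, 1), column(test_iid, 2))
shortcut_shift = mse(shortcut_model, column(test_shift, 1), column(test_shift, 2))

print("causal model   iid / shifted:", causal_iid, causal_shift)
print("shortcut model iid / shifted:", shortcut_iid, shortcut_shift)
```

On data that looks like the training set, the shortcut model has the lower error, which is why pattern recognition alone would prefer it. Once the distribution shifts, its error grows sharply, while the model built on the actual cause performs about as well as before.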

“It is fair to say that much of the current practice (of solving i.i.d. benchmark problems) and most theoretical results (about generalization in i.i.d. settings) fail to tackle the hard open challenge of generalization across problems,” the researchers write.

How Causal Inference Can Improve Machine Learning

The researchers propose several ways to develop causal machine learning models, including “structural causal models” and “independent causal mechanisms.”

Rather than relying on fixed correlations in the data, these models allow the AI system to understand both the causal variables and their effects on the environment. This would allow the system to identify objects even when conditions change subtly.
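To make the idea of a structural causal model a little more concrete, here is a minimal Python sketch based on the golfer example from the start of this article. Every number in it (the speeds, the factor of 3, the noise levels) is a made-up assumption, and the code is only a toy, not the researchers’ method. Each variable is written as a function of its causes plus noise, and an “intervention” overrides one variable by hand to see what happens to the other.

```python
import random
from statistics import mean


def sample(rng, do_arm=None, do_club=None):
    """Draw one (arm_speed, club_speed) pair from a toy causal model.

    Causal direction: arm_speed -> club_speed. Passing do_arm or do_club
    simulates an intervention that forces that variable to a fixed value.
    """
    if do_arm is not None:
        arm = do_arm
    else:
        arm = rng.gauss(10.0, 2.0)               # assumed natural arm speed
    if do_club is not None:
        club = do_club
    else:
        club = 3.0 * arm + rng.gauss(0.0, 1.0)   # club speed depends on arm
    return arm, club


def averages(n, seed, **intervention):
    """Average arm and club speed over n draws, with an optional intervention."""
    rng = random.Random(seed)
    draws = [sample(rng, **intervention) for _ in range(n)]
    return mean([a for a, _ in draws]), mean([c for _, c in draws])


observed = averages(20_000, seed=0)                   # no intervention
arm_forced = averages(20_000, seed=0, do_arm=15.0)    # speed up the arms
club_forced = averages(20_000, seed=0, do_club=45.0)  # push the club directly

print("observed    (arm, club):", observed)
print("arm forced  (arm, club):", arm_forced)
print("club forced (arm, club):", club_forced)
```

Forcing the arms to move faster raises the average club speed, but forcing the club to move faster leaves the arms where they were. The two variables are strongly correlated in the observed data, yet only the causal model captures this asymmetry.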

The researchers also discuss embedding structural causal models into larger machine learning models “whose inputs and outputs may be high-dimensional and unstructured, but whose inner workings are at least partly governed” by a structural causal model.

“The result may be a modular architecture, where the different modules can be individually fine-tuned and re-purposed for new tasks,” they write. 

The researchers address some limitations of their proposals. Examples include the difficulty of inferring abstract causal variables from the available low-level input features, a lack of consensus around which aspects of the data expose causal relations, and the non-traditional methods needed to train such models. 

Until now, the field of machine learning has largely neglected causality. While some limitations remain, the researchers argue that causality is likely essential to most forms of animate learning, and that the field has much to gain from integrating it.

Understanding Machine Learning

Machine learning gives AI the ability to learn from experience without being explicitly programmed, which makes it central to developing the technology.

Machine Learning: Predictive Analysis for Business Decisions is a five-course program from IEEE that covers machine learning models, algorithms, and platforms.

Connect with an IEEE Content Specialist today to learn more about this program and how to get access to it for your organization.

Interested in the program for yourself? Visit the IEEE Learning Network.


Gandharv, Kumar. (24 March 2021). Causal Representation Is Now Getting Its Due Importance In Machine Learning. Analytics India Mag.

Dickson, Ben. (15 March 2021). Why machine learning struggles with causality. TechTalks.
