Machine learning is quickly becoming one of the most popular technologies for companies to invest in. At the same time, experts are increasingly worried that these models are dangerously prone to mistakes in applications such as image recognition software used to diagnose illnesses or surveillance software used to recognize human faces. However, recent research in machine learning may soon help reduce bias in these systems.
Data Diversity Key to Overcoming Bias in Neural Networks
A team of researchers from MIT and Harvard has found that training machine learning models on diverse sets of data can help reduce bias, MIT News reports. Models trained on data sets with limited variety are much more likely to discriminate when they make decisions. For example, facial recognition systems trained on data sets containing images of mostly white men are much more likely to give incorrect results when given images featuring women and people of color.
Using controlled data sets, the researchers sought to learn how training data affects whether an artificial neural network (a machine learning model that uses brain-like nodes to process data) can learn to recognize new objects.
The researchers created data sets that contained an equal number of images of various objects in different positions (for example, photos of a car from multiple angles). They made some of these data sets more diverse by showing the objects from more points of view. Machine learning models trained on the more diverse data sets were better at generalizing to new viewpoints. The result supports the idea that data diversity is necessary for overcoming bias. However, the researchers also found that the better a model gets at recognizing new objects, the worse it gets at recognizing objects it has already seen.
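The controlled setup described above can be illustrated with a small sketch. The Python below is not the researchers' code; it is a minimal illustration that assumes each image is represented only by a (category, viewpoint) label pair. The function name build_controlled_dataset and its parameters are hypothetical. It keeps the number of images per category fixed while varying how many viewpoints each category is shown from, so two data sets can be the same size yet differ in diversity.

```python
import random


def build_controlled_dataset(categories, viewpoints, views_per_category,
                             images_per_category, seed=0):
    """Build a fixed-size list of (category, viewpoint) labels.

    Hypothetical helper for illustration only. Each category gets the same
    number of images, drawn from `views_per_category` randomly chosen
    viewpoints. Returns the data set and the held-out (unseen) pairs.
    """
    if not 1 <= views_per_category <= len(viewpoints):
        raise ValueError("views_per_category must be between 1 and len(viewpoints)")
    if images_per_category < views_per_category:
        raise ValueError("need at least one image per seen viewpoint")

    rng = random.Random(seed)  # seeded so the split is reproducible
    dataset, held_out = [], []
    for category in categories:
        seen = rng.sample(viewpoints, views_per_category)
        # Cycle through the seen viewpoints so every category has exactly
        # images_per_category images, whatever the diversity level.
        for i in range(images_per_category):
            dataset.append((category, seen[i % views_per_category]))
        # Pairs never shown during training can be used to test generalization.
        held_out.extend((category, v) for v in viewpoints if v not in seen)
    return dataset, held_out


categories = ["car", "chair"]
viewpoints = ["front", "side", "rear", "top"]

# Same image count, different diversity.
narrow, narrow_unseen = build_controlled_dataset(categories, viewpoints, 1, 8)
diverse, diverse_unseen = build_controlled_dataset(categories, viewpoints, 3, 8)

print(len(narrow), len(set(narrow)), len(narrow_unseen))
print(len(diverse), len(set(diverse)), len(diverse_unseen))
```

In this sketch, the pairs returned as held out never appear in the training list, so they could serve as the test of whether a model generalizes to viewpoints it has not seen.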
“A neural network can overcome dataset bias, which is encouraging,” Xavier Boix, a research scientist and senior author of the paper, told MIT News. “But the main takeaway here is that we need to take into account data diversity. We need to stop thinking that if you just collect a ton of raw data, that is going to get you somewhere. We need to be very careful about how we design data sets in the first place.”
The team also found that training a separate model for each task, rather than training a single model on both tasks at the same time, helped models become less biased. This largely has to do with neuron specialization. During separate training, neural networks produce two different kinds of neurons, which Boix finds fascinating. One kind becomes good at recognizing object categories, and the other learns to recognize viewpoints. Conversely, if the tasks are trained simultaneously, the neurons can become diluted and confused.
Machine learning has come a long way, but the field still has much to learn. While the technology is promising, organizations should take steps to prevent bias in the systems they use or create.
What Uses Do You Predict Machine Learning Will Have in Your Company?
By giving AI the ability to learn from experience without explicit programming, machine learning plays a critical role in advancing the technology. Machine Learning: Predictive Analysis for Business Decisions is a five-course program from IEEE that covers machine learning models, algorithms, and platforms.
Connect with an IEEE Content Specialist today to learn more about this program and how to get access to it for your organization.
Interested in the program for yourself? Visit the IEEE Learning Network.
Zewe, Adam. (21 February 2022). Can machine-learning models overcome biased datasets? MIT News.