Researchers Reduce Bias in AI Models While Maintaining or Improving Accuracy


Machine-learning models can fail when they try to make predictions for individuals who were underrepresented in the datasets they were trained on.

For example, a model that predicts the best treatment option for someone with a chronic disease may be trained on a dataset containing mostly male patients. When deployed in a hospital, that model may make inaccurate predictions for female patients.

To improve outcomes, engineers can try balancing the training dataset by removing data points until all subgroups are represented equally. While dataset balancing is promising, it often requires discarding a large amount of data, hurting the model’s overall performance.
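To make the trade-off concrete, here is a minimal sketch of the balancing approach the article describes: every subgroup is downsampled to the size of the smallest one. The function and variable names are illustrative, not from the researchers' work.

```python
import random
from collections import defaultdict

def balance_by_downsampling(examples, group_of, seed=0):
    """Downsample every subgroup to the size of the smallest subgroup.

    `examples` is a list of training points; `group_of(x)` returns the
    subgroup label of a point (e.g., patient sex). Both names are
    hypothetical stand-ins for illustration.
    """
    groups = defaultdict(list)
    for x in examples:
        groups[group_of(x)].append(x)
    target = min(len(members) for members in groups.values())
    rng = random.Random(seed)
    balanced = []
    for members in groups.values():
        balanced.extend(rng.sample(members, target))
    return balanced

# Toy dataset: 8 "male" records but only 2 "female" records.
data = [("male", i) for i in range(8)] + [("female", i) for i in range(2)]
balanced = balance_by_downsampling(data, group_of=lambda x: x[0])
print(len(balanced))  # 4
```

Note the cost: to equalize the two groups, balancing keeps only 4 of the 10 points, throwing away 6 of the majority group's examples along with whatever the model could have learned from them.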

MIT researchers developed a new technique that identifies and removes the specific points in a training dataset that contribute most to a model’s failures on minority subgroups. By removing far fewer datapoints than other approaches, this technique maintains the overall accuracy of the model while improving its performance on underrepresented groups.
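The idea can be sketched as a ranking problem: score each training point by its estimated contribution to errors on the worst-performing subgroup, then drop only the top-scoring few. The sketch below assumes such a scoring function exists; `harm_score` is a hypothetical stand-in for the researchers' attribution method, which is not reproduced here.

```python
def remove_most_harmful(examples, harm_score, k):
    """Keep all training points except the k with the highest
    estimated harm.

    `harm_score(x)` is a placeholder for a datapoint-attribution
    method; a higher score means the point is estimated to contribute
    more to errors on the worst-performing subgroup.
    """
    # Rank indices from most to least harmful.
    ranked = sorted(range(len(examples)),
                    key=lambda i: harm_score(examples[i]),
                    reverse=True)
    to_drop = set(ranked[:k])
    return [x for i, x in enumerate(examples) if i not in to_drop]

# Toy illustration: points "c" and "e" are scored as most harmful.
scores = {"a": 0.1, "b": 0.2, "c": 0.9, "d": 0.0, "e": 0.8}
pruned = remove_most_harmful(list(scores), harm_score=scores.get, k=2)
print(pruned)  # ['a', 'b', 'd']
```

In contrast with wholesale balancing, only the k targeted points are removed, so most of the dataset, and the overall accuracy it supports, is preserved.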

In addition, the technique can identify hidden sources of bias in a training dataset that lacks labels. Unlabeled data are far more prevalent than labeled data for many applications.

This approach could also be combined with other methods to improve the fairness of machine-learning models deployed in high-stakes situations. For example, it might someday help ensure that underrepresented patients aren’t misdiagnosed due to a biased AI model.

“Many other algorithms that attempt to address this issue assume each datapoint matters as much as every other datapoint. In this paper, we are showing that assumption is not true. There are specific points in our dataset that are contributing to this bias, and we can find those data points, remove them, and get better performance,” says Kimia Hamidieh, an electrical engineering and computer science (EECS) graduate student at MIT and co-lead author of a paper on this technique.

She wrote the paper with co-lead authors Saachi Jain PhD ’24 and fellow EECS graduate student Kristian Georgiev.