When algorithms are biased

A care management algorithm underestimated the needs of black patients.

An algorithm that predicts who would benefit the most from care management sounds like a helpful tool for hospitals and their complex, high-need patients. But recent research has shown that such calculations may be vulnerable to racial bias.

Courtesy of Dr. Powers

ACP Resident/Fellow Member Brian Powers, MD, MBA, said he and his co-authors discovered bias in a nationwide algorithm largely by accident. While researching whether machine learning could improve high-risk care management programs, they noticed that an algorithm used to prioritize patients for care management was underestimating the needs of black patients.

“The risk scores that were being spit out for white patients were much higher than we would expect, especially relative to black patients,” said Dr. Powers, who is a third-year internal medicine resident at Brigham and Women's Hospital and Atrius Health in Boston. “We made this discovery at a time of growing concern around bias in algorithms and machine learning, which pushed us to dig deeper.”

If the observed bias were removed, the percentage of black patients receiving additional care management services in the health system they studied would have more than doubled, from about 17.7% to 46.5%, according to the results of the study, published in October 2019 in Science.

Dr. Powers and his co-authors have launched a pro bono initiative through the Center for Applied Artificial Intelligence at the Booth School of Business in Chicago to help identify and reduce algorithmic bias across health care delivery systems. He recently spoke to ACP Hospitalist about the importance of this type of work and about the implications of their study's findings.

Q: What is the algorithm used for, and how does it work?

A: The algorithm is used to prioritize patients for additional care management services. You can think of it as rank-ordering patients (or putting them in a line) according to their need for care management—those at the front of the line are more likely to get extra help. In a perfect world, the algorithm would directly predict a patient's need for care management. But that's hard to define, and even harder to measure. Instead, the algorithm predicts future health care costs. On first pass, this makes sense. We know that patients with poorly controlled chronic conditions and unstable drivers of health often fall through the cracks, ending up in the emergency department or the hospital and incurring high costs.
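The rank-ordering Dr. Powers describes can be pictured as a simple sort on a predicted score. The sketch below is a hypothetical illustration, not the study's actual code; the patient data and the `capacity` cutoff are invented for the example.

```python
# Hypothetical illustration of prioritizing patients for care management
# by rank-ordering on a predicted score (here, predicted future cost).
# The data and names are invented, not from the study.
patients = [
    {"id": "A", "predicted_cost": 4200.0},
    {"id": "B", "predicted_cost": 9800.0},
    {"id": "C", "predicted_cost": 1500.0},
    {"id": "D", "predicted_cost": 7300.0},
]

# The highest predicted score goes to the front of the line.
ranked = sorted(patients, key=lambda p: p["predicted_cost"], reverse=True)

# Only the patients at the front of the line, up to the program's
# capacity, receive the extra care management services.
capacity = 2
enrolled = [p["id"] for p in ranked[:capacity]]
print(enrolled)  # ['B', 'D']
```

Everything downstream depends on what the score actually measures, which is where, as the next answer explains, the bias entered.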

Q: Where did you find racial bias in the algorithm?

A: The algorithm was really good at what it was designed to do—predict future costs—and it did so just as well for black and white patients. There was no bias there. Bias crept in because cost was being used as a proxy for a patient's need for care management. We looked at other ways to define that need—flare-ups of chronic conditions, poorly controlled diabetes, hospital admissions, and emergency department visits. Across all of those different measures, the algorithm prioritized white patients over black patients. This is because for any given level of health or any given level of sickness, black patients incur fewer costs than white patients. That has nothing to do with the algorithm; it reflects structural issues—unequal access, unequal treatment, and mistrust, to name a few.

Things broke down because the algorithm was built to predict cost but was being used to predict something else.

Q: What are the implications of that bias?

A: To return to the line analogy, this bias meant that healthier white patients got to skip sicker black patients in line. In the health system we studied, correcting this bias would mean that twice as many black patients would be getting extra help managing their health.

Q: Are there potential ways that technology may be able to fix this problem?

A: It's a serious problem, but there are viable solutions. In this case, bias emerged because the wrong label was chosen—costs, when the real goal was to predict the need for care management. So we asked, what if you just change the label? We conducted various simulations where, instead of predicting costs, we built algorithms that predicted different proxies. . . . We found that these algorithms exhibited significantly less racial bias.
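Changing the label can reorder the line even when nothing else changes. The toy example below (invented data, not the study's) shows the same patients ranked by two different labels: because a sicker patient can still incur lower costs, a cost label pushes that patient toward the back.

```python
# Hypothetical toy data, not from the study: the same three patients
# ranked by two different labels. Patient B is the sickest (most active
# chronic conditions) but incurs lower costs.
patients = [
    {"id": "A", "predicted_cost": 9000, "active_conditions": 2},
    {"id": "B", "predicted_cost": 4000, "active_conditions": 5},
    {"id": "C", "predicted_cost": 7000, "active_conditions": 1},
]

def priority_order(label):
    """Rank-order patient IDs from front of line to back, by the label."""
    ranked = sorted(patients, key=lambda p: p[label], reverse=True)
    return [p["id"] for p in ranked]

print(priority_order("predicted_cost"))     # ['A', 'C', 'B']
print(priority_order("active_conditions"))  # ['B', 'A', 'C']
```

Under the cost label the sickest patient lands at the back of the line; under a health-based label the same patient moves to the front, which is the effect of the label change the study simulated.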

Q: What do you think about the future of algorithms in health care?

A: I hope these findings are a call to action. We need to pause and conduct thorough assessments of the algorithms in use across health care delivery and elsewhere. Algorithms have become ubiquitous without undergoing comprehensive assessments for bias. That needs to change.

But I don't think our research should be interpreted as an indictment of algorithms in health care. For example, the problem the algorithm we studied was being used to solve—how to allocate complex, expensive care management resources across millions of patients—is not a task that physicians, administrators, or policymakers are best equipped to handle on their own. We need algorithms to support decision making, and well-designed, well-implemented algorithms can actually be a path to reducing disparities and counteracting bias.