Racial bias in health care algorithms could be a driver of major disparities affecting Black communities

In Health and Healing by Irene Duah-Kessie

From self-driving cars to job shortlisting tools to law enforcement screening technologies, machine learning (ML) applications are being adopted across many sectors of society. The healthcare sector is no different. It has been embracing machine learning now more than ever to enhance the delivery of care, including diagnostic imaging, telehealth, clinical decision support, personalized care, and the list goes on. The multifaceted decisions that once took hours or days to address, such as breast or lung cancer diagnosis and patient discharge times, are now determined within seconds through machine learning applications. 

Machine learning refers to systems whose algorithms learn from data and improve their performance without explicit programming (Marr, 2016). An algorithm is a set of coded, step-by-step instructions fed into a machine to make predictions. As health data continues to grow exponentially, the human capacity to retain relevant information and make timely decisions is increasingly limited. As such, leveraging machine learning has enhanced physician productivity, increased operational efficiency, and improved quality of care and health outcomes for patients.

Although machine learning algorithms offer exciting possibilities for our health care systems, these advancements raise valid concerns about how data scientists and physicians can ensure fairness in these algorithms. Because machine learning algorithms self-regulate, these models can exacerbate biases in a feedback loop and thereby normalize discriminatory decisions and processes (Benjamin, 2019). Histories of discrimination live within digital platforms and are embedded in patients’ health data; if they go unquestioned, they become part of the logic of even simple algorithms. In this piece, I will highlight a few significant examples of how racial biases in algorithms and medical technologies impact the health of Black patients, and the socio-technical mechanisms through which bias arises.

Racial biases of machine learning in health 

There is growing evidence that machine learning algorithms used within health care have been systematically discriminating against Black patients. These alarming and consistent patterns of disparities reveal that such systems can be biased, with roots in the historic context of health care delivery as well as structural and social factors such as unequal insurance options and poor medical curricula.

A study by Obermeyer et al. (2019) revealed how a commonly used algorithm in health care was less likely to refer Black patients than White patients into high-risk management programs. These programs are designed to improve care for those with complex medical needs and reduce future risks, thereby saving costs for the health care system. The authors found that the original algorithm enrolled 17.7 percent of Black patients into the program, yet a simulated, less biased algorithm showed that more than double that share of Black patients (46.5 percent) actually required additional support. They also compared algorithmic predictions of total medical costs and found no differences between the two groups; however, at any given level of illness, Black patients generated lower medical costs. For example, if a Black and a White patient had the same degree of renal failure, the Black patient would cost $1,800 less on average per year. The problem is that the algorithm was coded to measure costs, which seems reasonable because sick people cost more, but the production of costs is itself biased by how systemic racism and barriers have shaped medical care. Hence, it is important to assess the racial implications of any chosen variable; it may be even more critical to test a range of related variables and compare the differences in these implications to obtain less biased predictions.
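The mechanics of this label-choice bias can be sketched in a few lines of Python. Everything below is invented for illustration: the synthetic patients, the 20 percent cost gap at equal need, and the top-20-percent enrolment cutoff are assumptions, not the actual algorithm the study examined. The toy simply shows how scoring by cost rather than by health need under-selects a group that generates lower costs at the same level of need:

```python
import random

random.seed(0)

# Synthetic patients: the true health need (0-10) is drawn from the same
# distribution for both groups, so any gap below comes purely from the label.
patients = []
for _ in range(10_000):
    group = random.choice(["A", "B"])
    need = random.uniform(0, 10)
    # Hypothetical assumption: group B generates ~20% lower costs at the
    # same level of need (echoing the renal-failure example above).
    cost = need * (1.0 if group == "A" else 0.8) * 1000
    patients.append((group, need, cost))

def flagged_share(score_index, group):
    """Share of a group enrolled when we flag the top 20% by some score."""
    cutoff = sorted(p[score_index] for p in patients)[int(0.8 * len(patients))]
    members = [p for p in patients if p[0] == group]
    return sum(p[score_index] >= cutoff for p in members) / len(members)

# Scoring by cost (index 2) under-selects group B; scoring by need (index 1)
# flags both groups at roughly the same rate.
print("cost label:", flagged_share(2, "A"), flagged_share(2, "B"))
print("need label:", flagged_share(1, "A"), flagged_share(1, "B"))
```

Under these toy assumptions, the cost-based label flags group B at well under half the rate of group A, even though both groups were generated with identical health needs, while the need-based label treats them equally.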

The disparities in program screening and overall access to care are often mediated by the social determinants of health: the social and economic factors, such as food and education, that influence people’s health. Studies have consistently shown that overall health differences in Black communities are often linked to higher rates of precarious work, housing instability, and transportation issues, which can substantially reduce one’s access to health care services (Williams et al., 2019). Social barriers are also compounded by inequities in treatment, where Black patients are less likely to trust the medical system or have positive encounters with researchers and health care providers. Boag et al. (2019) used ML to quantify racial disparities during end-of-life care and found that Black patients exhibit higher levels of mistrust. Their finding suggests that Black patients may experience suboptimal health outcomes due to poor communication and relationships with their doctors. It is important to acknowledge that this difference in positive and trusting experiences between Black and non-Black patients can be encoded in the big data used to create and train algorithms. Because ML is only as valuable as the data it learns from, great attention must be paid to examining data sensitivity and specificity, across various criteria, for historically mistreated patients.

Racial bias in medical devices

Medical devices powered by machine learning and related technologies are also known to encode racial bias through colour sensing. Studies show that diagnostic applications used to detect melanoma, a serious form of skin cancer, are less accurate on darker skin tones. Adamson & Smith (2018) find that the incidence rate of melanoma is relatively low in Black populations, but many are diagnosed too late, producing disparities in survival rates for Black patients (66 percent) compared to White patients (94 percent).

One explanation for inadequate readings among darker-skinned people is the representation bias within the data used to develop and train these tools: the training data itself is not representative of real-world applications (Vaughn et al., 2020). The lack of diverse skin tones in the development and training of ML is a major shortcoming of medical technologies, but it also speaks to the greater issue of image datasets that inherently reflect societal biases and historical practices of poor recruitment (Adamson & Smith, 2018). It also demonstrates some of the subtle ways white supremacy is built into our digital systems and infrastructures, especially the machines we depend on to quantify low oxygen levels and cancerous cells. This ideology is deeply rooted in the fabric of our society, and data scientists and technologists continue to neglect it, allowing it to infiltrate our data sets and shape the behaviour of machine learning algorithms.
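Representation bias of this kind is easy to reproduce in a toy model. The sketch below is entirely hypothetical: a one-dimensional “contrast” feature stands in for a lesion image, the baseline shift between lighter and darker skin is an invented assumption, and the training set is 95 percent one group. It is not any real device’s model, only an illustration of how a detector tuned to an unrepresentative training set fails the underrepresented group:

```python
import random

random.seed(1)

def make_patient(group):
    """Hypothetical 1-D 'image contrast' feature for a lesion detector.
    Assumption for illustration: lesions present with lower absolute
    contrast against darker skin, so the ideal threshold differs by group."""
    has_lesion = random.random() < 0.5
    base = 0.6 if group == "A" else 0.3  # invented baseline shift by skin tone
    contrast = base + (0.3 if has_lesion else 0.0) + random.gauss(0, 0.05)
    return group, contrast, has_lesion

# Representation bias: the training data is 95% group A.
train = [make_patient("A" if random.random() < 0.95 else "B")
         for _ in range(5000)]

def accuracy(data, thr):
    """Fraction of patients the threshold classifier gets right."""
    return sum((c > thr) == y for _, c, y in data) / len(data)

# "Learn" the single decision threshold that maximises training accuracy.
threshold = max((t / 100 for t in range(100)),
                key=lambda t: accuracy(train, t))

# Evaluate on balanced test sets for each group.
test_a = [make_patient("A") for _ in range(2000)]
test_b = [make_patient("B") for _ in range(2000)]
print("accuracy on A:", accuracy(test_a, threshold))
print("accuracy on B:", accuracy(test_b, threshold))
```

Because the single learned threshold is fit almost entirely to group A, it sits above the contrast that group B lesions produce, so the detector performs near chance on group B while remaining highly accurate on group A.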

Where do we go from here?

As medical practice exponentially increases its capacity through technology, it also exponentially increases its potential for harm. The stakes are high, as decisions critical to our daily lives will become automated by data sets that likely reflect profound underlying injustices in society. The cultural complacency around addressing these bias challenges reflects the lack of research and investment into the poorer outcomes of Black patients. As a result, machine learning systems cannot accurately or appropriately measure the health and socioeconomic needs of Black communities.

As Deborah Raji stated, “the fact is that [ML] does not work until it works for all of us” (Raji, 2020). If practitioners continue to accept a certain level of marginal error, we will never achieve health equity or racial justice. Rather, our world will be governed by tools that routinely and systematically exclude marginalized people from resources and opportunities and inflict harm on generations to come. The studies discussed earlier show that the stated purpose of these algorithms is to improve health and allocate resources to those who are often underserved; there is a clear disconnect between the design of health care programs and the fairness measures needed to ensure algorithms function as intended.

It is also striking that algorithms were widely adopted in the healthcare field for so long before people began to ask questions, discuss their implications, or make efforts to prevent harm to the lives of real people. To move from biased algorithms to socially conscious ones, we need to begin confronting who is truly benefitting from our institutions and who is being left behind. Regulatory bodies, data scientists, and developers require a paradigm shift in how they think about and define algorithms, and they must incorporate robust auditing processes so that all possibilities of discrimination are considered before implementation.

The medical community must rethink ideas of collective safety and justice through technology and develop contemporary ethics to re-imagine innovation in health technology. As much as medicine and technology are discussed in the field of science, the realities of machine learning in health show that there is an art to providing equitable care that requires continuous reflection and structural changes. 


Irene is a first-generation Ghanaian-Canadian, born and raised in Toronto. She is a Project Manager at Across Boundaries leading a food security and mental health initiative to support the Black community. She is also the Executive Director of Rise STEM, a grassroots organization aiming to increase access to STEM learning and career development opportunities for Black youth. Irene holds a Bachelor of Science from McMaster University and a Master of Science in Sustainability Management from the University of Toronto, and is currently a fellow in the Leading Social Justice Fellowship at the University of Toronto’s School of Cities.


Adamson, A. S., & Smith, A. (2018). Machine learning and health care disparities in dermatology. JAMA Dermatology, 154(11), 1247-1248.

Benjamin, R. (2019). Assessing risk, automating racism. Science, 366(6464), 421-422.

Boag, W., Suresh, H., Celi, L. A., Szolovits, P., & Ghassemi, M. (2019). Modeling mistrust in end-of-life care. arXiv preprint.

Marr B. (2016) What is the difference between artificial intelligence and machine learning? Forbes. Retrieved from: https://www.forbes.com/sites/bernardmarr/2016/12/06/what-is-the-difference-between-artificial-intelligence-and-machine-learning/?sh=49da36d2742b

Obermeyer, Z., Powers, B., Vogeli, C., & Mullainathan, S. (2019). Dissecting racial bias in an algorithm used to manage the health of populations. Science, 366(6464), 447-453.

Raji, D. (2020). How our data encodes systematic racism. MIT Technology Review. Retrieved from: https://www.technologyreview.com/2020/12/10/1013617/racism-data-science-artificial-intelligence-ai-opinion/

Vaughn, J., Vadari, M., Baral, A., & Boag, W. (2020). Dataset bias in diagnostic AI systems: Guidelines for dataset collection and usage. ACM Conference on Health, Inference, and Learning (CHIL), Toronto, ON.

Williams, D. R., Lawrence, J. A., & Davis, B. A. (2019). Racism and health: Evidence and needed research. Annual Review of Public Health, 40, 105-125.