
Data Quality in Relation to Algorithmic Bias
By Gareth Hagger-Johnson

Dr Gareth Hagger-Johnson is a data professional with over 15 years of experience in data analysis, research methodology, and leadership roles. Gareth has made significant contributions across diverse industries, and has a particular interest in methodology and in issues of representation of minority groups in statistical analysis.
Algorithmic bias remains a pervasive issue in AI, impacting minority groups in areas such as housing, health and education. In our latest post, Gareth Hagger-Johnson explores the factors that perpetuate this bias. Data quality, he argues, is an essential consideration for creating fair and trustworthy AI applications:

“The pervasive issue of algorithmic bias, with its documented consequences particularly affecting minority groups in areas such as housing, banking, health, and education, has spurred increased attention and scrutiny”

Algorithmic bias concerns unfair decisions about real people, which explains why there is so much interest in avoiding it. These concerns are not new and are not specific to AI algorithms – traditional algorithms have long been shown to produce biased predictions or classification decisions. The pervasive issue of algorithmic bias, with its documented consequences particularly affecting minority groups in areas such as housing, banking, health, and education, has spurred increased attention and scrutiny (Chin, 2023). Recognised for its inherent unfairness, algorithmic bias manifests when automated systems generate decisions that disproportionately impact individuals based on their demographic characteristics. This unfairness becomes apparent when a biased algorithm yields distinct scores or classification decisions for individuals who share identical input data. For instance, if an algorithm demonstrates a propensity to deny loans to ethnic minorities with equivalent credit scores, it is deemed biased. This unfairness extends beyond mere disparities and can be characterised by psychometric bias, as observed in aptitude tests where certain test items generate different scores for individuals with the same underlying abilities but belonging to different population groups, a phenomenon known as ‘differential item functioning’. The increasing awareness of these challenges has driven a growing interest in addressing and mitigating algorithmic bias to foster equitable decision-making in various domains. AI has existed for decades and is often an extension of traditional techniques connected to statistics and other disciplines, with applications in financial services and health (Ostmann, 2021). Recent advances are innovative but are often incremental improvements, not seismic shifts (Ostmann, 2021).
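
To make this definition concrete, here is a minimal sketch in Python (pandas), with all column names and values invented for illustration. It groups hypothetical lending decisions by identical input profiles and reports the approval-rate gap between demographic groups within each profile; a non-zero gap on otherwise identical inputs is exactly the kind of disparity described above.

```python
import pandas as pd

# Hypothetical decisions: rows share credit profiles and differ only in group.
decisions = pd.DataFrame({
    "credit_score": [700, 700, 620, 620, 700, 620],
    "income_band":  ["B", "B", "C", "C", "B", "C"],
    "group":        ["A", "B", "A", "B", "B", "A"],
    "approved":     [1, 0, 1, 0, 0, 1],
})

# For each identical input profile, compare approval rates across groups.
profile_cols = ["credit_score", "income_band"]
by_profile = (
    decisions
    .groupby(profile_cols + ["group"])["approved"]
    .mean()
    .unstack("group")
)
by_profile["gap"] = by_profile.max(axis=1) - by_profile.min(axis=1)
print(by_profile)
```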

Bias within algorithms can emerge from a multitude of factors, not limited to the algorithm’s design or unintended usage. Critical contributors include decisions surrounding how data is coded, collected, selected, and, for AI algorithms, utilised in the training process (Chin, 2023). The data fed into algorithm design becomes a pivotal factor in shaping biases within the system. This bias may originate from pre-existing cultural, social, or institutional expectations, technical constraints inherent in the algorithm’s design, or its application in unforeseen contexts or by audiences not initially considered during the design phase. The widespread nature of algorithmic bias is evident across various platforms, including search engines and social media. The impacts are far-reaching, extending from inadvertent privacy violations to the perpetuation of social biases linked to race, gender, sexual orientation, and ethnicity. This underscores the critical importance of addressing bias not only in the algorithmic design but also in the meticulous curation and handling of data throughout the training process. Data quality issues have consistently been shown to prevent optimal use of AI (Ostmann, 2021). Data quality has not received enough attention in relation to algorithmic bias – the algorithms themselves tend to be the focus. Data quality is necessary but not sufficient for unbiased prediction and classification decisions. Data quality encompasses the accuracy, completeness, consistency, and reliability of the data used in machine learning algorithms. While algorithmic bias has traditionally been a focal point, with efforts directed toward fine-tuning models and employing fairness-aware techniques, the significance of data quality in influencing algorithm outcomes cannot be overstated.
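
As a minimal illustration of what those data quality dimensions mean in practice, the sketch below (Python/pandas, with an invented schema and values) profiles a small set of records for completeness, validity, and consistency. It is a starting point, not a substitute for a proper data quality framework.

```python
import pandas as pd

# Illustrative patient records; field names and values are assumptions.
records = pd.DataFrame({
    "nhs_number":    ["943 476 5919", None, "943 476 5870", "943 476 5870", None],
    "ethnicity":     ["White British", None, "Indian", "Indian", None],
    "date_of_birth": ["1980-02-29", "1975-13-01", "1990-06-15", "1990-06-15", "1982-01-10"],
})

# Completeness: proportion of missing values per field.
completeness = records.isna().mean()

# Accuracy/validity: dates that do not parse are likely transcription errors.
invalid_dob = pd.to_datetime(records["date_of_birth"], errors="coerce").isna().mean()

# Consistency: exact duplicate records suggest double entry upstream.
duplicate_rate = records.duplicated().mean()

print("Missing rate per field:\n", completeness)
print("Invalid date-of-birth rate:", invalid_dob)
print("Duplicate record rate:", duplicate_rate)
```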

For instance, if a hiring algorithm is trained on historically biased data in which underrepresented groups are systematically excluded, the algorithm, however well designed, will perpetuate those biases in its predictions.

Biased data acts as a bottleneck, hindering the algorithm’s capacity to deliver fair and unbiased results. Additionally, biased data may introduce or reinforce stereotypes. To comprehensively address algorithmic bias, it is essential to scrutinise and rectify biases within the training data: identifying and mitigating them, ensuring representativeness across demographics, and incorporating fairness considerations during data collection and preprocessing. In summary, while data quality is foundational for building robust machine learning models, its assurance alone does not guarantee unbiased predictions. A holistic approach involves improving training data quality alongside algorithmic enhancements, recognising both as interdependent components in the pursuit of creating equitable and unbiased AI systems.
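
One simple, hedged sketch of the representativeness check mentioned above: compare each group's share of the training data with its share of the population the model will serve, and derive sample weights for under-represented groups. The figures and group labels are invented for illustration.

```python
import pandas as pd

# Share of each group in the training data (illustrative numbers).
training_share = pd.Series({"group_a": 0.82, "group_b": 0.12, "group_c": 0.06})

# Share of each group in the population the model will serve
# (in practice taken from census or service data).
population_share = pd.Series({"group_a": 0.70, "group_b": 0.18, "group_c": 0.12})

# Representation ratio: values well below 1 flag under-represented groups.
representation = training_share / population_share

# One simple mitigation is to reweight under-represented records
# so each group contributes in proportion to the population.
reweight = population_share / training_share

print(pd.DataFrame({"representation": representation, "sample_weight": reweight}))
```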

Utilising algorithms embedded with inherent biases and coupling them with poor-quality data creates a compounding effect, exacerbating and amplifying the existing biases entrenched within the algorithmic decision-making process. Poor data quality becomes a pivotal factor in this equation, acting as a catalyst for generating unfavourable and skewed outcomes, with its detrimental impact being particularly pronounced among minority groups. The uneven prevalence of inaccuracies or omissions in the data pertaining to these groups contributes to the disproportionate amplification of biases. For instance, in the United States, Hispanic immigrants may face challenges in obtaining accurate social security numbers, introducing inaccuracies into the data. When data that have undergone record linkage are analysed, Hispanic adults appear to live longer than non-Hispanic whites, the reverse of understood patterns of health – a finding termed the epidemiologic paradox. This is partly attributable to data quality issues among Hispanic health records (Lariscy, 2011) and to surname conventions that reduce the likelihood of successful linkage. Names are more likely to be transcribed incorrectly by third parties for ethnic minorities, leading to an increased risk of linkage error (Bhopal, 2010). In the United Kingdom, ethnic minorities are more likely to encounter missing NHS numbers in their hospital records, further diminishing data quality and record linkage – producing biased estimates of readmission rates (Hagger-Johnson, 2017). There is also variation in data quality at the source: variation between hospitals is comparable in size to the difference between Asian and white groups in relation to missed matches (Hagger-Johnson, 2015). It is crucial to acknowledge that the quality of data might partially reflect the quality of the interaction between minority groups and healthcare services, as difficulties in obtaining accurate identification numbers may stem from systemic issues in the healthcare system.
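
A minimal sketch of how such differential linkage error can be quantified, assuming a dataset with one row per record and a flag for whether it matched (all values here are invented): missed-match rates are compared both by ethnic group and by source hospital.

```python
import pandas as pd

# Illustrative linkage results: one row per hospital record, with a flag for
# whether it was matched to the target dataset. Values are invented.
linkage = pd.DataFrame({
    "ethnic_group": ["White", "White", "Asian", "Asian", "Missing", "Missing", "White", "Asian"],
    "hospital":     ["H1", "H2", "H1", "H2", "H1", "H2", "H1", "H2"],
    "matched":      [1, 1, 1, 0, 0, 0, 1, 1],
})

# Missed-match (linkage failure) rate by ethnic group.
by_group = 1 - linkage.groupby("ethnic_group")["matched"].mean()

# Missed-match rate by hospital, to compare source-level variation
# against between-group variation, as described above.
by_hospital = 1 - linkage.groupby("hospital")["matched"].mean()

print("Missed-match rate by ethnic group:\n", by_group)
print("Missed-match rate by hospital:\n", by_hospital)
```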

More fundamental than poor quality data among minority or vulnerable groups is the decision not to measure group membership at all. As put by Karvonen: ‘Having accurate data is a key first step in addressing health inequities, since what is measured influences what is done’ (Karvonen et al., 2024). Put differently, omission is oppression. The exclusion of certain groups from data can perpetuate and reinforce existing inequalities and power imbalances, rendering them invisible. The amplification of bias becomes even more pronounced when information about an individual’s minority status is either missing, incorrect, or inconsistent. In one study with colleagues from UCL, we found that the largest proportion of missed matches occurred when ethnic minority status was missing (Hagger-Johnson et al., 2017). In Canada and France, national health databases do not record ethnicity (Naza et al., 2023). Concrete data quality issues, such as incomplete or inaccurate identification data, contribute significantly to biased algorithmic decisions (Lariscy, 2011). Addressing algorithmic bias necessitates a comprehensive approach that encompasses both refining the algorithms themselves and rectifying the underlying data quality issues, particularly those affecting marginalised communities, to foster fairness and equity in automated decision-making processes. Even small amounts of linkage error can produce biased results (Neter et al., 1965). It is challenging to evaluate algorithmic bias, partly because of commercial sensitivities around data and algorithms, but also because data on protected characteristics (e.g. sexual orientation, religion) might not be available for analysis.
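
The Neter et al. (1965) point, that even small amounts of linkage error can bias results, can be illustrated with a short simulation. In this invented scenario two groups have identical true readmission rates, but readmissions are only observed when the follow-up record links, and the missed-match rate is a few percentage points higher for the minority group.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Two groups with the same true readmission rate (an assumption for illustration).
true_rate = 0.15
group = rng.choice(["majority", "minority"], size=n, p=[0.9, 0.1])
readmitted = rng.random(n) < true_rate

# Readmissions are only observed if the follow-up record links successfully.
# Assume a small, differential missed-match rate: 2% vs 8%.
miss_prob = np.where(group == "majority", 0.02, 0.08)
linked = rng.random(n) > miss_prob
observed_readmission = readmitted & linked

for g in ["majority", "minority"]:
    mask = group == g
    print(g, "observed readmission rate:", observed_readmission[mask].mean().round(4))
# Despite identical true rates, the minority group's observed rate is lower,
# purely because of a few percentage points of linkage error.
```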

There is documented evidence that women and ethnic minorities find it more difficult to access credit, although this appears to be mostly attributable to real differences in credit risk factors rather than to active discrimination, which has been outlawed. The Markup’s investigation into lending decisions based on ethnicity, analysing over two million conventional mortgage applications in 2019 (Martinez, 2021), suggested lenders were more likely to deny home loans to black than to white applicants with similar financial profiles, but was criticised for not including one of the key data points – credit score. Much of the apparent disparity in lending decisions by ethnic group is accounted for by genuine differences in credit risk data – white applicants had higher credit scores (Bhutta, 2022). And there are legal safeguards against treating applicants differently based on race or ethnicity. This does not mean that other variables do not disadvantage ethnic minorities. For example, they may be more likely to have unpredictable and riskier incomes, live in areas with fewer branches, or have less intergenerational wealth to draw on. Nonetheless, there remain some unobservable characteristics which influence credit risk decisioning, and there are manual decisions made by underwriters subsequent to an initial automated recommendation. A 2018/2019 study of nearly nine million loan applicants found that lenders are more likely to override an automated underwriting system’s positive recommendation to deny a minority applicant, and to override a negative recommendation to approve a white applicant. Excess denials, while only about 1-2%, were attributed to unobservable characteristics (Bhutta, 2022). Qualitative comments collected during the study suggested potential disparities in the treatment of minority groups, affecting data quality, with references to ‘incomplete application’ or issues with ‘verification’ more likely for Asian and Hispanic groups. Branch availability in areas where certain groups reside may also contribute to these disparities.
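
A toy sketch of the kind of adjustment at issue here: comparing raw denial rates with denial rates stratified by credit score band. The counts are invented so that the within-band rates are identical across groups; real analyses such as Bhutta et al. (2022) use far richer data and models, and residual gaps can remain after adjustment.

```python
import pandas as pd

# Invented counts, chosen so that denial rates are identical within each credit
# score band but the two groups apply from different bands.
rows = (
    [("white", "high", 0)] * 6 + [("white", "high", 1)] * 2 +   # 25% denied
    [("white", "low", 0)] * 1 + [("white", "low", 1)] * 3 +     # 75% denied
    [("black", "high", 0)] * 3 + [("black", "high", 1)] * 1 +   # 25% denied
    [("black", "low", 0)] * 2 + [("black", "low", 1)] * 6       # 75% denied
)
apps = pd.DataFrame(rows, columns=["group", "score_band", "denied"])

# Raw denial rates differ between groups...
raw = apps.groupby("group")["denied"].mean()

# ...but within each credit score band they are identical, so the raw gap is
# accounted for entirely by differences in credit risk data in this toy example.
stratified = apps.groupby(["score_band", "group"])["denied"].mean().unstack("group")

print("Raw denial rates:\n", raw)
print("Denial rates within score bands:\n", stratified)
```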

In the rapidly evolving landscape of artificial intelligence (AI), the importance of managing data quality cannot be overstated. As Ostmann (2021) points out, it’s often relegated to the background, but it’s a cornerstone of ethical AI. Transparency and explainability hinge on understanding the quality of the data being utilised. To uphold these principles, clear standards and guidelines for assessing and maintaining data quality in AI systems are imperative. Robust data governance frameworks must be established to ensure that the data powering these systems is not only accurate and representative but also free from biases. This necessitates regular audits and evaluations of data sources to identify and rectify any discrepancies or omissions.

[Data quality] is often relegated to the background, but it’s a cornerstone of ethical AI.

Moreover, in the context of data linkage, open dialogues between data providers and analysts are essential to comprehensively understand how data quality and linkage errors might impact outcomes (Gilbert, 2018). By prioritising data quality within AI initiatives, we pave the way for more trustworthy, accountable, and ultimately ethical AI applications.

BIBLIOGRAPHY

• Bhopal, R. et al. (2010). Cohort Profile: Scottish Health and Ethnicity Linkage Study of 4.65 million people exploring ethnic variations in disease in Scotland. International Journal of Epidemiology, 40(5), 1168–1175.

• Bhutta, N. et al. (2022). How much does racial bias affect mortgage lending? Evidence from human and algorithmic credit decisions. Washington, D.C.: Federal Reserve Board, Finance and Economics Discussion Series.

• Chin, M. et al. (2023). Guiding principles to address the impact of algorithm bias on racial and ethnic disparities in health and health care. JAMA Network Open, 6(12).

• Gilbert, R. et al. (2018). GUILD: Guidance for information about linking data sets. Journal of Public Health, 40(1), 191–198.

• Hagger-Johnson, G. et al. (2015). Identifying possible false matches in anonymised hospital administrative data without patient identifiers. Health Services Research, 50(4), 1162–1178.

• Hagger-Johnson, G. et al. (2017). Probabilistic linking to enhance deterministic algorithms and reduce linkage errors in hospital administrative data. Journal of Innovation in Health Informatics, 24(2), 891.

• Karvonen, K. & Bardach, N. (2024). Making lemonade out of lemons: an approach to combining variable race and ethnicity data from hospitals for quality and safety efforts. BMJ Quality and Safety, 33(2).

• Lariscy, J. (2011). Differential record linkage by Hispanic ethnicity and age in linked mortality studies: Implications for the epidemiologic paradox. Journal of Aging and Health, 23(8), 1263–1284.

• Martinez, E. (2021). The secret bias hidden in mortgage-approval algorithms. Retrieved from AP News: apnews.com/article/lifestyle-technology-business-race-and-ethnicitymortgages-2d3d40d5751f933a88c1e17063657586

• Neter, J. et al. (1965). The effect of mismatching on the measurement of response error. Journal of the American Statistical Association, 60(312), 1005–1027.

• Ostmann, F. (2021). AI in financial services. London: The Alan Turing Institute.

DISCLAIMER:

The views expressed in this article are solely those of the author and do not necessarily reflect the opinions or views of any employer, organisation, or institution associated with the author. The author retains full responsibility for the content presented herein.
