19 September 2019

Predictive Analytics, Public Services and Poverty

by Eleanor Shearer

Artificial Intelligence and the Transformation of Government

Artificial Intelligence (AI) technologies are set to transform the way that governments deliver public services. Virtual assistants that answer citizen queries; sentiment analysis systems that track public reactions to government policies; tools that can automatically sort vast numbers of government files by topic; and facial recognition software that the police can use to identify those with outstanding arrest warrants, are all examples of AI technologies currently used to try and make governments more efficient and more responsive to citizens’ needs.

One key area of AI in which governments are showing increasing interest is predictive analytics – the use of AI to predict future outcomes based on historical observations. Computers can trawl through vast amounts of data to find hidden patterns, identifying links between particular factors and increased likelihood of a particular outcome – for example, a crime occurring, or a patient in a hospital responding to treatment. Making more comprehensive and more accurate predictions is a worthy goal for public servants to have, but some research suggests that predictive analytics might unfairly target poor and vulnerable citizens, because of biases in the available data on which these new tools are trained and deployed.
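
To make the idea concrete, here is a minimal sketch in Python of what such a tool looks like under the hood: a classifier is fitted to historical records and then outputs a probability for a new case, which is treated as a risk score. The library, field names and numbers below are my own illustrative choices (scikit-learn on invented data), not a description of any real government system.

```python
# A minimal sketch of predictive analytics: fit a classifier to historical records,
# then score a new case. All fields and figures are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical historical records: each row is
# [prior_referrals, prior_service_contacts, recorded_treatment_episodes],
# and the label records whether the outcome of interest later occurred.
X_history = rng.integers(0, 5, size=(500, 3))
y_history = (X_history.sum(axis=1) + rng.normal(0, 1, 500) > 6).astype(int)

model = LogisticRegression().fit(X_history, y_history)

# Scoring a new case: the predicted probability is treated as a 'risk score'.
new_case = np.array([[2, 3, 1]])
print(f"Predicted risk: {model.predict_proba(new_case)[0, 1]:.2f}")
```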

Data and Poverty

Last year, Virginia Eubanks, Associate Professor of Political Science at the University at Albany, published Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor, an important contribution to the growing field of AI ethics and algorithmic fairness. In the book, Eubanks explores how new automated systems used in public services across the United States unfairly target poor and vulnerable citizens. She looks at three case studies in particular: the automation of eligibility processes for public assistance programmes in Indiana; a system to match homeless people with public housing resources in Los Angeles; and a predictive risk model used by child protective services in Allegheny County, Pennsylvania.

In discussing the Allegheny County case, Eubanks makes a point that is of particular interest to all governments, local and national, that want to harness the power of new technologies to improve the way they deliver public services. She points out that one of the problems with the Allegheny County risk model is that it relies heavily on data generated when the parents it assesses use public services. For example, a parent with a history of drug addiction, or with a history of mental illness, will receive a higher risk score than someone with a record of neither. To assess these factors, the risk model uses government databases containing patient records from addiction centres or from mental health treatment centres. This means that if someone is wealthy enough to have paid for private treatment, their addiction or illness will not feature in their risk score.

Eubanks raises an important and often overlooked point here about data and poverty. In international development, people refer to ‘data poverty’ or ‘data deprivation’ – where the poorest countries in the world do not collect adequate data about their citizens, meaning some of the world’s poorest people are not represented in databases about development. Data deprivation suggests that the problem for poor people in a data-driven age is that their data is not collected by governments, rendering them invisible to data analysis tools. However, especially in developed countries like the UK and the US, the problem may not be invisibility, but hypervisibility. Wealthier citizens may opt for private alternatives to public services like healthcare, especially when those services come under strain. Meanwhile, poorer citizens claiming means-tested benefits are subject to additional scrutiny and data collection that those who do not require government assistance can avoid. All of this means that governments may end up with more data on poorer citizens than on wealthier ones.

Furthermore, Eubanks’ case studies indicate that governments may be able to collect more data, including incredibly personal and sensitive data, from some of the most vulnerable public service users, and may end up with more rights over that data. Those who rely on the government for life-saving welfare payments, Medicare, or access to housing will put up with more intrusive terms and conditions on data-gathering to access these benefits than someone for whom government assistance matters less. Added to this, public hysteria about benefit fraud and the myth of the ‘undeserving poor’ mean governments often want to collect vast amounts of personal data from those accessing welfare, to ensure that their needs are real.

The rights that poor and vulnerable people have over their data may also be shaky if signing those rights away is (or seems to be) a prerequisite for accessing public services. One of the cases Eubanks discusses is a tool for matching homeless people in Los Angeles with public housing depending on their needs. To be in with a chance of accessing public housing, homeless people must fill in a long survey with an outreach worker, providing personal details including whether they suffer from mental illness; whether they have accessed emergency services for sexual assault or domestic violence; whether they have had sex for money or run drugs for someone; and whether they have attempted self-harm or attempted to harm others. The consent form for the survey suggests that data will be shared with ‘organizations [that] may include homeless service providers, other social service organizations, housing groups, and healthcare providers,’ and states that a full privacy notice can be provided on request. If anyone were to request that full privacy notice, they would find that the data is shared with 168 organisations, including the Los Angeles Police Department, who are able to access personal data in the system without a warrant. As Eubanks notes, it is hard to imagine a database of those receiving mortgage tax deductions or federally subsidised student loans being accessible to law enforcement without a warrant.

Algorithmic Bias

As AI becomes an ever-increasing presence in our lives, the issue of bias or unfairness in these new technologies is attracting more public attention. The stakes are especially high when governments could end up using technologies that perpetuate existing social inequalities. The powers that governments wield over their citizens (powers like the right to arrest someone and deprive them of their liberty, or the right to take away someone’s child) mean that when governments use tools that are unfair – such as facial recognition tools that are more likely to misidentify black faces – the consequences for citizens can be severe. The issue of algorithmic fairness is therefore critical to the future of public services.

It is therefore concerning that the discrepancy between the data that governments hold on poorer citizens and the data they hold on wealthier ones affects the fairness of predictive analytics. This bias can arise in two ways. First, tools trained on historical data that over-represents poor people are likely to make skewed predictions. In her book on the future of AI in society, mathematician Hannah Fry highlights how algorithms developed for predictive policing can end up targeting particular areas in a self-reinforcing loop. If certain (often poor, and often majority-BME) neighbourhoods are flagged as being at high risk of crime because they are historically over-represented in the data on previous crimes (whether due to a genuine increased risk of crime, discriminatory over-policing of poor people and people of colour, or a combination of the two), then the police will send more officers to these areas. An increased police presence is likely to lead to officers identifying more crime in those areas, meaning that the initial inequality only gets further entrenched when the system receives new data that marks these neighbourhoods as even riskier than before. This is one way in which AI systems can end up targeting poor people unfairly due to discrepancies in the underlying data.
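
The self-reinforcing loop Fry describes can be shown with a toy simulation. The assumptions below are mine for illustration only: two areas with identical underlying crime rates, patrols concentrated more than proportionally on whichever area has more recorded crime, and recorded crime tracking police presence rather than the (equal) underlying rates. The area that starts out over-represented in the data looks riskier with every pass.

```python
# A toy simulation of the predictive-policing feedback loop. The numbers and the
# allocation rule are illustrative assumptions, not drawn from any real system.
true_rate = [1.0, 1.0]      # identical underlying crime per unit of patrol presence
recorded = [60.0, 40.0]     # historical recorded crime: area A was over-policed in the past
officers_total = 100
concentration = 1.5         # assumption: 'hotspot' targeting concentrates patrols
                            # more than proportionally on the higher-scoring area

for year in range(1, 6):
    weights = [r ** concentration for r in recorded]
    patrols = [officers_total * w / sum(weights) for w in weights]
    # Recorded crime tracks police presence, not the (equal) underlying rates,
    # so next year's data makes area A look even riskier than before.
    recorded = [p * t for p, t in zip(patrols, true_rate)]
    print(f"Year {year}: share of patrols sent to area A = {patrols[0] / officers_total:.0%}")
```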

The second problem is that, when predictive models are applied, poor people are more likely to be flagged for certain risk factors if the government has more data on poorer citizens. For example, as discussed above, the government has data on citizens who have accessed public treatment for addiction, but none on those accessing private treatment. This is the issue that Eubanks flags with the Allegheny County risk model for child abuse and neglect. Highlighting the disparity caused by the risk model (what she calls ‘poverty profiling’), she writes:

Professional middle-class families reach out for support all the time: to therapists, private drug and alcohol rehabilitation, nannies, babysitters, afterschool programs, summer camps, tutors, and family doctors. But because it is all privately funded, none of those requests ends up in Allegheny County’s data warehouse. The same willingness to reach out for support by poor and working-class families, because they are asking for public resources, labels them as a risk to their children in the [predictive model].

(p. 166)

In this way, even when a model has identified a reasonable predictive pattern (such as a link between drug addiction and child neglect), it does not treat all cases equally. Poor individuals are more likely to be targeted than wealthy individuals, and more likely to be targeted precisely after reaching out for treatment, when they are trying to do better for themselves and their child.
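
A toy scoring function shows how this plays out. The weights and field names below are hypothetical and are not taken from the Allegheny County model or from Eubanks’ book; the point is simply that a score built only on public-service records cannot see privately funded help, so two families with identical histories can receive very different scores.

```python
# A hypothetical risk score: only service use that passes through public systems is
# visible to the model. Weights and field names are invented for illustration.
RISK_WEIGHTS = {
    "public_addiction_treatment": 2.0,   # recorded in the public data warehouse
    "public_mental_health_care": 1.5,    # recorded in the public data warehouse
    "prior_welfare_claims": 1.0,
}

def risk_score(record: dict) -> float:
    """Sum the weights of every flag present in the government-held record."""
    return sum(weight for flag, weight in RISK_WEIGHTS.items() if record.get(flag))

# Both parents sought help for addiction; only the publicly funded treatment leaves a trace.
parent_using_public_services = {"public_addiction_treatment": True, "prior_welfare_claims": True}
parent_paying_privately = {}   # private rehab and a family doctor never reach the data warehouse

print(risk_score(parent_using_public_services))   # 3.0 -> flagged for extra scrutiny
print(risk_score(parent_paying_privately))        # 0.0 -> invisible to the model
```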

Better Data or Better Data Rights?

The problem with algorithmic bias in these cases of predictive analytics is – as in almost all cases – a problem with the underlying data that models have to work with. If one group is over- or under-represented in the data, the algorithm will end up being biased. The racial bias that has been observed in facial recognition software, for example, is largely due to training data that contains far more pictures of white than non-white people. Another example is the automatic CV screening tool that Amazon reportedly had to pull because it was biased against women. The system was trained on data about the people the company had previously hired, who, due to sexist perceptions about the role of women in tech, tended to be men.
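
The effect of a skewed training set can be demonstrated on synthetic data. The example below is my own contrived construction, not the facial recognition or Amazon data: a minority group whose pattern differs from the majority’s makes up only a small fraction of the training set, and the model’s accuracy on that group collapses.

```python
# A contrived demonstration that under-representation in training data produces a
# model that works well for the majority group and poorly for the minority group.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

def make_group(n, flipped):
    """Synthetic group: the feature-outcome relationship is reversed for the minority."""
    x = rng.normal(size=(n, 1))
    y = (x[:, 0] < 0) if flipped else (x[:, 0] > 0)
    return x, y.astype(int)

# Training data: 950 examples from the majority group, only 50 from the minority group.
x_maj, y_maj = make_group(950, flipped=False)
x_min, y_min = make_group(50, flipped=True)
model = LogisticRegression().fit(
    np.vstack([x_maj, x_min]), np.concatenate([y_maj, y_min])
)

# Fresh samples from each group: the under-represented group fares far worse.
for name, flipped in [("majority", False), ("minority", True)]:
    x_test, y_test = make_group(1000, flipped)
    print(f"{name} accuracy: {model.score(x_test, y_test):.2f}")
```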

However, what makes this predictive analytics case especially challenging is that the solution to the data problem is less straightforward than in these other cases of AI unfairness. The proposed technical fix in those cases is to go out and collect more data from under-represented minorities or marginalised groups, in order to make the dataset more diverse. But governments often cannot simply collect more data from wealthier citizens to fix biased predictive tools, especially when the data they hold on poorer citizens is so intrusive. Most people would refuse to surrender the information that they are having sex for money, or that they suffer from a drug addiction, usually out of a legitimate concern for their privacy. The reluctance of wealthier citizens to share this kind of information with the government raises questions about whether it is right to gather such intrusive data about poor and vulnerable citizens in the first place, let alone extend the collection of this data to the entire population.

This leaves governments in a tricky position – predictive analytics presents an unparalleled opportunity to improve many public services, but in its current form may be biased. The solution to balancing the need for fairness with the drive to make governments better at helping citizens might, in this case, not be better datasets, but better data rights – and especially better data rights for those often least able to advocate for them.

The movement for data rights has grown stronger in recent years, and legislation such as the EU’s GDPR aims to enshrine individuals’ data rights in law. It is essential that we recognise how access to these rights could be threatened by certain kinds of disadvantage. For example, the GDPR sets a high bar for consent, and also advises that public authorities should avoid making consent to the processing of personal data a precondition of a service. In practice, however, this recommendation will have to contend with a deep-rooted culture of suspicion of welfare recipients that pushes governments to collect as much data as possible in order to minimise possible fraudulent claims. The GDPR also enshrines a right to erasure of personal data (also known as ‘the right to be forgotten’), but allows governments latitude to apply certain ‘reasonable’ requirements, such as an administrative fee to process a request that would be onerous to complete, as well as proof of identification. These requirements could be barriers to the most vulnerable citizens seeking the erasure of their sensitive data, leaving them with a digital shadow that could follow them for years (if not for life).

In many cases, the incentives to improve predictive analytics and the incentives to improve data rights for everyone will conflict. If poorer citizens gain better access to and understanding of their rights, governments may not be able to collect data on as broad and intrusive a scale. This will necessarily limit the scope of predictive analytics, and it may well mean that in some cases we catch fewer criminals, or detect fewer cases of child abuse. However, we must ask ourselves: what price, as a society, are we willing to pay for better prediction? After all, one way to reduce crime would be to imprison everyone arrested on suspicion of a crime, and one way to reduce child abuse would be to take children away from all parents suspected of abuse. The fact that we do not take this approach is evidence of the value we place on fairness – we are willing to risk some criminals or abusers going free so that innocent people are not unjustly imprisoned and do not wrongly have their children taken away from them. Sacrificing the rights of disadvantaged people for the sake of ‘better predictive analytics’ would do a disservice to this existing commitment to fairness and justice.

Conclusion

The case of unfairness in government uses of predictive analytics teaches us two things. First, there is a need to bring the conversations about data rights and algorithmic unfairness together. The answer is not always just to collect better and more representative data – sometimes, we face difficult questions about the balance between the social uses of new technologies and the rights of the citizens on whose data they rely. Moving the debate about data on to questions of rights and justice is important, even in cases where ‘better data’ is currently being touted as the technical fix for unfairness. In the case of facial recognition, while many campaigners want better datasets that reflect the full diversity of human faces, others have highlighted how the technology could contribute to the harmful policing of people of colour even if it were perfectly accurate. Better data is not always the best way to protect citizens from harm in the new AI age.

Second, we must ensure that data rights are not just enshrined in law, but actively made accessible to everyone. Sometimes those least able to advocate for their data rights will be those who need them most, such as those without the money to pay an administration fee for data erasure, or those who fear (rightly or wrongly) that refusing to hand their data to the government will make them look as though they have something to hide, threatening their access to welfare or benefits. Making data rights truly accessible may therefore require a cultural shift in the way we view those requesting public assistance – with the default being that they are trustworthy and in genuine need, not that they are ‘benefit scroungers’ or ‘welfare queens’.

Predictive analytics certainly has a valuable role to play in the future of government services. However, AI cannot tell us the truth about the world; it can only tell us the truth about the data we have about the world. Currently, many poor and vulnerable citizens are subject to intrusive data collection when they access public services, and this threatens the fairness of predictive analytics systems trained or used on that data. Governments must think carefully about how best to balance improved predictions about important public issues such as crime and child abuse with the rights of all their citizens, wealthy and poor.
