The Algorithmic is Political: An Interview with Dr Annette Zimmermann

Annette Zimmermann is an analytic political philosopher and ethicist at Princeton University’s University Center for Human Values (UCHV) and Center for Information Technology Policy (CITP). Annette’s current work explores the ethics and politics of algorithmic decision-making, machine learning and artificial intelligence.

Here is a shortened version of our fascinating conversation (AZ: Annette Zimmermann, LS: Leonie Schulte).

LS: What are you working on right now? What are some of your future projects that you are excited about?

AZ: Right now, I am working on several academic papers on the concept of algorithmic injustice. I am interested in the scope of that concept: I think that it’s important to understand what kind of problem we are dealing with and how big of a problem it is, in comparison to other moral and political problems, because our answer to that question will determine what kind of answers we can give about morally and politically justifiable solutions. In my work, I think about questions like: what type of injustice are we talking about when we talk about algorithmic injustice? Is it a unique form of injustice, or just a reiteration of other, more familiar forms of non-technological injustice that already shape society? I am also interested in the question of how algorithmic bias develops and compounds over time, which in my view is an extremely underexplored area.

I am also working on a short book titled The Algorithmic is Political. The book is about the extent to which technological design and deployment decisions have this political and moral baggage that we talked about before, and what that means for our moral and political rights and duties: who is responsible when things go wrong? Who should fix things like algorithmic bias, particularly when it compounds historical patterns of injustice? Are there some decisions that we simply should not automate, even if we have highly accurate and efficient technological tools? Who has a right to have a say: who should be involved in decisions about that, and how should those decisions take place given existing political and social structures? What would it mean to democratize AI, and would democratizing AI solve problems like algorithmic injustice?

LS: At EdgeRyders, we like to learn about what motivates experts to work in their respective fields: Can you tell me a little bit about what brought you into this line of research?

AZ: I have always been interested in topics related to democracy, justice, and the moral and political significance of risk and uncertainty. Before starting my current research project on the ethics and politics of artificial intelligence and machine learning, I was thinking about decision-making and justice more broadly: about the rights that people have when democratic constituencies make decisions together, particularly when different people bear various levels of risk of being affected by those decisions. Then I realised that algorithmic decision-making models often lead to a specific instantiation of this complex, larger problem that appears over and over again in political philosophy: in a democratic society, in which we are committed to treating people as equals, how should we distribute risks between people?

LS: Why do we need to think about the ethics of artificial intelligence and algorithmic decision-making? In concrete terms, what does ethical reasoning about emerging technologies involve?

AZ: Right now, a lot of people—including those working in the tech industry—are recognizing the need for ethical reflection on the social implications of AI: we are in an ‘AI ethics gold rush’ moment. But how does one actually do AI ethics? I think that when people hear ‘ethics’, they often think of it as basically being the same thing as reasoning based on personal beliefs and opinions, religious views, or cultural norms. Many people therefore think that the answers to all ethical questions—that is, questions about what we ought to do, about what it means to do the right thing—are ultimately relative: that there is no obvious way of sorting out whether one ethical view is better or worse than another view. I think that that is a dangerous way of thinking about ethics—including AI ethics—because I think that there are at least some value-based arguments that are clearly incoherent or that lead to deeply morally repugnant conclusions. We don’t have to treat all ethical arguments as equally persuasive by default. Of course, there will still be many competing ethical arguments that are not obviously flawed in that way, so complicated trade-offs may arise. For instance, in the context of AI, we might really care about accuracy, but we also really care about justice, and about efficiency, and about transparency, privacy, and so on. All of these are important values and goals—but it is not enough to simply come up with a ‘wish list’ of values, because it is often impossible to optimize for all of these important values at the same time. We have to make ethical judgment calls about which values to rank more highly, and we have to be clear about which values we have failed to realise when we do so.
Doing ethics well means getting comfortable with that sense of uncertainty surrounding these trade-offs: there is a risk that we might get things wrong, and a good way of confronting that risk is to keep asking questions, rather than committing to one ethical argument at one point in time (a view like: “Irrespective of the domain of application, always prioritize accuracy over transparency!”) and then trying to just apply it without ever revisiting it. In other words, it’s dangerous to declare the ethical case closed. Whenever we do ethics—and that ‘we’ includes not just political and moral philosophers, but also tech practitioners, policy-makers, and ordinary citizens—we must remain open to the possibility that we have been making the wrong choices so far, and that we need to change our actions going forward, rather than simply trying to make whatever we have been doing so far more efficient. I think that one central dimension of this in the context of AI ethics will be to prioritise the perspective of those most directly and most negatively affected by the use of algorithmic decision-making tools that exhibit racial or gender bias, as well as other forms of biases that exacerbate social inequality.

LS: In popular culture and in public discourse, AI is often portrayed as a kind of looming entity, which is both inevitable and out of our control. People seem to worry about the ways in which we ought to adapt to the emergence of powerful new technologies: how will AI change what it means to be human? But it also seems important to turn that question on its head: how should humans shape what AI does and does not do? Which consequences of AI innovation are truly inevitable and where do human decision-makers have room to make deliberate, conscious choices about how AI affects our social life? How does your work respond to these complex questions?

AZ: I think that public discourse on this issue tends to split into two fairly extreme views, neither one of which is correct. On the one hand, there’s ‘AI optimism’: the view that the increasingly ubiquitous use of AI is inevitable, that we can’t return to not using AI once we have deployed it in a given domain. The implicit assumption behind that view often seems to be something like, “well, if AI is inevitable, there isn’t really a point in trying to critique it ethically”—or to critique it politically, even. On this view, AI is nothing more than a tool—and one that promises to have a decidedly positive impact on society. On the other hand, there is a dramatically opposing view that says something like, “it is inevitable that all AI will lead to incredibly bad and harmful consequences”. That’s a different sense of inevitability right there. Tech pessimists seem to think that whatever we do, whichever domain we focus on in our AI deployment, the use of automated reasoning methods will always be somehow counterproductive or harmful.

I think both views are wrong. The first view underestimates our ability to subject AI to productive scepticism, public oversight, and collective control. Even if it is inevitable that we will be using more and more AI in our collective decision-making processes, we ought to keep reflecting critically on AI’s purpose. There might be some collective decision-making problem domains where it would be ethically and politically inadequate to use AI because we intuitively feel that when things go wrong in those domains, we want a human decision-maker to blame; we want somebody to justify to us why a particular decision outcome was reached. In those domains, we are often quite happy to not be maximally efficient or accurate. What we really want is to think and argue with each other about how best to arrange our political and social life, about what it would mean to have a just society, about how we can relate to each other as true equals. These are ultimately not just ethical, but fundamentally political questions. Of course, tech optimists might reply to the response that I just gave: “but clearly AI will optimize our collective decision-making!” To that, I think a more nuanced and sceptical view might say in response, “well, optimising isn’t everything”. We must exercise our critical judgment to determine when optimising further and further, and thereby entrenching and accelerating our current social status quo, is actually what matters most. And in many cases, that might not be what justice requires.

But at the same time, the second view that I outlined before—the tech pessimists’ view—can also be misleading. It is misleading in the sense that it equates all forms of AI and machine learning, including their ethical importance and impact, with each other. I do not think that that is plausible at all. Not all forms of AI will lead to the same kinds of harmful outcomes. It is especially important to ask: where and how can we use AI to mitigate injustice rather than exacerbating it? Nuanced ethical and political engagement with new technologies requires avoiding sweeping generalisations. But it is clear that, as a general principle, if using algorithmic decision-making in a particular decision domain moves us further away from justice, then we have a strong presumption against using it in that domain.

LS: Could we make AI more just by improving algorithms on a technological level, for example by defining mathematically what a fair algorithm would look like?

AZ: Computer scientists and statisticians (especially those working in a new, interdisciplinary research subfield called FATML—Fairness, Accountability and Transparency in Machine Learning) have done a lot of important work recently. Amongst other things, FATML researchers have thought critically about whether simply distributing the same probabilities of obtaining a particular outcome across different socio-demographic groups is actually sufficient for an algorithmically fair procedure. The trouble is that there are multiple plausible mathematically defined fairness metrics, and we cannot optimise for all of them at the same time—they are mathematically incompatible. Furthermore, as many FATML researchers have pointed out, algorithmic models will always interact dynamically with the world.

This sounds simple and intuitive, but it has dramatic implications for how we can and should tackle problems like unfair algorithmic decision outcomes. Assume that we manage to develop an algorithm that actually does adhere to whatever broadly plausible mathematical fairness desiderata we have articulated in the design stage, and then we deploy it in the world. But, unsurprisingly, the world is shaped by a history of inequality and injustice, and technological models interact dynamically with the social world. So, whatever new data gets fed back into our algorithmic model will reflect those inequalities in our status quo to some extent. So then, a decision rule within that algorithm that seemed ostensibly fair in the beginning may not ultimately deliver truly equitable results. One consequence we might draw from this insight that models interact dynamically with the world is to say, “well, it’s not decisive how well we mathematically articulate what the algorithm is supposed to be doing and in what way it’s supposed to be doing it, if there is always going to be this dynamic interaction.”

That would be a bad conclusion for us to reach, I think. To address the model-world interaction problem in a more meaningful way, we must ask: what is the purpose of using algorithmic decision making for a given problem in the first place? What—and indeed whom—are we defining as a problem? What sort of data are we trying to look at, and does that data really help us solve the problem? Which populations are most affected by technological innovation, both positively and negatively? In sum, we need to take a broader, socially and politically informed view—not a purely technological view—on algorithmic fairness.
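[Editor’s note: the incompatibility between fairness metrics that Zimmermann mentions can be seen in a minimal sketch with made-up numbers. When base rates differ between two groups, even a perfectly accurate classifier equalises error rates across groups yet still violates demographic parity (equal rates of positive predictions) — the metrics pull in different directions by construction.]

```python
# Minimal sketch with invented data: two groups with different base rates,
# and a classifier that happens to predict every label correctly.

def positive_rate(preds):
    """Share of individuals predicted positive (demographic parity metric)."""
    return sum(preds) / len(preds)

def false_positive_rate(labels, preds):
    """Share of true negatives wrongly predicted positive (error-rate metric)."""
    negatives = [p for label, p in zip(labels, preds) if label == 0]
    return sum(negatives) / len(negatives)

# Group A: base rate 0.5; every prediction is correct.
labels_a, preds_a = [1, 1, 0, 0], [1, 1, 0, 0]
# Group B: base rate 0.25; again, every prediction is correct.
labels_b, preds_b = [1, 0, 0, 0], [1, 0, 0, 0]

# Error rates are perfectly equalised: no false positives in either group ...
assert false_positive_rate(labels_a, preds_a) == 0.0
assert false_positive_rate(labels_b, preds_b) == 0.0

# ... yet demographic parity fails: predicted positive rates differ.
print(positive_rate(preds_a), positive_rate(preds_b))  # 0.5 0.25
```

[This is the same structural tension behind the formal impossibility results in the FATML literature: with unequal base rates, one cannot in general satisfy calibration and equalised error rates at once, so choosing a fairness metric is already an ethical judgment call.]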

LS: How can political philosophy help push us to think critically about AI and machine learning?

AZ: I think that one particularly useful resource in political philosophy is the long tradition of thinking about theories of justice. These are theories about how we should distribute benefits and burdens in society, but also about how we should relate to each other. So, it’s not only about allocating goods and benefits to people, but also about thinking about the social relations and structures that we ought to create and maintain: what it means to respect each other as equals, and what we owe each other on an interpersonal level. What does it mean to treat each citizen in our society with respect, given that there are going to be some inequalities that we possibly can’t eliminate? For example, we can’t change the fact that people are born with different natural talents and abilities, but most of us think that your access to basic resources and basic rights shouldn’t depend on those ‘undeserved’ features that you are born with. In addition, it is also plausible to think that if an individual or group is structurally disadvantaged in society, we should prioritise the interests and the voices of members of that group in processes of collective decision-making in some form. So, if we are going to have inequalities in our society, those inequalities ought to be arranged in a way that benefits those who are worst off the most. Many analytic political philosophers, especially those working in the 20th century, have been thinking about that question. In addition, feminist philosophers and critical race theorists have called our attention to the idea that we should not treat the social and political status quo as just by default: justice is not only about ensuring compliance with the rules governing our existing social institutions—it is also about transforming those institutions as a whole, and about being accountable for historical patterns of injustice.

In addition to theories of justice, another line of thought in political philosophy that is relevant for AI is the debate on disagreement and dissent in democratic societies. Thinking about disagreement is, I think, very important when we think about design decisions linked to AI and machine learning. To give a concrete example, suppose that we are using algorithmic tools to predict the risk that a criminal offender will reoffend in the future. One such algorithm is called COMPAS—a tool that various US states have been using for a few years now to determine whether an offender is ‘high risk’ or ‘low risk’—and according to a well-known investigation by ProPublica, it was more likely to classify Black defendants as ‘high risk’ than similar white defendants. Of course, what it means to be a ‘high risk individual’ is subject to disagreement—whether those who design algorithmic tools explicitly acknowledge that or not. Algorithms are not inherently neutral, because we still have to make conscious choices when we design systems, including choices influenced by implicit bias and stereotypes. Of course, we could try to approximate what riskiness in this context means by looking at things like: has this person committed previous offences? Is she in an unstable economic situation? Do her social networks include a lot of other people who recidivate? Some of those variables might be more appropriate to use than others, right? For instance, one might think it’s inappropriate to infer from the behaviour of your friends and family whether you will in fact be guilty of criminal conduct again in the future. Some predictions might be more ethically and politically defensible than others. Similarly, we have to be attuned to the fact that even if you have a list of prior offences, that might be partially dependent on your social context, which we may want to consider as a mitigating factor.
But it is often difficult to do just that when using algorithmic decision-making instead of human judgment: using tools like COMPAS doesn’t allow us to take context into account sufficiently, and it simultaneously obscures the extent to which underpinning concepts in our algorithmic model are subject to wide social and political disagreement.

LS: Would it be right to say that disagreement is embedded in even the earliest stages of design?

AZ: Yes. Consider the use of algorithmic decision-making in credit scoring or in hiring. When designing algorithmic tools for this purpose, we have to come up with a way of measuring who counts as ‘a good employee’ or as ‘creditworthy’. But those are not straightforward, unambiguous, and uncontroversial concepts. There are many ways of cashing them out, and if we do not think hard about what kinds of concepts we should be working with, bad things can happen. Amazon, for instance, was in the news recently because they used an algorithmic hiring tool that ended up having a disparate impact that disadvantaged women applicants, because the majority of applicants who had been successful at Amazon so far had been male. Suppose, for example, that an algorithmic hiring tool takes into account the number of years of coding experience for each applicant. Boys tend to be encouraged to start coding earlier in life than girls—but the number of years of coding experience might be differently predictive for different genders in terms of future success as a ‘good employee’ in the tech industry. So, a female Amazon applicant with five years of coding experience might be just as well-suited as a male applicant with ten years of coding experience, if their current skill set is on par—but if the algorithm ranks ‘years of coding experience’ much more highly than ‘current skill set’, the algorithm will keep favouring male applicants.

The bottom line is: concepts always have moral and political baggage. If we fail to engage with that problem, we risk developing technology that becomes less and less representative of what the world is really like, in all its complexity. We risk ending up with technology that isn’t merely inaccurate, in the sense that it doesn’t faithfully represent the real world, but also with technological tools that fail differently for different people.
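[Editor’s note: the weighting problem in the hiring example can be sketched in a few lines. This is a hypothetical linear scoring rule with invented weights and applicant numbers, not Amazon’s actual model.]

```python
# Hypothetical linear ranking rule (invented weights, not any real system):
# 'years of coding experience' is weighted far above 'current skill set'.

def rank_score(years_experience, skill, w_exp=0.8, w_skill=0.2):
    return w_exp * years_experience + w_skill * skill

# Two applicants with identical current skill; one started coding earlier.
score_longer_exp = rank_score(years_experience=10, skill=9)   # 8.0 + 1.8 = 9.8
score_shorter_exp = rank_score(years_experience=5, skill=9)   # 4.0 + 1.8 = 5.8

# The heavily weighted proxy dominates, so equal current skill never
# compensates for a later start:
assert score_longer_exp > score_shorter_exp
```

[If the proxy feature correlates with gender because of who is encouraged to start coding early, a rule like this reproduces the disparity no matter how skilled the disadvantaged applicants currently are — the bias lives in the conceptual choice of what ‘good employee’ means, before any data arrives.]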

LS: Earlier, you mentioned that the wider public might not necessarily be fully informed about what algorithmic decision-making and AI, in their current form, actually are. What are some of the most crucial areas about which the public is mis- or underinformed?

AZ: I think that one misconception is that a lot of people think of AI as something akin to the highly sophisticated, autonomous artificial agents that we see in dystopian sci-fi movies: those kinds of ‘general’ AI would be able to operate across a whole range of different tasks, and would not be dependent on humans pre-defining decision problems for them. That’s not the kind of AI we are dealing with now, or even in the foreseeable future. Right now, we are using domain-specific tools (sometimes called ‘narrow AI’) which are far away from being conscious, autonomous agents similar to humans. Irrespective of that fact, humans often attribute some form of agency to technological systems. This is dangerous: whenever the use of algorithmic decision-making leads to unjust outcomes, we can’t exclusively blame the technology. The agents which we have to hold accountable instead are human decision-makers—including private corporations who develop technology, and governments who choose to procure and deploy it. And once it becomes well-known that technology leads to unjust outcomes, we also need to start thinking of ourselves—regular citizens—as people who are responsible for addressing problems like algorithmic injustice together, in the arena of democratic decision-making. Without that, I think citizens are not truly living up to their collective responsibility of safeguarding, and indeed demanding, equal freedom and justice for all.

LS: In your recent essay, “Technology Can’t Fix Algorithmic Injustice”, published in the Boston Review earlier this year, you make a bold claim: “there may be some machine learning systems that should not be deployed in the first place, no matter how much we can optimise them”. That argument stands in clear contrast to some common ideas and aspirations in the tech industry, which rely on a sense of limitless potential, efficiency, and continuous optimisation in relation to AI. What factors contribute to making some machine learning systems unfit for deployment, and perhaps even dangerous for society?

AZ: I think that one good example of the so-called non-deployment argument – the argument that we just shouldn’t deploy particular types of AI systems in particular domains to solve particular problems, no matter how much we can optimise AI in that domain – is the current debate about facial recognition technology, which is used in law enforcement, for instance. What many people do not know is that fifty percent of American adults are currently in a facial recognition database that law enforcement has access to. These databases don’t only include mugshots, they also include images from drivers’ licenses. So, a huge part of the population is already assessable by facial recognition systems, here and now. In several American cities, including San Francisco, activists argued that they did not want their cities to deploy these tools, and subsequently, these cities decided to ban them. One primary concern was the fact that so many people are represented in those databases without clear recourse mechanisms: there isn’t a clear due process path for people who get assessed inaccurately by those facial recognition tools. The potential for mass surveillance obviously affects all citizens, but there is also a more specific, injustice-based concern: these kinds of tools have been empirically shown to be less accurate for people of colour. If you are already disadvantaged in society by structures of racial injustice, then the use of facial recognition tools will oppress you further.

LS: How should tech and policy practitioners approach AI ethics?

AZ: Not everybody in the tech industry has to get a PhD in philosophy—but practitioners need to start seeing themselves as people who are already unavoidably making decisions of ethical and political significance. There is no such thing as a ‘purely technological issue’—technological choices are to some degree always political and ethical choices. Also, I think it’s important for practitioners to bear in mind that ethical reflection is never a process that is done unilaterally, and it’s never a process that is done, so to speak. We shouldn’t approach AI ethics as something like a checklist. We should not think of ethics as a final ‘quality control’ check that we simply run once after we’ve developed a particular technology. We should have ethical goals in mind before we deploy, but we should also be very open to changing those ethical goals once we do deploy and once we see what effects that technology has in the world. Recall the problem of model-world interactions that I mentioned before: because of it, we might simply not foresee certain problems, we might not be able to accurately forecast the magnitude of problems, we might not be able to predict how different technological systems interact with each other, and we may not be able to accurately predict how humans change their behaviour once they know they are subject to algorithmic assessments. All of these problems require that we keep our ethical critique going, rather than approaching it as a static, one-time process.


Wow, this is a really great post! So much so, in fact, that I feel the need to break it down. In what follows I only focus on the first part of the interview: understanding AI ethics (more or less the first four questions).

This formulation reminded me of medicine. Clinical doctors are not scientists: they have a different lineage. Their commitment is not to vanquishing the illness, but to the well-being of the patient. This entails navigating the exact same trade-offs that Zimmermann talks about here. You see this very well in the COVID-19 debate: epidemiologists tend to be closer to the science side of things. They build and interpret models, do statistics and so on. Their recommendations optimize for indicators like caseload, hospitalizations and so on. Infectious disease specialists are closer to clinical practice, and they tend to strike a balance across many different factors, some of them having nothing to do with medicine and health. I was very impressed by an interview with Johan Giesecke (thanks @hugi for sharing it), who started out as an infectious disease specialist and ended up becoming Sweden’s main epidemiologist. He talks a lot about democracy and economic well-being; you can hear him navigate the trade-offs as he speaks.

The parallel with medicine is also interesting in that clinical doctors consider “evidence” things that are very different from the “evidence” of their research colleagues. This is fundamental: medical research deals in statistics. It tells you that, in the trial of new drug X, 70% of the people in the treatment group experienced a body weight loss of 3% or more, against only 24% in the control group. What it does not tell you is whether drug X will help this particular patient, right now, get better. @markomanka once told me that technologists treat medicine as if it consisted only of diagnosis: once you get the diagnosis right, the treatment protocol follows. This is completely false (“my diabetes is not like your diabetes”). What’s more, it will always be false, because human health is so multidimensional that statistical reasoning can never “see” the individual case. You would need clinical trials with the whole human population!

And this brings me to the topic, dear to me, of data.

This is true. “Data” are translations of messy reality into a formal structure, operated by biased, overworked, careless, unreliable humans. And that is in the best-case scenario. We, however, do not live in the best-case scenario: we live in a scenario where major governments will tell you, with a straight face, things like “today we report 325 COVID-19 deaths. Of these, 87 happened yesterday in the country’s hospitals; 238 happened in retirement homes, between yesterday and last Tuesday, sorry, there were delays in collecting the data. Ah, and of the people that died in the retirement homes 37% tested positive for COVID, we did not test the others”.

In a situation like this, you do not “crunch” the data. You circle them like you would do with a dangerous quicksand pit. You poke at them with a stick, assume they are unreliable and try to determine how unreliable they are. What you don’t do is feed them to a model, happily publish the result (with the obligatory sprinkling of dubitative expressions) and then protest “but these are the only data we have!” when called out.

Sure. But ML does optimize. If it does not optimize, it’s not ML.

Intuitively, I do not see how narrow AI could be a credible candidate method for making decisions. It can be a way to process information – in medicine, to diagnose. But when it comes to the nitty-gritty… I don’t know, I guess I am in the skeptics’ camp.