Tue 04-Feb-2020

On Tue 04-Feb-2020 my fellow academics at UCL held a workshop on algorithmic ethics. It was organised by Emre Kazim and Adriano Koshiyama, two incandescently brilliant post-docs from UCL. The broader group is run by Prof. Philip Treleaven, who is a living legend in academic circles and an indefatigable innovator with an entrepreneurial streak.

Algorithmic ethics is a relatively new concept. It’s very similar to AI ethics (a much better-known concept), with the difference being that not all algorithms are AI (meaning that algorithmic ethics is a slightly broader term). Personally, I think that when most academics or practitioners say “algorithmic ethics” they really mean “ethics of complex, networked computer systems”.

The problem with algorithmic ethics doesn’t start with them being ignored. It starts with them being rather difficult to define. Ethics are a bit like art – fairly subjective and challenging to pin down. Off the top of our heads we can probably think of cases of (hopefully unintentional) discrimination against job applicants on the basis of their gender (Amazon), different loan and credit card limits offered to men and women within the same householdi (Apple / Goldman), or online premium delivery services being offered to white residents more often than to black onesii (Amazon again). And then there’s the racist soap dispenseriii (unattributed).

These examples – deliberately broad, unfortunate and absurd in equal measure – show how easy it is to “weaponise” technology without any explicit intention of doing so (I assume that none of the entities above intentionally designed their algorithms to discriminate). Most (if not all) of the algorithms above were AIs which trained themselves on a vast training dataset, or optimised a business problem without sufficient checks and balances in the system.

With all of the above, most of us just know that they were unethical. But if we were to go from an intuitive to a more explicit understanding of algorithmic ethics, what would it encompass exactly? Rather than try to reinvent ethics, I will revert to trusted sources: one of them is the Alan Turing Institute’s “Understanding Artificial Intelligence Ethics and Safety”iv and the other is a 2019 paper, “Artificial Intelligence: the global landscape of ethics guidelines”v, co-authored by Dr. Marcello Ienca from ETH Zurich, whom I had the pleasure of meeting in person at the Kinds of Intelligence conference in Cambridge in 2019. The latter is a meta-analysis of 84 AI ethics guidelines published by various governmental, academic, think-tank, and private entities. My pick of the big-ticket items would be:

  • Equality and fairness (absence of bias and discrimination)
  • Accountability
  • Transparency and explicability
  • Benevolence and safety (safety of operation and of outcomes)

There is an obvious fifth – privacy – but I have slightly mixed feelings about throwing it into the mix with the considerations above. It’s not that privacy doesn’t matter (it matters greatly), but it’s not as unique to AI as the others. Privacy is a universal right and consideration, and doesn’t (in my view) map to AI as directly as, for example, fairness and transparency.

Depending on the context and application, the above will apply in different proportions. Fairness will be critical in employment, provision of credit, or criminal justice, but I won’t really care about it inside a self-driving car (or a self-piloting plane – they’re coming!) – then I will care mostly about my safety. Privacy will be critical in the medical domain, but it will not apply to trading algorithms in finance.

The list above contains (mostly humanistic) concepts and values. The real challenge (in my view) is two-fold:

  1. Defining them in a more analytical way.
  2. Subsequently “operationalising” them into real-world applications (both in public and private sectors).

The first speaker of the day, Dr. Luca Oneto from the University of Genoa, spoke mostly to point #1 above. He talked about his and his team’s work on formulating fairness in a quantitative manner (basically “an equation for fairness”). While the formula was mathematically a bit above my pay grade, the idea itself was very clear, and I was sold on it instantly. If fairness can be calculated, with all (or as much as possible) ambiguity removed from the process, then the result will not only be objective, but also comparable across different applications. At the same time, it didn’t take long for some doubts to set in (although I’m not sure to what extent they were original – they were heavily inspired by some of the points raised by Prof. Kate Crawford in her Royal Society lecture, which I covered here). In essence, measuring fairness seems doable when we can clearly define what constitutes a fair outcome – which, in many real-life cases, we cannot. Let’s take two examples close to my heart: fairness in recruitment and the Oscars.

With my first degree being from a not-so-highly-ranked university, I know for a fact I have been auto-rejected by several employers – so (un)fairness in recruitment is something I feel strongly about. But let’s assume the rank of one’s university is a decent proxy for their skills, and let’s focus on gender representation. What *should be* the fair representation of women in typically male-dominated environments such as finance or tech? It is well documented that women drop out of STEM careers at a high rate – around 40% of them – and it is widely debated whyvi vii. The explanations range from the “hegemonic and masculine culture of engineering” to the challenges of combining work and childcare disproportionately affecting new mothers. What would the fair outcome in tech recruitment be, then? A % representation of women in line with the present-day average? A mandatory affirmative-action-like quota? (If so, who would determine the fairness of the quota, and how?) 50/50 (with a small allowance for non-binary individuals)?

And what about additional attributes of potential (non-explicit) discrimination, such as race or nationality? The 2020 Oscars provided a good case study. There were no women nominated in the Best Director category (a category which historically has been close to 100% male, with exactly one female winner – Kathryn Bigelow for “The Hurt Locker” – five female nominees, zero black winners and six black nominees), and only one black person across all the major categories combined (Cynthia Erivo for “Harriet”). Stephen King caused outrage with his tweet about how diversity should be a non-consideration – only quality (he later graciously explained that this was not yet the case todayviii). Then the South Korean “Parasite” took the Best Picture gong – the first time in Academy Awards history that the top honour went to a foreign-language film. My question is: what exactly would be fair at the Oscars? If it were proportional representation, then some 40% of the Oscars should be awarded to Chinese movies and another 40% to Indian ones, with the remainder split among European, British, American, Latin American, and other international productions. Would that be fair? Should a special quota be reserved for American movies, given that the Oscars and the Academy are American institutions? Whose taste are the Oscars meant to represent, and how can we measure the fairness of that representation?

All these thoughts flashed through my mind as I was staring (somewhat blankly, I admit) at Dr. Oneto’s formulae. The formulae are a great idea, but determining the distributions to measure fairness against… that is much more of a challenge.
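To make “an equation for fairness” a little more concrete, below is a minimal sketch (in Python) of one textbook fairness metric, demographic parity – the gap in positive-decision rates between two groups. To be clear, this is my own illustration, not Dr. Oneto’s formulation, and the decision data is made up; it shows how trivially the number computes once you have decided what the groups and the target distribution should be – which, as argued above, is the genuinely hard part.

```python
# A minimal sketch of one textbook fairness metric (demographic parity).
# This is an illustration only, not Dr. Oneto's actual formulation.
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Absolute difference in positive-decision rates between two groups.

    y_pred : array of 0/1 model decisions (e.g. 1 = invite to interview)
    group  : array of 0/1 protected-attribute labels (e.g. gender)
    """
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    rate_0 = y_pred[group == 0].mean()   # positive-decision rate, group 0
    rate_1 = y_pred[group == 1].mean()   # positive-decision rate, group 1
    return abs(rate_0 - rate_1)          # 0.0 means "fair" under this definition

# Toy screening outcome: 60% of group 0 invited, but only 20% of group 1
decisions = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]
groups    = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
print(demographic_parity_gap(decisions, groups))   # ~0.4
```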

The second speaker, Prof. Yvonne Rogers of UCL, tackled AI transparency and explainability. She covered the familiar topics of AIs being black boxes and the need for explanations in important areas of life (such as recruitment or loan decisions). Her go-to example was AI software scrutinising candidates’ facial expressions during the recruitment process based on unverified science (as upsetting as that is, it’s nothing compared to the fellas at Faception, who declare they can identify whether somebody is a terrorist by looking at their face). While my favourite approach to explainable AI, counterfactuals, was not mentioned explicitly, it was definitely there in spirit. Overall it was a really good presentation on a topic I’m quite familiar with.
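For readers unfamiliar with counterfactual explanations, the toy sketch below shows the core idea: find the smallest change to an input that flips the model’s decision (“you would have been approved if your income were X higher”). The model, feature names and numbers are entirely hypothetical; real counterfactual methods search over many features under plausibility constraints.

```python
# A hypothetical, brute-force sketch of a counterfactual explanation:
# find the smallest change to one input that flips the model's decision.
# The toy model and feature names below are invented for illustration only.

def toy_loan_model(income: float, existing_debt: float) -> bool:
    """Stand-in for an opaque model: approve if debt-adjusted income clears a bar."""
    return income - 0.5 * existing_debt > 30_000

def income_counterfactual(income, existing_debt, step=1_000, max_steps=50):
    """Smallest income increase (in `step` units) that turns a rejection into an approval."""
    for k in range(1, max_steps + 1):
        if toy_loan_model(income + k * step, existing_debt):
            return f"Approved if income were higher by {k * step}"
    return "No counterfactual found within the search range"

print(toy_loan_model(25_000, 10_000))          # False: rejected
print(income_counterfactual(25_000, 10_000))   # Approved if income were higher by 11000
```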

The third speaker, Prof. David Barber of UCL, talked about privacy in AI systems. In his talk, he strongly criticised present-day approaches to data handling and ownership (hardly surprising…). He presented an up-and-coming concept called “randomised response”. Its aim is described succinctly in his paperix as “to develop a strategy for machine learning driven by the requirement that private data should be shared as little as possible and that no-one can be trusted with an individual’s data, neither a data collector/aggregator, nor the machine learner that tries to fit a model”. It was a presentation I should have been interested in – and yet I wasn’t. I think it’s because in my industry (investment management) privacy in AI is less of a concern than it would be in recruitment or medicine. Besides, IBM sold me on homomorphic encryption during their 2019 event, so I was somewhat less interested in a solution that (if I understood correctly) “noisifies” part of personal data in order to make it untraceable, as opposed to homomorphic encryption’s complete, proper encryption.
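For the curious, here is roughly what the randomised-response idea looks like in its classic, Warner-style form (which is how I understood the gist of the talk – it is not necessarily the exact mechanism in Prof. Barber’s paper): each individual randomises their own answer locally, so nobody ever holds a trustworthy individual record, yet the aggregate statistic can still be recovered.

```python
# Classic (Warner-style) randomised response, as a general illustration of the idea:
# each respondent adds noise locally, so the collector never sees a trustworthy
# individual answer, yet the population-level statistic is still recoverable.
# This is a textbook mechanism, not necessarily the exact scheme in the paper.
import random

def randomised_answer(true_answer: bool, p_truth: float = 0.75) -> bool:
    """Report the true answer with probability p_truth, otherwise a fair coin flip."""
    if random.random() < p_truth:
        return true_answer
    return random.random() < 0.5

def estimate_true_rate(reports, p_truth: float = 0.75) -> float:
    """Invert the noise: E[reported rate] = p_truth * true_rate + (1 - p_truth) * 0.5."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) * 0.5) / p_truth

# Simulate 100,000 respondents, 30% of whom hold the sensitive attribute
population = [random.random() < 0.30 for _ in range(100_000)]
reports = [randomised_answer(x) for x in population]
print(estimate_true_rate(reports))   # ~0.30, without trusting any single report
```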

In the only presentation from the business perspective, Pete Rai from Cisco talked about his company’s experiences with broadly defined digital ethics. It was a very useful counterpoint to the at times slightly too philosophical or theoretical academic presentations that preceded it. It was an interesting presentation, but like many others, I’m not sure to what extent it really related to digital ethics or AI ethics – I think it was more about corporate ethics and conduct. That didn’t make the presentation any less interesting, but it inadvertently showed how broad and ambiguous an area digital ethics can be – it means very different things to different people, which doesn’t always help push the conversation forward.

The event was part of a series, so it’s quite regrettable I had not heard of it before. But that’s just a London thing – one may put in all the work and research to try to stay in the loop on relevant, meaningful, interesting events, and some great events will slip under the radar nonetheless. There are some seriously fuzzy, irrational forces at play here.

Looking forward to the next series!

Sources:
iv https://www.turing.ac.uk/sites/default/files/2019-06/understanding_artificial_intelligence_ethics_and_safety.pdf

v https://arxiv.org/ftp/arxiv/papers/1906/1906.11668.pdf

vi https://www.salon.com/2019/02/19/why-women-are-dropping-out-of-stem-careers/

vii https://www.sciencedaily.com/releases/2018/09/180917082428.htm

viii https://ew.com/oscars/2020/01/27/stephen-king-diversity-oscars-washington-post/

ix https://arxiv.org/pdf/2001.04942.pdf