
AI Blindspots card set 2.0: development phase

The AI Blindspots cards are divided into three phases (planning, development and implementation). On this page, you can find the AI Blindspots of the development phase: the phase in which you design, develop and test your data application or AI system.

Each AI Blindspots card contains:

  • A set of questions to help you uncover this blindspot;
  • A use case that illustrates the importance of considering the blindspot;
  • A number of tools and tricks to help you detect and mitigate the blindspot.

1. Discrimination by proxy

You are not allowed to discriminate against people on the basis of data categories such as gender, ethnicity, religion or race. Most organisations avoid this by not collecting this type of data or by excluding it during feature selection. But have you considered how proxy-data categories can lead to the same discrimination? Shoe size, for example, is a proxy for gender.

HAVE YOU CONSIDERED?

  1. Specific exceptions or practices in the context in which you are implementing your AI system?
  2. Inviting affected stakeholders to stress test your system against historical biases?
  3. Identifying and removing features that are correlated with vulnerable social groups (see the sketch below)?
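
A minimal way to screen for such proxies (Q3) is to check how strongly each candidate feature correlates with a protected attribute. The sketch below assumes a small, hypothetical pandas DataFrame and an illustrative threshold of 0.7; a Pearson correlation only catches linear relations, so treat it as a first pass rather than proof that no proxies remain.

```python
import pandas as pd

# Hypothetical dataset: 'gender' is the protected attribute we must not
# use; the remaining columns are candidate model features.
df = pd.DataFrame({
    "gender":    [0, 1, 0, 1, 0, 1, 1, 0],
    "shoe_size": [37, 43, 38, 44, 36, 42, 45, 39],
    "age":       [34, 29, 51, 40, 23, 61, 35, 47],
})

protected = "gender"
threshold = 0.7  # assumption: flag strongly correlated features

# Absolute Pearson correlation of every feature with the protected
# attribute; high values suggest the feature may act as a proxy.
correlations = df.corr(numeric_only=True)[protected].abs().drop(protected)
proxies = correlations[correlations > threshold]
print(proxies)  # here shoe_size is flagged as a proxy; age is not
```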

HOW NOT TO

An AI system that predicts which patients would benefit from extra medical care prioritised healthier white patients over black patients who were more at risk. The algorithm was based on how much a patient would cost the healthcare system in the future, but did not take into account that black patients spend less on medical care than white patients with the same chronic conditions.

TOOLS & TRICKS

2. Explainability

Why is explainability important? The predictions or recommendations generated by your AI system can be unclear or surprising. When creating an AI system, you have a responsibility to clearly inform users about the underlying technical logic of the system and how its predictions or recommendations are generated.

HAVE YOU CONSIDERED?

  1. If people trust the choices made by your system?
  2. What the impact is of having a prediction generated by an AI system rather than by a human?
  3. How you can interpret or explain the choices of your AI system (see the sketch below)?
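
One generic, model-agnostic way to interpret a system's choices (Q3) is permutation importance, available in scikit-learn. The sketch below uses a public dataset and a random forest purely as placeholders for your own data and model.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Train a model on a public dataset as a stand-in for your own system.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Permutation importance: shuffle one feature at a time and measure how
# much the test score drops. Large drops mark features the model relies on.
result = permutation_importance(model, X_test, y_test, n_repeats=10,
                                random_state=0)
for name, score in sorted(zip(X.columns, result.importances_mean),
                          key=lambda item: -item[1])[:5]:
    print(f"{name}: {score:.3f}")
```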

HOW NOT TO

A medical authority in the US used an AI system to determine reimbursements for disabled people. However, a court ruled that these reimbursement decisions were not permissible because the decisions of the AI system could not be explained.

TOOLS & TRICKS

3. Performance balance

When determining an AI system’s metrics for success, you must weigh optimal performance against the risk of negatively impacting vulnerable social groups.

HAVE YOU CONSIDERED?

  1. If the chosen performance indicators might steer the AI system away from its original purpose?
  2. Which performance indicators are necessary and what the impact of these indicators will be on vulnerable social groups?
  3. How statistically accurate your AI system is?

HOW NOT TO

AI can help by screening for cancer. However, if the system is optimised to detect all potential persons with cancer, this will result in a higher number of false positives, which can cause unnecessary anxiety among people without cancer.
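
The sketch below illustrates the screening trade-off with hypothetical labels and probabilities: lowering the decision threshold raises recall (fewer missed cancers) but lowers precision (more healthy people flagged as positive).

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical screening output: true labels and model probabilities.
y_true = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0])
y_prob = np.array([0.9, 0.8, 0.4, 0.7, 0.5, 0.35, 0.3, 0.2, 0.15, 0.1])

# Lowering the decision threshold catches more true cases (recall up)
# but flags more healthy people as positive (precision down).
for threshold in (0.6, 0.4, 0.2):
    y_pred = (y_prob >= threshold).astype(int)
    print(threshold,
          "recall:", round(recall_score(y_true, y_pred), 2),
          "precision:", round(precision_score(y_true, y_pred), 2))
```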

TOOLS & TRICKS

  • Q1 & Q2: interviews with domain experts, IoT stress test (part of the Internet of Things Design Kit)
  • Q3: check your method for statistical accuracy, intermediate/prototype testing

4. Inclusion/omission check

Your AI system might be beneficial to most people, but have you considered how specific people might be worse off? Consider whether your system is inclusive of economically vulnerable persons, people with lower digital literacy or people with a disability.

HAVE YOU CONSIDERED?

  1. How your system might exclude (vulnerable) people?
  2. How people might be digitally excluded by your system?
  3. How to minimise the number of people negatively affected by your AI system?

HOW NOT TO

AI systems are often perceived as enablers for digital inclusion. They can for example detect atypical browsing behaviours and thus identify people’s difficulties when browsing the Internet. But what if things are the other way around and your AI system has a negative effect on (digitally) vulnerable people? How will you ensure your AI system is made for all and can be used by all?

TOOLS & TRICKS

5. Availability of data

The data on which you want to build your AI system may not be available or easily accessible to you, or may not be allowed to be shared with other actors.

Envisioned job profile(s): IT

HAVE YOU CONSIDERED?

  1. Are the data you need for your AI system available and accessible to you?
  2. Are the data you want already digitised?
  3. If not, will the benefits of being able to access the data outweigh the effort and cost of digitising it?

HOW NOT TO

Major advances could be made in treating multiple sclerosis (MS) if an AI system could analyse the data from MS patients all over Europe. Unfortunately, the data are in many cases locked away, unfindable or unreadable by the system.

TOOLS & TRICKS

  • Follow the FAIR data principles when you are collecting/storing data.
  • Q3: Discuss the ROI of digitising the data with management

6. Contextual factors

The implementation of an AI system is preceded by a great deal of research and development in order to maximise the functioning of the system. But what if the predetermined context in which the system is to function changes and no longer corresponds to what was analysed? As a result, the system might not work as accurately as estimated.

Envisioned job profile(s): IT

HAVE YOU CONSIDERED?

  1. Do you regularly compare the training and testing data with the current situation?
  2. Do the input data and predicted values align with the expectations?
  3. If needed, do you have a plan to remedy or to phase out the use of the AI system?

HOW NOT TO

A smartphone app was developed that allows patients to test whether they are infected with the coronavirus by coughing into their smartphone. The app was trained on a large set of cough recordings. The app is very functional and a lot of people use it. However, when the coronavirus mutates, the app is not retrained with new data, causing it to miss a significant number of positive cases.
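
A lightweight safeguard against such context changes is to routinely compare incoming production data with the data the system was trained on. The sketch below is a minimal illustration with synthetic numbers and an assumed tolerance of half a training standard deviation; real monitoring would track many features and use more robust tests.

```python
import numpy as np

# Hypothetical monitoring check: compare a summary statistic of recent
# production inputs against the training data the system was validated on.
rng = np.random.default_rng(0)
training_feature = rng.normal(loc=0.0, scale=1.0, size=5000)
recent_feature = rng.normal(loc=0.6, scale=1.0, size=500)  # context changed

train_mean, train_std = training_feature.mean(), training_feature.std()
drift = abs(recent_feature.mean() - train_mean) / train_std

# Assumption: more than half a training standard deviation counts as drift.
if drift > 0.5:
    print(f"Drift score {drift:.2f}: retrain or review the model")
```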

TOOLS & TRICKS

7. Dataset shift

A significant difference between your training and testing datasets is known as ‘dataset shift’. This can heavily impact the performance of your algorithms.

HAVE YOU CONSIDERED?

  1. Whether a systematic flaw in the data collection or labelling process causes a non-uniform selection of training examples from the population, biasing the training of your AI system?
  2. Whether your data are affected by shifts in time and location?

HOW NOT TO

If certain breeds are missing from the training set of an image classification system for cats and dogs, the test set will reveal that not all images can be classified correctly.
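
One common way to detect such a shift is to compare the distribution of a feature in the training and testing sets, for example with a two-sample Kolmogorov-Smirnov test. The sketch below uses synthetic data to keep the example self-contained.

```python
import numpy as np
from scipy.stats import ks_2samp

# Hypothetical feature values drawn for a training and a testing set.
rng = np.random.default_rng(1)
train = rng.normal(loc=0.0, scale=1.0, size=1000)
test = rng.normal(loc=0.8, scale=1.0, size=1000)  # shifted distribution

# The Kolmogorov-Smirnov two-sample test compares the two distributions;
# a small p-value signals that train and test differ (dataset shift).
statistic, p_value = ks_2samp(train, test)
print(f"KS statistic {statistic:.3f}, p-value {p_value:.1e}")
```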

TOOLS & TRICKS

8. Data minimisation

According to the General Data Protection Regulation (GDPR), you are not allowed to collect more (personal) data than needed for the functioning of your AI system. More data also means more data analysis, and more cost and effort to process and analyse the data.

Envisioned job profile(s): IT & management

HAVE YOU CONSIDERED?

  1. What data should be collected for the proper functioning of the AI system?
  2. How long should the data be stored for the proper functioning of the AI system?
  3. Have data been collected that are not strictly necessary for the functioning of the system?
  4. Can you comply with the data subjects’ rights under the GDPR?

HOW NOT TO

A hospital has set up an experiment for a group of patients with a specific disease and uses an app, developed by a company, that gathers data on a standard set of parameters. Unfortunately, this set includes data that are unnecessary for analysing the disease of this patient group.

TOOLS & TRICKS

  • Q1 & Q2: Interviews with domain experts
  • Q3: If necessary, set up a process to reduce and prevent the collection of more personal data than needed for the functioning of the system
  • Q4: Sit down with the legal department

9. Security

Because of the sensitivity of healthcare data, an adequate security policy for the collection, storage and sharing of data with other parties is required.

Envisioned job profile(s): IT & management

HAVE YOU CONSIDERED?

  1. Do you have a data security policy?
  2. Did you assess who can and cannot access the data, and more importantly why actors can or cannot access the data?
  3. Do you register who has accessed the data, at what time and for what purpose (see the sketch after this list)?
  4. Is there a procedure to report violations?
  5. Have you tested whether the AI system can withstand hacking attempts? Do you repeat this test periodically?
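
For Q3, even a simple append-only audit trail goes a long way. The sketch below is a minimal illustration that writes one JSON line per data access; the user, record and purpose values are hypothetical, and a production system would also protect the log itself against tampering.

```python
import json
from datetime import datetime, timezone

# Minimal audit trail: append one JSON line per data access, recording
# who accessed which record, when, and for what purpose.
def log_access(user: str, record_id: str, purpose: str,
               logfile: str = "access_log.jsonl") -> None:
    entry = {
        "user": user,
        "record": record_id,
        "purpose": purpose,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    with open(logfile, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")

log_access("dr_jansen", "patient-4711", "review MRI diagnosis")
```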

HOW NOT TO

Two doctors are experimenting with a new AI system and store the medical images that the AI system analyses on a server that is not fully secured. By doing so, they potentially expose to hackers both the diagnoses and the individuals to whom they relate.

TOOLS & TRICKS

  • Q2 & Q3: Data Protection Impact Assessment, yearly internal audit
  • Q5: Penetration test

10. Human vs machine

An AI system can facilitate the human aspect of care by giving care professionals more time to interact with patients. Yet the suggestions made by the AI system about the right diagnosis or treatment can deviate from the personal intuition, ‘gut feeling’ and evaluation of the care provider. It can be difficult for care professionals to strike a balance between the human assessment and that of the machine.

Envisioned job profile(s): doctors, nursing staff & management

HAVE YOU CONSIDERED?

  1. Are the results of the AI system reviewed by a human professional?
  2. Is there a policy in place on how the assessment of healthcare professionals relates to that of the AI system?
  3. Can healthcare professionals freely report their concerns with regard to the decisions of an AI system?
  4. Who is responsible for (1) the collection, storage, sharing and analysis of the data, and (2) choosing the right treatment and monitoring the patient?
  5. Who has the final responsibility?

HOW NOT TO

An AI system analyses a patient’s tumour, compares it with its database and makes a suggestion about the zone that needs to be irradiated. Based on this suggestion, the radiologist makes the final call. One day, he follows the suggestion and overlooks an important parameter that was not submitted to the AI system and that invalidated the suggestion. The patient dies because the radiation damaged a vital part of the brain. The radiologist believes the AI system is to blame.

TOOLS & TRICKS

  • Inform care professionals about the functioning of the AI system and about the limits of its performance.
  • Q2: Organise a discussion about how to deal with the ‘authority’ of an AI system and bundle the outcome into a clear policy.
  • Q4 & Q5: Log the choices that were made during the collection/storage/sharing/analysis of the data (e.g. by making use of the Data Collection Bias Assessment)

11. Data governance & privacy

Privacy with regard to the data subjects is crucial when making use of health data. However, privacy is not a clear-cut concept. Data can, for example, be made anonymous by removing the names of the patients. Yet, because of other parameters in the dataset (which are often particularly interesting to feed AI models), it is sometimes still possible to identify patients. Furthermore, the difference between anonymisation and pseudonymisation is not always clear. And is it legitimate to use the data you have collected?

Envisioned job profile(s): IT, management

HAVE YOU CONSIDERED?

  1. What measures did you take to protect the data subjects and their data?
  2. Can you ensure the anonymity of the data subjects? If not, are they informed about this?
  3. Have you considered how proxy-data categories (e.g. shoe size for gender) can also reveal personal information?
  4. Are you complying with the General Data Protection Regulation (GDPR) and other regulations?
  5. Is it still legitimate to use the data?

HOW NOT TO

A hospital has a large electronic health record that it wants to use to train an AI system to predict the risk of a heart attack. The hospital removed the patients’ names from the data and replaced them with a code, so that patients who need to be informed about their risk of a heart attack can still be retraced. The hospital did not ask the patients for consent.
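
What the hospital in this example did is pseudonymisation, not anonymisation: the code can still be traced back to the patient. Below is a minimal sketch of such a keyed code using only the Python standard library; the secret key is an assumption and would have to be stored separately under strict access control.

```python
import hashlib
import hmac

# Pseudonymisation (not anonymisation): direct identifiers are replaced
# by a keyed code, so the hospital can still re-identify patients who
# must be warned, but outsiders without the key cannot.
SECRET_KEY = b"store-this-key-separately"  # assumption: kept under strict access control

def pseudonymise(name: str) -> str:
    return hmac.new(SECRET_KEY, name.encode("utf-8"), hashlib.sha256).hexdigest()[:12]

print(pseudonymise("Jan Peeters"))  # the same input always yields the same code
```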

TOOLS & TRICKS

  • Q1 - Q4: Because of the sensitive character of medical data, it is a responsible strategy to always ask for consent to collect and use the data and to at least pseudonymise these data
  • Q4: Guide ‘Artificiële intelligentie en gegevensbescherming’; check with the ethics committee which regulations you have to comply with

Downloads

Below, you can find two downloads:

  • A PDF of the AI Blindspots card set.
  • A PDF with two templates for using the AI Blindspots card set. With the first template, you start from an ethical dilemma and use the AI Blindspots card set (workshop methods 1 and 2). You can use the second template for the reverse brainstorm with the AI Blindspots card set (workshop method 4). A filled-in example of the templates is also provided. Visit the main page of the AI Blindspots card set for more information about the methods for using the card set.

The card set was adapted from ‘AI Blindspot’ by Ania Calderon, Dan Taber, Hong Qu and Jeff Wen, who developed the card set during the Berkman Klein Center and MIT Media Lab’s 2019 Assembly Program. The AI Blindspots card set is available under a CC BY 4.0 licence.

The Knowledge Centre Data & Society (Kenniscentrum Data & Maatschappij) adapted the original card set to the Flemish context, in order to support the development of trustworthy AI in Flanders.
