This post is the third entry in our series on the Ethical AI principles.
In the previous post, we discussed the first principle of the Ethical AI concept. In this post, we will discuss another principle of Ethical AI which refers to ‘ensuring an unbiased AI.’
We will start with a discussion of the bias in AI problem before moving on to the reasons for AI bias, the greatest challenges in overcoming it, and how to tackle the problem. Let's begin.
Before we dive into the bias in AI problem, let's quickly recap what we learned about 'ensuring an unbiased AI' in the first post of this series on the Ethical AI concept.
In the first post in this series, we learned that one of the main purposes of AI is eliminating bias in processes. Additionally, we learned that an AI system must be checked for negative or harmful human bias during development and maintenance. The goal is to eliminate any kind of bias, whether related to age, race, gender, or sexual orientation, from the operation of the AI system.
It is critical for designers and developers of AI systems to consider the dimensions of diversity relating to culture, people, and systems from the very beginning of the process. Failure to do so may lead to AI systems whose default mode is irrelevant to excluded groups; often, this is where the problem of discrimination or bias in AI starts.
When the concept of AI decision making was first introduced, many people believed that it would solve the long-unresolved problem of bias in society. It was thought that computers wouldn't be affected by bias since they have no inherent views on things like age, race, gender, and sexual orientation.
This was largely true in the early days, when computers had limited functionality. However, it changed with the rollout of machine learning. The Big Data explosion, combined with falling costs of the computing power needed to process it, posed a new challenge.
In the past, the importance of high-quality data was succinctly summed up by the term 'garbage in, garbage out'. This meant that if you fed computers poor data, the results they returned wouldn't be very helpful or favorable.
In the early days of AI, this was a problem only for computer programmers and analysts. Today, however, it's a problem for everybody, since computers are routinely asked to make decisions about who gets invited to a job interview, who qualifies for a mortgage, and other important things that affect people's lives.
Perhaps the biggest example of bias in AI is the algorithm used by US parole authorities to predict the probability of criminals reoffending, which turned out to be biased against African-Americans. This clearly shows that bias in AI exists. The problem must be tackled and resolved before it has a lasting impact on society.
Today, most AI applications are based on deep learning, a category of AI algorithms that works by finding patterns in data. Because the patterns a deep-learning algorithm finds reflect the data it was trained on, these applications can affect people's lives and perpetuate bias in recruitment, security, retail, and other domains.
So we know that bias in AI exists, but how does the problem arise in the first place? Many people assume it is due to bias in the training data. While biased training data is one cause of AI bias, it is only part of the problem rather than the whole of it. In fact, bias can creep into an AI long before any data is collected. While bias can enter at many stages of the deep-learning process, the following three stages deserve the most attention.
When creating a deep-learning model, the very first decision computer scientists make is what they want to achieve. For example, a credit card company may want to create a deep-learning model to predict the creditworthiness of a customer. The problem is that 'creditworthiness' is a vague concept.
To convert 'creditworthiness' into a computable quantity, the company must make a decision: does it want to maximize the number of loans that get repaid, or does it want to maximize its profit margins? Once a decision is made, the company can define creditworthiness in the context of the selected goal.
This is where AI bias creeps in. Say the company chooses to maximize its profit margins. Even though this isn't the company's intention, the deep-learning algorithm may end up engaging in predatory lending behavior in order to maximize profit, leading to biased or discriminatory lending decisions the company never intended.
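To make the distinction concrete, here is a minimal, hypothetical sketch of how the two objective choices produce different training labels from the same loan records. The records, the helper names (`label_repayment`, `label_profit`), and the assumed 20% loss on default are all illustrative, not a real credit model:

```python
# Hypothetical loan records: (amount, interest_paid, repaid_in_full)
loans = [
    (1000, 150, True),
    (5000, 900, False),  # defaulted, but had already paid heavy interest
    (2000, 100, True),
    (3000, 700, False),  # defaulted after paying very heavy interest
]

def label_repayment(loan):
    """Goal A: maximize the number of loans repaid in full."""
    _, _, repaid = loan
    return 1 if repaid else 0

def label_profit(loan):
    """Goal B: maximize profit margin. Assumes an illustrative 20%
    loss of principal on default; a defaulted loan that accrued
    enough interest can still count as 'good' under this goal."""
    amount, interest, repaid = loan
    profit = interest - (0 if repaid else amount * 0.2)
    return 1 if profit > 0 else 0

labels_a = [label_repayment(l) for l in loans]
labels_b = [label_profit(l) for l in loans]
print(labels_a)  # [1, 0, 1, 0]
print(labels_b)  # [1, 0, 1, 1] -- the last default is 'good' for profit
```

The fourth loan is labeled differently under the two goals: a model trained on the profit labels would learn that loans which default after accruing heavy interest are desirable, which is exactly the predatory pattern described above.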
In training data, bias shows up in two different ways: the data collected either reflects existing biases or does not represent reality. The latter occurs, for example, if you feed a deep-learning algorithm more images of light-skinned faces than dark-skinned faces; the resulting face recognition system will inevitably perform better on light-skinned faces and struggle to recognize darker-skinned ones.
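A simple first defense against this second failure mode is to measure representation before training begins. The sketch below uses purely illustrative group labels and counts to flag any group whose share of the data falls far below an equal share:

```python
from collections import Counter

# Hypothetical per-image group labels for a face dataset
# (the 900/100 split is illustrative only)
skin_tone = ["light"] * 900 + ["dark"] * 100

counts = Counter(skin_tone)
total = sum(counts.values())
shares = {group: n / total for group, n in counts.items()}
print(shares)  # {'light': 0.9, 'dark': 0.1}

# Flag any group represented at less than half its equal share
expected = 1 / len(counts)
underrepresented = [g for g, s in shares.items() if s < expected * 0.5]
print(underrepresented)  # ['dark']
```

A check like this won't fix the bias, but it surfaces the imbalance before the model is trained rather than after it is deployed.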
The other way bias shows up in data collection is illustrated by what happened at Amazon. The e-commerce giant found that its internal recruiting tool had been dismissing female candidates. On investigation, the company discovered that the tool had been trained on past hiring decisions, which favored men over women.
Another stage of the deep-learning process at which bias occurs is data preparation. This is the stage at which you choose the attributes you want the deep-learning algorithm to consider. Although the two may seem similar, data preparation is very different from problem framing.
In data preparation, you can use different attributes to train a model for the same goal, or similar attributes to train models for different goals. For example, if you were modeling 'creditworthiness', the customer's age or income could be an attribute.
For Amazon's recruiting tool, on the other hand, the candidate's gender, years of experience, or education level could be an attribute. This is often referred to as the 'art' of deep learning: the attributes you choose to consider or ignore can significantly affect your model's prediction accuracy. And while measuring the impact on the model's accuracy is easy, measuring the impact on the model's bias is difficult.
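As an illustration, here is a hypothetical sketch of attribute selection: the same kind of record yields different feature sets depending on the goal, and dropping a sensitive attribute like gender does not guarantee the remaining attributes carry no trace of it. All field names and values are invented:

```python
# Hypothetical applicant records (values are illustrative)
records = [
    {"age": 34, "income": 52000, "gender": "F", "years_exp": 8,  "education": "MSc"},
    {"age": 45, "income": 87000, "gender": "M", "years_exp": 20, "education": "BSc"},
]

# Attribute choice for a creditworthiness model
credit_features = [(r["age"], r["income"]) for r in records]

# Attribute choice for a recruiting model. Note that even though
# 'gender' is excluded, attributes like years of experience can
# still correlate with it in historical data.
recruiting_features = [(r["years_exp"], r["education"]) for r in records]

print(credit_features)      # [(34, 52000), (45, 87000)]
print(recruiting_features)  # [(8, 'MSc'), (20, 'BSc')]
```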
Now that you understand the problem of bias in AI and why it happens, you may want to fix it. However, solving the problem of AI bias is easier said than done, due to the following challenges:
One of the greatest challenges in fixing AI bias is that bias in a deep-learning model isn't obvious from the start. It is only much later in the process that you start to see the impact of your data and design choices, at which point it can be difficult to pinpoint where the bias originated and find a way to eliminate it.
In Amazon's case, engineers reprogrammed the recruiting tool after discovering the bias, re-training it to ignore explicitly gendered words like "women's". However, this did not solve the problem completely: the tool continued to pick up on implicitly gendered words and still tended to favor men over women.
Another challenge in fixing AI bias is imperfect or inappropriately designed processes: many deep-learning practices are built without bias detection in mind. Deep-learning models are checked for performance before deployment, which in theory would be the perfect opportunity to catch bias. In practice, however, that is not how the check works.
Instead, developers randomly split the data into two sets, using one for training and reserving the other for validation once training is complete. This means the data used to check the model's performance carries the same biases as the data used to train it, so prejudiced results go unflagged.
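One hedge against this failure mode is to evaluate the model per group rather than only in aggregate. The sketch below uses invented labels and predictions to show how a respectable overall accuracy can hide a large gap between groups:

```python
# Hypothetical evaluation set: (group, true_label, predicted_label)
# The values are invented to illustrate the point.
examples = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 0),
    ("B", 1, 0), ("B", 0, 0), ("B", 1, 0), ("B", 0, 0),
]

# Aggregate accuracy looks acceptable...
overall = sum(t == p for _, t, p in examples) / len(examples)
print(f"overall accuracy: {overall:.2f}")  # 0.75

# ...but breaking it down by group surfaces the disparity
group_acc = {}
for group in ("A", "B"):
    subset = [(t, p) for g, t, p in examples if g == group]
    group_acc[group] = sum(t == p for t, p in subset) / len(subset)
print(group_acc)  # {'A': 1.0, 'B': 0.5}
```

Here the model never detects a positive case in group B, yet the single aggregate number would pass a naive performance check.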
In addition to the above, some other challenges in fixing the problem of bias in AI are the lack of social context in framing problems for the deep-learning model and a failure to define the concept of fairness in mathematical terms.
Considering the challenges outlined in the previous section, a bias-free deep-learning system may seem unattainable. However, that is not the case: while building one is a real challenge, it is not out of reach. The following are some ways to tackle the problem of bias in AI and enable bias-free deep-learning models and AI systems.
One of the most important steps in solving the problem of bias in AI is defining the attributes that will guide the deep-learning model. For example, if the deep-learning model answers a question as simple as 'what is a human,' then you need to define what 'human' means in that context.
You need to be clear about what it includes and what it doesn't. You need to identify the problem you're solving and everything that could go wrong. Basically, you should know what you're building and what problems the end-user is likely to face. Once you have this information, start a deep review of your data. Review it for possible underrepresentation or bias in outcomes, and think through all the bias-related issues that could arise in the deep-learning process: for example, is any group underrepresented in the data, and do the recorded outcomes reflect past discrimination?
By answering such questions, you will be able to define your problem and end-users carefully. Additionally, you will be able to plan for your outcomes and head off many potential issues by paying careful attention to your training data.
AI systems do only what they're taught: an AI needs to be trained using historical data or examples. Avoid using biased data when training an AI; otherwise, the decisions it makes are likely to be biased as well.
If you were a high school teacher, would you teach your students from textbooks published more than a century ago? Of course not. The same thinking applies when training AI: if you train it on poor or biased data, the outcomes will also be poor or biased.
For example, if the data used to train a deep-learning model came from a time when men were more likely to be hired than women, the AI will learn to prioritize men over women when making hiring decisions. Therefore, make sure the data you use to train an AI does not encode unfair outcomes.
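A quick way to check for this kind of encoded outcome is to compare historical outcome rates across groups before training. The records below are invented for illustration:

```python
# Hypothetical historical hiring records: (gender, hired)
history = [
    ("M", 1), ("M", 1), ("M", 0), ("M", 1),
    ("F", 0), ("F", 1), ("F", 0), ("F", 0),
]

def hire_rate(group):
    """Fraction of applicants in the given group who were hired."""
    outcomes = [hired for g, hired in history if g == group]
    return sum(outcomes) / len(outcomes)

rate_m, rate_f = hire_rate("M"), hire_rate("F")
print(rate_m, rate_f)  # 0.75 0.25

# A gap this large in historical outcomes is a warning sign that a
# model trained on this data will reproduce the bias.
```

The check is deliberately crude; the point is simply to look at the labels before the model does.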
Another piece of advice for tackling the problem of bias in AI is to be careful with how you use data, from how it is collected and labeled to how representative it is of the people your system will affect.
Eliminating bias from AI is not only an Ethical AI requirement; it is also a key factor in the success of an AI system. In this post, we discussed the AI bias problem, including what it entails, why it happens, the challenges in fixing it, and how to tackle it.
In the next article in this series, we will look at the next principle of Ethical AI, another key to fulfilling the requirements of the Ethical AI concept.
Get in touch to learn how our AI-powered solutions can solve your business problem.