AI identifies household items of people in lower-income countries less accurately than those in developed countries

This is an example of AI "bias" that reflects global inequality.

Object recognition algorithms offered by companies such as Google, Microsoft and Amazon are less accurate when asked to identify items belonging to people from low-income countries.

That is the finding of a study conducted by Facebook's artificial intelligence lab. The study shows that AI bias reflects not only inequality within countries, but also inequality between countries.

The researchers tested five of the most popular object recognition services offered by technology giants - Microsoft Azure, Clarifai, Google Cloud Vision, Amazon Rekognition and IBM Watson - to see how accurately they identify household objects. The images were drawn from a data set collected on a global scale.
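To give a sense of what such a test involves, here is a minimal sketch of sending one image to one of the services named above (Amazon Rekognition, through the boto3 SDK) and reading back the labels it returns. The file name and label count are placeholders, and configured AWS credentials are assumed; this is an illustration, not the study's actual evaluation code.

```python
import boto3

# Create a Rekognition client (assumes AWS credentials are already configured).
client = boto3.client("rekognition")

# Read a local image file; "soap.jpg" is a placeholder name.
with open("soap.jpg", "rb") as f:
    image_bytes = f.read()

# Ask the service for up to 10 labels describing the objects in the image.
response = client.detect_labels(Image={"Bytes": image_bytes}, MaxLabels=10)

# Print each predicted label with its confidence score.
for label in response["Labels"]:
    print(f'{label["Name"]}: {label["Confidence"]:.1f}%')
```

Comparing these predicted labels against the known object in each photo, across many households, is what produces the accuracy figures discussed below.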

The data set covers 117 categories of objects (almost everything, from bath soap to sofas), collected from households across many geographic regions and income levels - from a family in Burundi with an average income of 27 USD per month to a family in Ukraine with a monthly income of 10,090 USD.

The researchers found that the object recognition algorithms made roughly 10% more errors when identifying the belongings of a household with a monthly income of around 50 USD than when identifying items from a household earning about 3,500 USD per month. In absolute terms the gap is even larger: the algorithms were 15-20% more accurate at identifying items from American households than items from households in Somalia and Burkina Faso.
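The comparison described here boils down to computing accuracy separately for each income group and looking at the gap. Below is a minimal sketch of that idea using made-up toy results and an arbitrary income threshold; it is not the study's actual data or methodology.

```python
# Hypothetical per-image results: whether a service's labels matched the
# ground-truth object, together with the household's monthly income (USD).
results = [
    {"income": 50, "correct": False},
    {"income": 50, "correct": True},
    {"income": 3500, "correct": True},
    {"income": 3500, "correct": True},
]

def accuracy_by_income(results, threshold=1000):
    """Return (low-income accuracy, high-income accuracy) split at `threshold` USD/month."""
    low = [r["correct"] for r in results if r["income"] < threshold]
    high = [r["correct"] for r in results if r["income"] >= threshold]
    return sum(low) / len(low), sum(high) / len(high)

low_acc, high_acc = accuracy_by_income(results)
print(f"low-income accuracy: {low_acc:.0%}, high-income accuracy: {high_acc:.0%}")
print(f"absolute gap: {high_acc - low_acc:.0%}")
```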

The results of this study "are relatively consistent across the cloud services that provide image recognition," the study's authors said.

Image: how the AI services labeled the two photos of soap.

In the picture above, the researchers had the object recognition systems "look at" two photos of soap: on the left, a colorful bar of soap photographed in a Nepalese household with an average income of 288 USD per month, and on the right, a bottle of liquid soap photographed in a UK household with an average income of 1,890 USD per month. The AI systems frequently mistook the Nepalese bar of soap for food, returning labels such as "bread", "pastry", "sandwich" (Azure), "cooking", "delicious", "good for health" (Clarifai), "cuisine", "dish" (Google), "candy", "burger" (Amazon), "food product", "seasonal food", "turmeric" (Watson), and "fast food", "nutrition" (Tencent).

Meanwhile, when identifying the bottle of soap from the British household, the AI systems returned fairly accurate results, with labels such as "sink", "faucet", "liquid", "water", "restroom", "bathroom supplies", "soap", "body lotion" and "pump soap bottle".

This kind of "bias" is a common problem in artificial intelligence systems and has many different causes. One of the most common is that the training data used to build the algorithms tends to reflect the lives and backgrounds of the engineers who created them. Because AI engineers are usually white men from high-income countries, the "world" that the AI learns is also the world of those engineers.

One of the best-known examples of AI bias is facial recognition, where algorithms often perform poorly when identifying female faces, especially women with darker skin. This kind of bias can take root in every type of system, including algorithms designed to help employers screen thousands of resumes and CVs before deciding whom to invite to an in-person interview.

In the case of object recognition algorithms, the study's authors point to several possible causes for these errors: first, the training data used to build the AI systems is often limited to specific geographic areas, and second, the systems fail to account for deep cultural differences.

The authors also said that the training data for image recognition algorithms is usually drawn from European and North American countries, and therefore "lacks visual data samples from many geographical regions with large populations, including Africa, India, China and Southeast Asian countries".

Similarly, most image data sets use English nouns as their starting point and collect the corresponding data based on them. In the context of another country, an English noun may not refer to any object at all, or it may refer to something that looks quite different. The authors give the example of the soap in the picture above: in some countries soap usually comes in the form of a bar, while in others it usually comes in liquid form. A wedding in America and a wedding in India can likewise look very different.

Why does this matter? Because systems built on these algorithms will work less well for people from low-income or non-Western countries. Since US technology companies currently lead the field of artificial intelligence, the problem can affect nearly every service based on this technology, from the image search features of online photo storage services to critical systems such as automated security cameras and self-driving cars.

However, this is just the tip of the iceberg. Image algorithms are relatively easy to retrain to remove such "biases"; but the companies producing this software also lead an industry full of similar algorithms that are not closely monitored to ensure the "fairness" of information and data across countries and territories.

Silicon Valley companies regularly promote their artificial intelligence technologies as "fair" and accessible to everyone. Studies like this one, however, show how technology companies continue to shape the world from their own perspective.