Inclusive AI – Computer Vision: Representative Data and Annotators are Key to CV Success

In the latest entry of our Inclusive AI series, we’re diving into the exciting field of Computer Vision (CV) and covering how data annotators play a key role in creating inclusive CV solutions. CV is a fast-growing area of AI that aims to help computers “see” the way we humans do. This technology has many fascinating use cases. One you’re probably well aware of is self-driving cars, which use CV to detect pedestrians, roadways, and street signs. But it also comes into play in medical imaging, crop yield estimation on farms, and many other industries.

Any CV model needs to be trained on huge volumes of images for its specific use case before it can come close to the accuracy of the human eye. That’s where data annotators come in: annotators label images with the relevant characteristics they contain, and the computer then learns from these examples how to identify those characteristics in fresh image data. But annotators play an even more essential role: ensuring that CV solutions are unbiased, representative, and inclusive, especially for use cases that involve people and their everyday lives.
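
To make that idea concrete, here is a minimal sketch in Python (using PyTorch) of how human-labeled images become supervised training examples for a CV model. The dataset wrapper, file paths, and the tiny network below are hypothetical placeholders for illustration only, not any particular production pipeline:

```python
# A minimal sketch: hypothetical (image_path, label) pairs produced by human
# annotators are wrapped as a dataset and used to train a small image classifier.
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
from PIL import Image

class AnnotatedImages(Dataset):
    """Wraps annotator output: a list of (image_path, label_index) pairs."""
    def __init__(self, samples):
        self.samples = samples
        self.transform = transforms.Compose([
            transforms.Resize((128, 128)),
            transforms.ToTensor(),
        ])

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        path, label = self.samples[idx]
        image = Image.open(path).convert("RGB")
        return self.transform(image), label

def train(samples, num_classes, epochs=3):
    # Deliberately tiny model, purely illustrative of the learning loop.
    model = nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(4),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(4),
        nn.Flatten(),
        nn.Linear(32 * 8 * 8, num_classes),
    )
    loader = DataLoader(AnnotatedImages(samples), batch_size=32, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            optimizer.step()
    return model
```

In practice the models are far larger and the labeled datasets vastly bigger, but the principle is the same: the model can only learn the characteristics its annotators have labeled, from the images it has been shown.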


Diversity and Inclusion in Computer Vision

Computer Vision is an area where inclusion and ethical AI have come up repeatedly. A lack of diversity among the people who build AI has caused diversity gaps in CV technologies, some of which have made national news. For example, racial bias was recently found in leading facial recognition AI: the systems could easily identify lighter-skinned men, but had a much more difficult time identifying darker-skinned women. Gaps like this shouldn’t exist at all.

Part of the problem is a lack of diversity in the data being collected. In the example above, the model builders didn’t train on a sufficiently diverse image dataset. As a result, the CV model wasn’t familiar with the full range of skin tones and underperformed in production. The consequences are obvious: harm to under-represented users, who want a product that works as well for them as it does for everyone else, and harm to the brand’s reputation.
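
One common way teams surface gaps like this is disaggregated evaluation: computing accuracy separately for each demographic group rather than relying on a single overall number. Here is a minimal sketch; the group tags and sample records are hypothetical and only illustrate the idea:

```python
# A minimal sketch of disaggregated evaluation: compute accuracy per
# demographic subgroup so performance gaps become visible.
from collections import defaultdict

def accuracy_by_group(records):
    """records: iterable of (predicted_label, true_label, group) tuples,
    where `group` is a hypothetical demographic tag carried with each example."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for predicted, actual, group in records:
        total[group] += 1
        if predicted == actual:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}

# Illustrative example: a gap like this would flag under-representation
# in the training data long before the model reaches production.
results = [
    ("face", "face", "group_a"),
    ("face", "face", "group_a"),
    ("no_face", "face", "group_b"),
    ("face", "face", "group_b"),
]
print(accuracy_by_group(results))
# {'group_a': 1.0, 'group_b': 0.5}
```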

Who’s annotating the data is just as important as the data itself. With a diverse pool of annotators, you capture a wide range of perspectives shaped by different demographics, experiences, and geographic locations, input that would be impossible to gather from people with similar backgrounds and locations. Depending on the type of model, a lack of diverse annotators could create significant gaps in performance for end users.

Computer Vision Real-life Examples

Appen’s clients pursuing CV projects rely heavily on our global crowd of annotators, especially when their solutions support customers in several markets. Here are a couple of examples of real Appen clients who needed our data annotation help:

GumGum Improves Webpage Content Analysis

As their main line of business, GumGum reviews webpage content and classifies it according to what it contains. This helps advertisers identify which webpages are best suited for their ads, both from a relevancy and a brand-safety perspective. To scale their operations and review more webpages, GumGum approached Appen for help. Our annotators were called on to review webpage content for a variety of attributes: harmful content, people’s faces (including celebrities), animals, and anything else that might be relevant to an ad.

With Appen’s data annotation platform, GumGum achieved quick turnaround on annotation jobs and was better able to serve their customers. Access to our global crowd was a real benefit: the definition of “harmful” content, for example, often varies across demographic groups, and a diverse crowd helps capture these nuances. Identifying geographic- or culture-specific webpage content, such as celebrity faces, was also easier with annotators sourced from a variety of locations.

For more details, read the full case study.

SHOTZR Adds Location Labels to Images

SHOTZR provides imagery for digital marketing and has a collection of over 100 million assets. Marketers search the database by tag to access the images they need, so the images must be accurately labeled. At some point, SHOTZR realized that many of their images needed location labels. For example, marketers searching for “New York City” want to see images of that specific city, but if those images aren’t tagged with that location, they may not appear in the results.

SHOTZR reached out to Appen for help, and through our data annotation platform, they were able to identify which images needed additional location labels. For this type of task, it’s important to have annotators from the geographic locations represented in SHOTZR’s image library. Otherwise, images could be misidentified or categorized as not requiring a location label, and marketers would have a harder time finding precisely what they were looking for. Fortunately, our global crowd helped SHOTZR avoid this outcome and identify thousands of assets that needed location labeling.

For more details, read the full case study.

Looking Ahead

Having an inclusive team involved in building and annotating CV projects makes a huge difference in the success of the AI. In a field that relies on images, it’s crucial that we pay close attention to how we’re using those images and labeling them, especially if they contain people. After all, we all want to use AI that works just as well for us as it works for everyone else. As the CV field addresses these kinds of diversity and representation issues, it will be one to watch in the movement toward more responsible, ethical AI.

If you’re interested in learning more about how Computer Vision works, read this article next.
