Sean Spicer and the Cruelty of an Image
Thanks to facial recognition, few of us will get a chance at redemption
After Sean Spicer made his Dancing with the Stars debut this week, wearing a fluorescent green pirate shirt and white pants as he salsa danced to the Spice Girls’ “Spice Up Your Life,” I ran an image of his wooden performance through a facial recognition program, just to see what would happen.
The former press secretary quit his job in 2017, and has since capitalized on his tenure within the Trump administration by landing a lucrative book deal, paid speaking gigs, a prime spot at a Hollywood award show, and now, a spot on a nationally televised dancing competition. In his introductory video for ABC, Spicer admitted his time in the White House was “tumultuous.” “I think it gave people a very one-dimensional look at who I am as a person,” he said. This was his chance at redemption. Indeed, one judge deemed his offbeat performance “strangely entertaining.”
The facial recognition program I used found different words to describe Spicer. I uploaded a screenshot into ImageNet Roulette, a site designed to label people in photos. Evaluating his face, the program called Spicer a “flop,” a “dud,” and a “washout.”
It would have been an entirely trite exercise but for the fact that the program, in its definitive assessment, offers up a weird kind of instant power: that of data-driven judgment. And thanks to databases like ImageNet, used to train recognition technology, we are all either already, or soon to be, subject to that arbitrary power. Our reputations will come to depend on it. And creating — not to mention restoring — our image à la Spicer will become increasingly difficult to control.
I chose Spicer as a test subject for the program because of the added irony that his debut as press secretary will remain in the collective memory as both bizarrely hilarious and mildly terrifying, all over how to accurately describe the contents of an image. Following Trump’s inauguration, Spicer denied reality, insisting that photos of the crowd on the National Mall didn’t depict what was obvious to the naked eye: that fewer people attended the event than in previous years. According to Spicer, the photos were deliberately misrepresented to suggest the crowd was smaller than it actually was. Spicer declared that they had been “framed in a way to minimize the enormous support.”
Like a lot of people, I found out about ImageNet Roulette from Twitter. Hours before Spicer took to the stage Monday night, Kate Crawford, co-founder of the AI Now Institute, tweeted a link to ImageNet Roulette. The site, designed for Crawford and artist Trevor Paglen’s recent exhibit on artificial intelligence, uses the public dataset ImageNet to categorize the people in photos. Buzz for ImageNet Roulette quickly spread online, as users posted the keywords the program dredged up to label them.
“ImageNet is one of the most significant training sets in the history of A.I. A major achievement,” Crawford tweeted. “The labels come from WordNet, the images are scraped from search engines. The ‘Person’ category was rarely used or talked about. But it’s strange, fascinating, and often offensive,” she explained. She was right. As many people discovered after uploading their photos to the site, its classifications are frequently random, and uncovering the machine’s inner thoughts has quickly become a kind of meme-game. But Crawford’s warning was also worthwhile. For instance, Guardian tech reporter Julia Carrie Wong discovered when she uploaded her photo that the program classified her as “gook” and “slant-eye,” two seriously racist terms.
But its offensive classifications were partly the point of showcasing ImageNet publicly. The site “reveals deep problems with classifying humans — be it race, gender, emotions, or characteristics,” Crawford wrote. “It’s politics all the way down, and there’s no simple way to ‘debias’ it.”
Image classification, particularly as part of facial recognition programs, is quickly becoming a new policing tool. Already, the federal government is using facial recognition at the southern U.S. border with Mexico, in an attempt to deter migrants from crossing. In March, BuzzFeed reported that the Trump administration is pushing to have facial recognition replace passport checks at dozens of U.S. airports by 2021. Other instances of facial recognition are constantly popping up: as a replacement for money in stores, or to monitor students in schools. The debates about its usage — and the companies that build it — arise just as frequently. In other words, it’s politics all the way back up, too.
Which, strangely enough, brings us back to Sean Spicer, the man who told us we should see something we did not see, and feared his own one-dimensional portrayal. So let me be clear: I’m being deliberately unfair in discussing one specific picture of Spicer. Another photo I found labeled him as an “athlete.” I just chose not to use that one.
Which is the point. ImageNet Roulette shows how easy it is for a computer program to make unfair, even racist, judgments, the kind that, in the near future, may limit our free movement, how we consume, or who we trust. The computer, like Spicer did at the White House, is prone to telling us to see something that’s not necessarily there, to accept an inaccurate interpretation of reality: that the complexity of human life, of personality and a person, can be reduced to a few simple keywords. When we first encounter it, it seems strangely entertaining, but gradually, it will just become normal. We may even come to believe the reality the computer portrays, the image of the world, and of the people in it, that it argues exists.
This process has already begun. It’s happening in public and private spaces as I write this. But, unlike Spicer, who has been granted an international platform to redeem what he thinks was a “one-dimensional” portrayal of his personality, few of those who will be simplistically judged in the future — not by a friendly panel on TV, but by a faceless computer program, and thus, by extension, by their government, their workplace or school, or maybe even by the mass media — will have the chance to argue they are not what the machine says they are.