A few years ago, the term “deepfake” didn’t even exist. Then, in May 2019, AI startup Dessa released an AI-generated version of podcaster Joe Rogan.
Dubbed “Faux Rogan,” the deepfake’s words were based on the AI’s analysis of thousands of hours of podcasts made by the real Joe Rogan. And as Dessa’s own quiz demonstrates, distinguishing the real Rogan from the fake one can be difficult.
In November 2019, Dessa released a video version of Faux Rogan that captured not only the timbre and speech mannerisms of the real Rogan, but also his facial expressions, his movements and even the acoustic nuances of his recording space.
“Rogan’s life in the public eye makes it easy to train an AI system to pick up on the subtleties of how he speaks,” Thor Benson at Inverse says. “It’d be significantly harder to do something like this with the average person, but this technology is advancing pretty quickly.”
And footage of a well-recorded public figure could be weaponized in any number of ways.
In a future where anyone can make anyone else do anything on video, methods of spotting deepfakes will be more valuable than ever.
What is a Deepfake?
The word “deepfake” is a portmanteau of “deep learning” and “fake,” describing the recordings’ origins. Deepfakes use AI and machine learning to analyze real audio or video of a person, then generate realistic but fabricated recordings of that same person.
Currently, deepfakes are created by generative adversarial networks (GANs), which pit two machine learning models against one another, says J.M. Porup, a reporter at CSOonline. As one ML model creates forgeries, the other tries to detect them. The process continues until the first ML model gets good enough at making forgeries that the other model can’t tell what’s a forgery.
“The larger the set of training data, the easier it is for the forger to create a believable deepfake,” Porup says. “This is why videos of former presidents and Hollywood celebrities have been frequently used in this early, first generation of deepfakes — there’s a ton of publicly available video footage to train the forger.”
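To make the forger-versus-detector loop concrete, here is a minimal sketch in Python. It is a toy, not how real deepfake software is built: the "forger" has a single parameter (the average of the numbers it produces), the "detector" is a tiny logistic model, and names such as `train_toy_gan`, `forger_mean` and the learning rates are assumptions chosen purely for illustration.

```python
import math
import random


def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def train_toy_gan(real_mean=1.0, start_mean=-2.0, steps=3000, batch=64,
                  lr_detector=0.1, lr_forger=0.02, seed=0):
    """Toy adversarial loop on 1-D numbers (illustrative assumption only)."""
    rng = random.Random(seed)
    forger_mean = start_mean  # the forger's only trainable parameter
    w, c = 0.0, 0.0           # detector: P(real | x) = sigmoid(w * x + c)

    for _ in range(steps):
        real = [rng.gauss(real_mean, 1.0) for _ in range(batch)]
        fake = [forger_mean + rng.gauss(0.0, 1.0) for _ in range(batch)]

        # Detector step: gradient descent on binary cross-entropy,
        # with label 1 for real samples and label 0 for forgeries.
        grad_w = grad_c = 0.0
        for x in real:
            err = sigmoid(w * x + c) - 1.0
            grad_w += err * x
            grad_c += err
        for x in fake:
            err = sigmoid(w * x + c)
            grad_w += err * x
            grad_c += err
        w -= lr_detector * grad_w / batch
        c -= lr_detector * grad_c / batch

        # Forger step: shift its output so the detector scores it as real.
        # Loss is -log D(fake); its gradient w.r.t. forger_mean is (D - 1) * w.
        grad_m = sum((sigmoid(w * x + c) - 1.0) * w for x in fake) / batch
        forger_mean -= lr_forger * grad_m

    # Average detector score on the last batch of forgeries.
    detector_score = sum(sigmoid(w * x + c) for x in fake) / batch
    return forger_mean, detector_score
```

Calling `train_toy_gan()` returns the forger's final average and the detector's average score on the last batch of forgeries. In this setup the forger's average should drift from -2.0 toward the real average of 1.0, and the detector's score should hover near 0.5, which is the point where it can no longer tell real from fake.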
The technology is improving quickly, too. “It’s now become possible to create a passable deepfake with only a small amount of input material — the algorithms need smaller and smaller amounts of video or picture footage to train on,” says Katja Bego, principal researcher at Nesta.
As a result, the number of deepfake videos online nearly doubled between December 2018 and July 2019, tech reporter Davey Winder writes at Forbes. The vast majority of those videos were porn, but the use of deepfakes to influence elections and public opinion isn’t far behind.
Faked Recordings and Media Veracity
Altered videos are already being used to sway political opinion. In May 2019, for instance, video of House Speaker Nancy Pelosi stumbling over and slurring her words appeared on Facebook and YouTube. It turned out, however, that the video had been altered by slowing it down, making it appear as if Speaker Pelosi were struggling through her remarks.
The original video, captured by C-SPAN, showed the speaker speaking normally and with clarity, Sarah Mervosh at The New York Times writes.
It seems inevitable that deepfake technology will soon be used to convince the electorate that public officials said or did things they did not, in fact, say or do.
That threat has already led to political action. In October 2019, California passed a law criminalizing the distribution of audio or video that portrays a false, damaging image of a politician’s words or behavior. The bill covers distribution of this material within 60 days of an election.
“While the word ‘deepfake’ doesn’t appear in the legislation, the bill clearly takes aim at doctored works,” Colin Lecher at The Verge reports.
China has also passed a law banning the publication of deepfakes unless they are clearly labeled as having been created with AI or VR technology. Under the new law, publishing a deepfake without that disclosure is a criminal offense. China’s new law takes effect January 1, 2020.
“China’s stance is a broad one, and it appears the Chinese government is reserving the right to prosecute both users and image and video hosting services for failing to abide by the rules,” Nick Statt at The Verge writes.
Spotting Faked Voice and Video Recordings
Spotting deepfakes is a race against time. The technology used to create these altered audio, video and photographic records is typically a step ahead of our ability to detect them, either through our own senses or with the use of AI and other technologies.
As a result, fighting deepfakes becomes a matter of trust. Do we trust our own senses? Do we trust the tools we rely on to spot altered recordings? How do we verify our instincts when it comes to potentially faked voice or video?
Tools to Spot Deepfakes
Machine learning and generative adversarial networks have made deepfakes possible. They’re also making it possible to build tools that spot them.
For example, researchers Yuezun Li and Siwei Lyu at the University at Albany, SUNY used AI to help identify images that had been altered via tactics such as placing one person’s head on another person’s body. The researchers trained the system to “target the artifacts in affine face warping as the distinctive feature to distinguish real and fake images.” Researchers David Güera and Edward J. Delp at Purdue University have used neural networks to identify potential deepfakes in similar ways.
A group of California researchers led by Jason Bunk also uses deep learning to spot manipulated images. Their system detects resampling features and then builds a heatmap that isolates the digitally manipulated portions of an image from the original. The method can identify not only swapped-in images, but also “scaling, rotation or splicing,” which are commonly used to change the narrative expressed by a video or image.
While deepfake technology can map the face of one person, such as a politician, onto the face of an actor, it cannot currently do so without leaving some trace of the mapping action. Projects like these work on identifying these traces, essentially finding the “signature” left behind by deepfake action.
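The idea of a resampling trace and a heatmap can be illustrated with a small Python sketch. This is not the researchers' deep learning method; it is a much simpler stand-in that assumes one specific manipulation (a region stretched to double width with linear interpolation), and the function names `upsample_rows_2x`, `build_spliced_image` and `resampling_heatmap` are hypothetical. Stretching leaves every second pixel as the exact average of its neighbors, so the sketch scores each patch by how often that pattern appears.

```python
import random


def upsample_rows_2x(rows):
    """Stretch each row to double width using linear interpolation."""
    out = []
    for row in rows:
        stretched = []
        for i, value in enumerate(row):
            nxt = row[i + 1] if i + 1 < len(row) else value
            stretched.append(value)
            stretched.append((value + nxt) / 2.0)  # interpolated pixel
        out.append(stretched)
    return out


def build_spliced_image(size=32, seed=0):
    """Noisy grayscale image with one stretched 8x8 block pasted in."""
    rng = random.Random(seed)
    image = [[rng.randint(0, 255) for _ in range(size)] for _ in range(size)]
    source = [[rng.randint(0, 255) for _ in range(4)] for _ in range(8)]
    block = upsample_rows_2x(source)  # 8 rows x 8 columns
    for r in range(8):
        image[8 + r][16:24] = block[r]  # rows 8-15, columns 16-23
    return image


def resampling_heatmap(image, patch=8, tol=1e-6):
    """Per-patch fraction of pixels whose horizontal second difference is ~0."""
    height, width = len(image), len(image[0])
    heatmap = []
    for top in range(0, height - patch + 1, patch):
        heat_row = []
        for left in range(0, width - patch + 1, patch):
            hits = total = 0
            for r in range(top, top + patch):
                # Skip the image's outer columns, which lack a neighbor.
                for c in range(max(left, 1), min(left + patch, width - 1)):
                    diff = image[r][c - 1] - 2 * image[r][c] + image[r][c + 1]
                    if abs(diff) < tol:
                        hits += 1
                    total += 1
            heat_row.append(hits / total)
        heatmap.append(heat_row)
    return heatmap


def hottest_patch(heatmap):
    """Return (row, column) of the patch with the highest score."""
    cells = [(score, (i, j))
             for i, row in enumerate(heatmap)
             for j, score in enumerate(row)]
    return max(cells)[1]
```

Running `hottest_patch(resampling_heatmap(build_spliced_image()))` should point at the patch in heatmap row 1, column 2, which is where the stretched block was pasted. Real detectors have to cope with compression, arbitrary scaling factors and rotation, which is why the research relies on learned features rather than a single hand-written rule like this one.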
Training Yourself to Recognize Deepfakes
While deepfakes are increasingly convincing, they’re not yet perfect.
“Presently, there are slight visual aspects that are off if you look closer, anything from the ears or eyes not matching to fuzzy borders of the face or too smooth skin to lighting and shadows,” says cybersecurity strategist Peter Singer.
Training yourself to spot deepfakes can mean training other skills, as well, such as the ability to contextualize information and corroborate it. For instance, when faced with a potentially explosive political video, the ability to pause and ask, “Where else can I learn about this?” can be very valuable.
Ultimately, that awareness can be enough to interrupt the otherwise automatic thought process of believing what we see. Raising awareness is why the team at Dessa has worked so hard to make Faux Rogan appear as realistic as the flesh-and-blood Joe Rogan, says Dessa co-founder Ragavan Thurairatnam.
“The unfortunate reality is that at some time in the near future, deepfake audio and video will be weaponized,” Thurairatnam writes. “Before that happens, it’s crucial that machine learning practitioners like us who can spread the word do so proactively, helping as many people as possible become aware of deepfakes, and the potential ways they can be misused to distort the truth or harm the integrity of others.”
The days of “seeing is believing” may be behind us. But the days of “seeing is asking” are beginning — and the sense of curiosity that asking requires may benefit us all.