The word “Deepfake”—a combination of “deep learning” and “fake”—refers to any media that uses artificial intelligence to simulate people doing or saying things they never did or appearing in fabricated situations without their consent.
Deepfakes came into public consciousness in late 2017 with the proliferation of AI-manipulated face-swap videos across the Internet. Many of the most common examples involved celebrity heads attached to the bodies of adult film stars. Deepfakes and the conversation around them have since moved from Reddit and 4chan to mainstream forums and social media platforms.
To be sure, there are many artistic and civic uses of synthetic media. AI can help create new forms of public history, provide innovative solutions to urban infrastructural challenges, and offer social critique. In Event of Moon Disaster itself, a multi-part installation and digital project, shows the creative potential of AI while warning against the dangers of disinformation. Deepfakes, however, are often geared towards more nefarious purposes: by far the most common use of deepfakes is pornography, where face-swapping constitutes a form of digital violence used to humiliate and harass. There is also potential for deepfakes to sow confusion or to amplify “fake news.”
Media manipulation is a centuries-old phenomenon, spanning everything from spectacular hoaxes to deceptively edited films and photoshopped images. As scholars Britt Paris and Joan Donovan argue, deepfakes are part of a wide spectrum of audio-visual media intended to dupe or confound viewers. Media that aims to mislead draws on a variety of techniques such as speeding up, slowing down, re-cutting (with new inserts), and re-contextualizing the original clip. “Cheapfakes” or “shallowfakes” refer to more low-tech, swiftly created media involving doctored images and distorted context.
Four factors have contributed to the rise of deepfakes:
- advances in deep learning technologies and computational hardware
- the development of social media platforms that make it easier for people to share and access information
- a deregulated climate of media production and distribution
- the embattled state of contemporary journalism
Credible news organizations are forced to contend with outlets specializing in clickbait content, attacks against a free and independent press, and dwindling staff and resources.
To make the most common form of deepfake, the face-swap video, a creator replaces a “target face” with a “source face.” The idea is to give the impression that the source person was in fact performing the action depicted in the target person’s video. The creator first needs a “trained model” to make the swap, which involves algorithms processing a large dataset of images of the human face. This dataset covers a wide range of facial features (eyes, ears, hair, etc.) as well as an equally wide range of particular qualities (tones, textures, etc.). The result is a trained model that possesses an inclusive understanding not only of the features of a human face but also of the relationships between those features; for example, the positioning of the ears on the sides of the head and the nose in the middle.
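One common face-swap architecture, popularized by early open-source deepfake tools, trains a single shared encoder (which learns general face structure) alongside one decoder per identity (each learning to reconstruct one person's face); the swap then consists of encoding the source face and decoding it with the target's decoder. The sketch below is a deliberately simplified stand-in using plain linear maps rather than the deep convolutional networks real systems use; all names, dimensions, and the toy gradient step are illustrative assumptions, not drawn from any particular tool.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "faces": flattened 8x8 grayscale crops (64-dim vectors).
# A real pipeline would use thousands of aligned face images per person.
faces_a = rng.random((10, 64))  # person A's face dataset
faces_b = rng.random((10, 64))  # person B's face dataset

# Shared encoder plus one decoder per identity, as simple linear maps
# (real systems use deep convolutional networks, but the data flow is the same).
W_enc = rng.standard_normal((64, 16)) * 0.1    # shared encoder: face -> 16-dim latent
W_dec_a = rng.standard_normal((16, 64)) * 0.1  # decoder for person A
W_dec_b = rng.standard_normal((16, 64)) * 0.1  # decoder for person B

def encode(x):
    return x @ W_enc

def decode(z, W_dec):
    return z @ W_dec

# Training minimizes reconstruction error separately per identity,
# always through the SAME shared encoder. One illustrative gradient
# step on A's decoder:
recon_a = decode(encode(faces_a), W_dec_a)
err = recon_a - faces_a
W_dec_a -= 0.01 * encode(faces_a).T @ err / len(faces_a)

# The swap itself: encode A's face, decode with B's decoder,
# producing B's likeness performing A's expression and pose.
swapped = decode(encode(faces_a), W_dec_b)
print(swapped.shape)  # (10, 64)
```

Because the encoder is shared across both identities, it is pushed to capture identity-agnostic structure (pose, expression, lighting), while each decoder supplies one person's appearance, which is what makes the cross-decoding swap possible.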