Ultimate Deepfake Guide: How And Why They’re Made, And Their Future

5 min read · Sep 30, 2021


Image source: towardsdatascience.com

In today’s media market, it’s hard to find a technology more controversial than deepfakes. On the one hand, they speed up content production, make it more affordable, and help implement ideas that were previously considered impossible. On the other hand, the market has not yet reached a clear consensus on the technology’s ethical aspects.

As deepfake technology continues to rapidly transform the media landscape, its positive impact is felt most by content producers, and it won’t be going away any time soon. Let’s figure out why it is so in demand.

What are Deepfakes?

The word deepfake combines the terms “deep learning” and “fake”. It is an image or sound synthesis technique based on artificial intelligence.

A video deepfake combines and overlays images and videos onto original video or graphic content. In most cases, deepfakes rely on generative adversarial networks (GANs) to create such imagery. A GAN has two parts. The first, the generator, produces synthetic images of a specific subject. The second, the discriminator, is shown both actual pictures of that subject and the generator’s output, and tries to tell them apart. The two parts literally compete with each other until the discriminator starts confusing the copy with the original.
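To make that competition concrete, here is a deliberately tiny sketch in plain Python. It is not how production deepfake systems are built: instead of images, the "real" data are just numbers clustered around 4.0, the generator has a single learnable parameter (`shift`), and the discriminator is a one-variable logistic classifier. All names and values here are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(t: float) -> float:
    return 1.0 / (1.0 + math.exp(-t))


def train_toy_gan(steps: int = 50, batch: int = 64, lr: float = 0.05, seed: int = 0):
    """Alternate discriminator and generator updates on 1-D toy data."""
    rng = random.Random(seed)
    real_mean = 4.0      # hypothetical "real" data: numbers near 4.0
    shift = 0.0          # generator parameter: G(z) = shift + z
    w, c = 0.0, 0.0      # discriminator: D(x) = sigmoid(w * x + c)

    for _ in range(steps):
        real = [rng.gauss(real_mean, 0.5) for _ in range(batch)]
        fake = [shift + rng.gauss(0.0, 0.5) for _ in range(batch)]

        # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
        grad_w = grad_c = 0.0
        for x in real:
            err = sigmoid(w * x + c) - 1.0   # d(-log D(x)) / d(logit)
            grad_w += err * x
            grad_c += err
        for x in fake:
            err = sigmoid(w * x + c)         # d(-log(1 - D(x))) / d(logit)
            grad_w += err * x
            grad_c += err
        w -= lr * grad_w / batch
        c -= lr * grad_c / batch

        # Generator step: move the fakes so the discriminator scores them
        # as more "real" (gradient of -log D(G(z)) with respect to shift).
        grad_shift = sum((sigmoid(w * x + c) - 1.0) * w for x in fake) / batch
        shift -= lr * grad_shift

    return shift, w, c


shift, w, c = train_toy_gan()
print(f"generator shift after training: {shift:.2f} (real data is centered at 4.0)")
```

In this short run the generator's output drifts from 0 toward the real data, guided only by the discriminator's feedback. Real systems replace both parts with deep neural networks and add stabilizing tricks, since a bare setup like this one may oscillate rather than settle if trained for much longer.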

Ultimately, deepfake systems allow you to endow one person with the appearance and voice of another. The goal is for an outside observer to be unable to distinguish the original from the synthesized copy.

The ethics of deepfake use in tech

Now that you know how the technology works, you can begin to understand the controversy surrounding it.

For instance, a recent publication by Quartz gives you a sense of where the discussion is headed. Anne Quito’s article “The Anthony Bourdain audio deepfake is forcing a debate about AI in journalism” examines the ethics of using the voices of people who have passed away.

Although this particular case does raise some relevant legal issues, there are other, more positive examples of voice recreation. This one is our favorite: “How Respeecher Took Part in Creating a Digital Vince Lombardi for Super Bowl LV.”

When dealing with sensitive technology and touching on such emotional topics, we must always be guided by the highest ethical principles. While a common industry standard has yet to be defined, we are happy to share with you our policies of honest work concerning deepfake technology.

In addition to ethical principles, the community is actively working on creating deepfake detection systems.

In 2020, when responding to new opportunities and the challenges they introduce, Microsoft announced their Microsoft Video Authenticator, software that allows you to detect tampering with video. We at Respeecher are constantly working on the same tech for audio authentication as well. Respeecher’s algorithms help recognize AI-synthesized speech, even if it hasn’t been watermarked by deepfake producers.

The good news is that regardless of the negative aspects of the technology, there are hundreds of cases where it has been used for good. Let’s take a look at some undoubtedly positive examples.

Deepfakes in conventional video production

There are many comical and interesting deepfake videos on the internet. On the Ctrl-Shift Face YouTube channel, the author swaps the faces of films’ main characters with those of other famous Hollywood actors. In his videos, Sylvester Stallone becomes the Terminator, Bruce Lee plays Neo in The Matrix, and Jim Carrey performs the lead role in The Shining instead of Jack Nicholson.

Deepfakes are used in advertising, too. David Beckham starred in a public-awareness video about the dangers of malaria. The same technologies helped him speak 9 different languages: native speakers pronounced the text, and an AI adjusted it to the athlete’s vocal articulation patterns.

Some authors use this technology to draw attention to social problems. Bill Posters posts deepfakes of Mark Zuckerberg, Kim Kardashian, and Morgan Freeman on social media. One of his most recent works is a teaser for a fictional TV project with Jeff Bezos, the founder of Amazon. In this production, he tries to draw attention to the burning forests of the Amazon, including the attention of Bezos himself, who “borrowed the name of the forests for his company to become the richest man on Earth.”

Deepfakes are also widely used by

Deepfake technology is not limited to video production. Services like Respeecher (and you will struggle to find even three of these on the market) use similar machine learning-based technology to clone speech.

AI-based synthesized speech and its industry use cases

Voice cloning technology is based on the same principle as video deepfakes. The system analyzes the target voice until it has learned all of its unique features. Next, we can take the voice of any other person, feed it to the system, and convert it to the target voice. Moreover, the result retains all the intonations and cadences of the original speaker’s delivery.

The process does not end there. The system can easily convert female speech into male speech and vice versa. We can also make a voice sound younger. In general, it can do a lot that a non-specialist would consider an outright miracle.
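One small piece of this can be illustrated with a classic baseline from the voice conversion literature: log-F0 mean-variance normalization. It is not Respeecher's method, and it handles only pitch, not timbre. It shows how a pitch contour can be moved into another speaker's range while the rises, falls, and pauses of the original delivery stay in place. The contours below are made-up numbers in Hz, with 0 marking unvoiced frames, and the function names are hypothetical.

```python
import math
from statistics import mean, pstdev


def log_f0_stats(f0_hz):
    """Mean and standard deviation of log-pitch over voiced frames (f0 > 0)."""
    logs = [math.log(f) for f in f0_hz if f > 0]
    return mean(logs), pstdev(logs)


def convert_pitch(source_f0, source_stats, target_stats):
    """Map a source pitch contour into the target speaker's pitch range.

    Unvoiced frames (f0 == 0) stay unvoiced, so the timing is unchanged.
    Assumes the source contour is not perfectly flat (non-zero deviation).
    """
    mu_s, sd_s = source_stats
    mu_t, sd_t = target_stats
    converted = []
    for f in source_f0:
        if f <= 0:
            converted.append(0.0)
        else:
            z = (math.log(f) - mu_s) / sd_s          # position within source range
            converted.append(math.exp(mu_t + z * sd_t))
    return converted


# Made-up example: a higher-pitched source line and a lower-pitched target voice.
source = [210.0, 220.0, 240.0, 0.0, 230.0, 200.0]
target_recording = [100.0, 110.0, 120.0, 0.0, 105.0, 95.0, 115.0]

converted = convert_pitch(source, log_f0_stats(source), log_f0_stats(target_recording))
print([round(f, 1) for f in converted])
```

The converted contour peaks and dips in the same frames as the source, only within the target speaker's range. Full speech-to-speech systems also have to convert timbre, which takes learned models rather than a formula like this one.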

The main advantage of AI-generated speech is that, unlike video, it is in most cases genuinely indistinguishable from the original. If machine learning algorithms keep improving for the foreseeable future (and they will), how much more authentic will it become? In our case, speech synthesis is already at a level sufficient for widespread use in first-tier projects.

Check out this funny video, where our teammates mess around with the technology to see how the end product sounds.

Navigating the synthetic media landscape

Here’s a nice pic from our friends at Samsung summarizing the key industries that get the most out of AI content generation.

As you can see, several industries, such as game graphics, video, music, and mixed reality, make use of voice synthesis. More and more companies in different fields of content production are looking to decrease their costs and speed up the production process. Deepfake technology is more than capable of doing both.

Within a few years, we will come to a point where ML technology will be so high-quality and affordable that the content market will go through yet another revolutionary advancement. In the meantime, you can become a pioneer in the use of this advanced technology. Join leading Hollywood studios, YouTubers, video game developers, and marketing agencies. Adopt it today.

This article was initially published by Respeecher as a guest post on The Internet Stories.



