Revealed: How Respeecher Took Part in Creating a Digital Vince Lombardi for Super Bowl LV

For decades, Super Bowl sponsors have produced some of the most memorable commercials, aired before and during the most important game of the season. This year, the NFL brought an American football legend back to life on screen.

Vincent Lombardi, a distinguished coach and NFL executive, spoke from the stadium’s screen, minutes before the singing of “America the Beautiful”. Lombardi’s faith in the ability of humanity to unite and overcome any obstacles resonated with the nation entering its second year of battling COVID-19.

How was the digital Vince Lombardi created?

The return of the legend was made possible by the collaboration between the NFL, LA creative agency 72andSunny, Digital Domain VFX studio, and Respeecher.

It all started when the NFL and 72andSunny decided to bring a star from the past back to life. This person was supposed to remind us of a time when our society was united in the face of challenge.

With the occasion in mind, it would be hard to imagine a better person to lead this call for unity than Vince Lombardi. With the support of Lombardi’s estate, Digital Domain went to work on creating his digital avatar.

At the center of its creation was Digital Domain’s Charlatan Machine Learning system. For a better understanding of the process for creating a digital avatar, we recommend reading our article about digital humans.

For now, let’s just say that creating Vince Lombardi’s digital identity was not easy. The machine learning system was fed every available video of the coach, who passed away in 1970. Usually, a few hours of video are enough for the system to generate a believable avatar. But not in this case.

Due to the poor quality of the old footage, engineers had to edit the original videos to make them clearer for the algorithms. Everything went into the process, from the HBO documentary to rare footage that had never been publicly released.

As a result, Charlatan managed to build an animated model capable of accurately reproducing the natural movements of the coach’s body and the features of his facial expressions.

Using a stunt double whose body was similar in size and movement to Lombardi’s, the studio was able to recreate the legendary coach in high definition in just a couple of weeks.

The result of the studio’s work was the appearance of Vince Lombardi at the stadium before the final game. His life-size figure appeared alongside the gridiron, giving the coach a lifelike presence.

One of the most memorable parts of the performance was the coach’s inspirational speech. As with the video, Vince Lombardi’s speech was synthesized using the latest technology. This crucial aspect of the digital production was entrusted to the team at Respeecher.

Synthesizing the voice of a legend

After receiving the first batch of audio recordings of the coach’s voice in early January, the Respeecher team immediately realized that this was not going to be an easy task. To create an authentic vocal model, Respeecher relies heavily on the quality of the original audio material.

Usually, only one hour of original high-quality voice recordings is needed to create a flawless audio clone. For three weeks, the Lombardi model learned by analyzing older recordings of his voice and was then able to synthesize audio content that sounded exactly like Lombardi.

Unfortunately, Respeecher was short on high-quality recordings of Vince, so we had to compromise in many places. One of the main pre-processing steps was to hand-pick the cleanest and most emotional phrases from the available audio.

In short, the voice cloning process works as follows:

1. An artificial intelligence-based algorithm analyzes the original (target) voice and a source voice (a new speaker who will drive the system).

2. The system then learns the unique features of the voices, specifically those that make each voice distinctive.

3. Then, a source speaker can record dialogue and our system will reproduce it in the voice of the desired person, all while preserving the natural features of the performance.
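
To make the three steps more concrete, here is a deliberately simplified sketch in Python. It is a toy illustration only and not Respeecher’s actual system: every name in it (Frame, learn_timbre, convert) is hypothetical, and a real voice is described by far richer features than the single "timbre" number assumed here. The one thing it aims to show is the idea that timing comes from the source speaker while the voice character comes from the target.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Frame:
    """One short slice of speech (hypothetical toy representation)."""
    duration_ms: int  # how long the slice lasts, i.e. the cadence
    timbre: float     # one-number stand-in for the speaker's voice character


def learn_timbre(recordings: list[Frame]) -> float:
    """Steps 1-2 (toy): summarise a speaker's voice as an average timbre."""
    return mean(frame.timbre for frame in recordings)


def convert(performance: list[Frame], source_timbre: float,
            target_timbre: float) -> list[Frame]:
    """Step 3 (toy): re-voice a source performance as the target speaker.

    Each frame keeps its duration and its expressive deviation from the
    source speaker's average; only the average itself is swapped.
    """
    return [
        Frame(frame.duration_ms, target_timbre + (frame.timbre - source_timbre))
        for frame in performance
    ]


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    target_archive = [Frame(100, 2.0), Frame(100, 4.0)]
    source_archive = [Frame(100, 0.5), Frame(100, 1.5)]
    new_dialogue = [Frame(120, 0.5), Frame(80, 1.5)]

    result = convert(new_dialogue,
                     learn_timbre(source_archive),
                     learn_timbre(target_archive))
    for frame in result:
        print(frame)
```

In this sketch the frame durations pass through untouched, which is why the output follows the source actor’s pacing no matter whose voice it is rendered in.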

In this case, it means that we were able to create emotional speech without distorting Lombardi’s character.

As you can see, the result was excellent. The final speech includes varying degrees of mixing of the source and the target voices at different times.

The cadence, however, is taken entirely from the source actor (which is what Respeecher usually does). Trying to modify the cadence of the source means giving up some control over the output speech, so we usually try to avoid that.

Cutting costs and delivery time for resurrection and de-aging projects

After seeing the recognition that Lombardi’s performance earned, we can say that the effort was not in vain. Perhaps this is one of the most rewarding aspects of our work: seeing the audience rejoice at the appearance of Vince Lombardi, which inspired millions of people.

There is also another, more material side to the work. With our help, filmmakers, game developers, and virtual bloggers are able to do things that, until recently, were impossible, all while saving time and money on production.

If your studio is thinking about or already working on a digital resurrection or de-aging project, Respeecher is the best fit when it comes to voice cloning. Projects like recreating Vince Lombardi’s voice can be quite challenging due to the lack of available recordings. When we have around 60 minutes of clean speech, voice cloning is smooth and fast.

We invite anyone interested in expanding their creative ideas with modern technology to work with us. Contact us and we will come up with a custom solution just for your project.

This article was initially published on the Respeecher blog.



