HeyGen’s AI video translation tool has generated quite a buzz. But does it live up to the hype? 

VeraContent’s team of linguists took on the challenge of creating AI-translated videos in multiple languages to uncover the tool’s true potential.

Putting this AI video translation tool to the test

Here’s our quick summary video of the results from this test.

HeyGen, an AI startup, has developed a tool for translating videos into multiple languages. While still in beta, the Video Translate tool clones the natural voice and adjusts lip movements to match the translated speech.

As a multilingual marketing agency, we wanted to put this new technology to the test! So, our CEO, Shaheen Samavati, recorded a video in English and used HeyGen Lab’s video translation tool to translate it into all of its available languages: French, Italian, German, Spanish, Korean, Hindi, Japanese, Mandarin, Portuguese, Dutch and Turkish.

We wanted to really test the translation and adaptation capabilities of the tool, so we purposely made sure to:

  • Act out different emotions 
  • Include tone and style variations
  • Add in a few tricky expressions (including “listen up!”, “honest-to-goodness” and “shed some light”)

Shaheen also covered her mouth at one point to test whether that would impact the lip sync.

Using the tool was super easy. If you have the paid version, you simply upload your original video, select the language you want to translate it into, wait a few minutes for processing and download your video. 

There is a free version, but the processing time is incredibly long, and it was unclear how many hours, days or weeks it would have taken to process our video if we hadn’t upgraded to the paid version. As of September 2023, paid plans started at $59/month, which included 30 credits. Each translation of our video (which was a little over one minute) used 1.5 credits.
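To put those figures in perspective, here’s the credit math from our test, sketched in a few lines of Python (the numbers are the September 2023 ones quoted above, so they may have changed since):

```python
# Rough cost-per-translation math for HeyGen's entry paid plan
# (September 2023 pricing: $59/month for 30 credits; our ~1-minute
# video used 1.5 credits per target language).
MONTHLY_PRICE_USD = 59
CREDITS_PER_MONTH = 30
CREDITS_PER_TRANSLATION = 1.5

translations_per_month = CREDITS_PER_MONTH / CREDITS_PER_TRANSLATION
cost_per_translation = MONTHLY_PRICE_USD / translations_per_month

print(translations_per_month)            # 20.0 one-minute translations/month
print(round(cost_per_translation, 2))    # about $2.95 per translated minute
```

In other words, the entry plan covered roughly twenty one-minute translations a month, or just under $3 per translated minute of video.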

While not perfect, the results are pretty impressive. Click through to see the full video in each language below:

Here’s a summary of all of the languages:

English translation

Since Shaheen can also speak Spanish, we decided to check out the tool’s English translation capabilities: she recorded a Spanish version of the same video, and we used the tool to translate it into English.

Here’s what it did:

If you’re looking for video translations that convey the same meaning and emotion as the original, get in touch with our team at VeraContent! We work with native-speaking translators to ensure all translations are accurate and engaging for the intended audience.

Are AI video translation tools useful for global marketers? 

HeyGen Lab’s beta AI video translation tool can make it seem like you’re speaking another language pretty convincingly. However, the translation isn’t perfect, similar to existing machine translation capabilities. It lacks localization, and while native speakers are likely to still get the message, the content comes across as unnatural and without emotion. 

While it can get the basic idea across, it isn’t ideal for making meaningful connections with global audiences, which is essential in marketing.

“I see value in using it for personal branding purposes. Many influencers are starting to create content in multiple languages to reach more people and build a community worldwide, as it’s more personal when you’re speaking the same language as your community.”

Lara Luig, native German-speaking translator

It may also be useful in more formal videos with less colloquial language and tone variations, as those are the areas the AI struggled with most.

Other possible use cases could be building it into a live video chat to bridge language gaps in real-time communication, or even using it in medical consultations or business meetings. In these cases, it would be worth accepting the less-than-perfect results in exchange for speed.

Of course, as this technology improves, it could be used for more purposes. For now, the biggest improvement would be the ability to edit the translation and refine what’s being said, which is already possible with some other AI dubbing tools, such as Rask.ai.

Will AI replace the need for human transcription and dubbing? 

In this podcast episode, VeraContent CEO Shaheen Samavati chats with three professional linguists, Irene Zamora, Lara Luig and Liu Jian (Tom), about the current state of AI video translations and voiceovers.

Our prediction is that, while AI video translation, transcription and dubbing tools are going to take over a large portion of the marketing translation services currently done by humans, they also open up the door for a lot more content to be translated.

The same way machine translation opened the door for a much larger quantity of content to be translated, these new AI dubbing tools are going to make it possible for massive quantities of multimedia content to be adapted for additional audiences. And they aren’t going to do it all by themselves! Humans are going to need to be involved to give the tools direction, and to revise the output to ensure the localized content truly connects. 

“I don’t see AI working autonomously without any human checking the content anytime soon. As a professional translator, you need to figure out how you can work in parallel with AI translation tools and focus more on the transcreation side of the translation process.” 

Irene Zamora, native Spanish-speaking translator

At VeraContent, we’re constantly learning the best way to use these tools to help our clients reach global audiences in a way that truly resonates.

How our linguists carried out our AI Video Translation test

VeraContent linguists analyzing an AI translation tool

We got our native English-, Spanish-, German- and Mandarin-speaking linguists to analyze the results of the relevant video translations.

Each of the linguists gave their feedback on the performance of the tool considering:

  • The accuracy of the translation
  • The tool’s ability to translate expressions
  • The tool’s ability to convey the same feeling as the original video
  • The realism of the video’s lip movements in their language
  • The quality of the tool’s voice-cloning

It performed best on the realism of the video’s lip movements, followed by the quality of the tool’s voice cloning and the accuracy of the translation. When it came to the ability to translate expressions and convey the same feeling as the original video, our linguists rated the tool much lower.

See also: Video translation: Auto-generated vs. human translation services

What our linguists were most impressed with

Overall, the tool did really well with adapting the lip movements and cloning the voice. 

“I’d say the best is the realism of the lip movement—it’s quite spot-on! It even matches the Spanish accent of the AI voice.”

Irene Zamora, native Spanish-speaking translator

However, anything covering the speaker’s lips gets blurred in the translated videos—which is what happened in our test videos. This simply means that speakers need to avoid covering their lips when recording. Granted, that’s not always possible if you’re translating videos that weren’t originally created for this purpose.

Where the tool can improve

1. Translations

The biggest area where the tool can improve is translation, especially when it comes to expressions or colloquial language. However, this is a common limitation of most machine translation tools.

For example, in the Spanish translation, it rendered “This is wild” as “esto es aterrador,” which means “This is scary.” In our human Spanish localization, we translated this line as “¡Qué fuerte!” (literally: “How strong!”), which is not an exact translation but expresses the same idea much more effectively and transmits the same casual tone.

Similarly, the “Listen up!” at the start of the video was translated incorrectly in the German version, where it should have been more colloquial. The emphasis placed on the word “wild” in German—while not technically incorrect—sounded very unnatural.

In the Mandarin version, Liu Jian, our native Mandarin-speaking linguist, found the word-for-word translations to be around 80% correct. However, like the other languages, it lacked contextualization and localization.

2. Omissions 

The Spanish version had a few omissions from the original video. For example, the original line, “I don’t speak French, German, or Italian” was translated to “No hablo Francés”—leaving out the mention of German and Italian. While we can’t be certain, this may be because the sentences would have been much longer in Spanish, and the final video had to be kept to a certain length. Omitting words could be the AI’s attempt at maintaining the same video length.

“In general, when translating from English to Spanish, we get a lot more text, so it can be a challenge. But the tool omitted the right words as the original message was still there. It omitted words not considered essential.”

Irene Zamora, native Spanish-speaking translator

3. Tone variations

Another area where the tool can improve is the use of tone variations. While the voice cloning is pretty impressive, it tends to be quite robotic. Shaheen purposefully adjusted her tone throughout, but this wasn’t carried through in the translations, making the video quite flat and void of emotion.

This was particularly evident in the Mandarin version. There are four tones in Mandarin, and if the tone is not correct, the speech is quite hard to understand. Liu said that the speaking felt relatively slow and unnatural.

The AI also added a “bye” in the Spanish and German versions, which wasn’t in the original video. While the word itself was correct, “Adiós” in Spanish and “Tschüss” in German, the tone was rude and strange, making it another red flag for native speakers.

4. Accents

The accents were also interesting. With the Spanish translation, the video sounded more like LATAM Spanish but did not particularly match a specific variety of LATAM Spanish.

For the English translation, there are two accent options when making the video: “your accent” and “American English”, which interestingly produced different results, considering Shaheen’s natural accent is American.

The translation quality of each was similar. Both were very monotone and didn’t capture Shaheen’s intonation or emphasis on certain words. The accents were different, but neither sounded truly American. The “American” one came across as Australian at points. While the “your accent” version sounded more American, it also had some parts that sounded more like a non-native accent.

See also: Subtitling translation: 5 tips for global video content creators

Other capabilities of HeyGen

So far, we’ve only tested the beta AI video translator tool from HeyGen. But other interesting features from HeyGen include:

  • The ability to use template avatars or generate your own avatars to create talking videos
  • The ability to upload a photo and turn it into a talking photo
  • An AI script generator
  • Various video templates, from explainer videos to social media, advertising, breaking news-style clips, ecommerce and more

When using any of the above features, the tool allows you to select from different accents and tones, including multiple American, British, Canadian, Australian, Indian and South African accents. For example, Molly is an Australian, middle-aged newscaster voice for news and e-learning and Claire is a cheerful, natural-sounding American voice for ads.

You can also use HeyGen’s internal voice clone feature to train AI on your own voice, or integrate with other voice tools, including ElevenLabs and LMNT. The voice cloning feature in the AI video translate tool that we tested is powered by ElevenLabs.
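For developers curious about what sits underneath, here’s a minimal sketch of assembling a request to ElevenLabs’ v1 text-to-speech REST endpoint, the service HeyGen’s voice cloning is built on. The API key, voice ID and model ID are placeholders/assumptions; in practice you’d use the ID of your own cloned voice from your ElevenLabs account:

```python
# Minimal sketch: building a request for ElevenLabs' text-to-speech API.
# "YOUR_API_KEY" and "YOUR_VOICE_ID" are placeholders, and the model ID
# is an assumption; check ElevenLabs' docs for current options.
import json
import urllib.request

def build_tts_request(api_key: str, voice_id: str, text: str) -> urllib.request.Request:
    """Assemble (but don't send) the POST request for the v1 TTS endpoint."""
    url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
    payload = {"text": text, "model_id": "eleven_multilingual_v2"}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )

req = build_tts_request("YOUR_API_KEY", "YOUR_VOICE_ID", "Hola, ¿qué tal?")
# Sending the request with urllib.request.urlopen(req) would return
# the generated speech as audio bytes (MP3 by default).
```

This is only an illustration of the kind of call involved, not HeyGen’s actual integration, which happens behind the scenes when you run a translation.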

An example of the different voices available on HeyGen’s AI tool

See also: Content marketing translation: A fast way to scale

AI tools for video localization: what the future holds

This technology is definitely impressive and is only going to keep improving. We’re looking forward to seeing the many use cases pop up, and the proliferation of translation across all content types. 

However, for now and for the foreseeable future, humans are still needed to really connect with other humans. And that’s why you need VeraContent’s real human linguists to ensure your content resonates with international audiences. Get in touch to see if you qualify for a free content consultation.