Text-to-Speech
Text-to-speech is available in Single Actor projects.

Inside an EasyDub project, you can now replace dialogue in your videos with AI-generated speech, making it easier than ever to customize content directly in the platform.
Character limit: You can add up to 1,000 characters of text.
Language: Currently, only English is supported (more languages are on the way!).
Video input: Static images won’t lip sync. You need to upload a video showing speaker articulation to properly train the model.
Output length: You can only generate a video that matches the length of the video you uploaded. LipDub AI does not create new video frames.
This is just the first iteration of our text-to-speech feature, and we’re actively working to enhance it in the near future, including support for more languages.
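If you prepare scripts outside the platform, a quick local check against the character limit can save a round trip. The following is a minimal sketch, not part of LipDub: the function name check_tts_text is hypothetical, and it assumes every character (including spaces and punctuation) counts toward the 1,000-character limit, which this article does not specify.

```python
MAX_TTS_CHARACTERS = 1_000  # limit stated in this article


def check_tts_text(text: str) -> list[str]:
    """Return a list of problems; an empty list means the text passes.

    Assumption: every character, including whitespace and punctuation,
    counts toward the limit. The platform may count differently.
    """
    problems = []
    if not text.strip():
        problems.append("Text is empty.")
    if len(text) > MAX_TTS_CHARACTERS:
        over = len(text) - MAX_TTS_CHARACTERS
        problems.append(
            f"Text is {over} characters over the "
            f"{MAX_TTS_CHARACTERS}-character limit."
        )
    return problems


if __name__ == "__main__":
    script = "Welcome back! Here is what is new this week."
    for problem in check_tts_text(script) or ["Text fits the limit."]:
        print(problem)
```

A script that comes back with an "over the limit" message would need to be shortened before it can be used.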
HARD REQUIREMENT
- The video that you upload must have audio. LipDub cannot voice clone a video without any audio.
- For best results, please upload a video that is at least 1 minute long. This ensures the platform has enough data to create a similar-sounding voice.
SOFT REQUIREMENT
- For best results, please ensure the original video is voice isolated (i.e., no background noise or music that may interfere with the voice clone).
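The requirements above can be expressed as a simple pre-upload checklist. This is an illustrative sketch only, not a LipDub API: SourceVideo and review_source_video are hypothetical names, and you would fill in the fields yourself (for example, from your editing software). Missing audio is treated as a blocking error; the 1-minute length and voice isolation are treated as warnings because the article frames them as "for best results".

```python
from dataclasses import dataclass

MIN_RECOMMENDED_SECONDS = 60.0  # "at least 1 minute" recommendation


@dataclass
class SourceVideo:
    """Hypothetical description of a video you plan to upload."""
    duration_seconds: float
    has_audio: bool
    voice_isolated: bool  # True if there is no background noise or music


def review_source_video(video: SourceVideo) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for the requirements in this article."""
    errors: list[str] = []
    warnings: list[str] = []
    if not video.has_audio:
        # Hard requirement: a voice cannot be cloned without audio.
        errors.append("Video has no audio; voice cloning is not possible.")
    if video.duration_seconds < MIN_RECOMMENDED_SECONDS:
        warnings.append("Video is under 1 minute; the cloned voice may be less similar.")
    if not video.voice_isolated:
        warnings.append("Background noise or music may interfere with the voice clone.")
    return errors, warnings


if __name__ == "__main__":
    errors, warnings = review_source_video(
        SourceVideo(duration_seconds=45.0, has_audio=True, voice_isolated=False)
    )
    print("Errors:", errors)
    print("Warnings:", warnings)
```

A video with any errors should be fixed before upload; warnings indicate that the result may still work but with reduced voice quality.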
FAQ:
What if I upload a video with 4 actors that are speaking?
- LipDub EasyDub projects are built for single-person videos, so uploading a video like this may impact the quality of the result.