Generating Videos from Text
## Overview

This document describes how to use the "text to video" feature to generate videos from text using an existing digital human head. The feature lets you create video content by providing text and selecting a voice. The system then generates a video of the digital human speaking the provided text with the chosen voice.

## Prerequisites

- Valid API access: you must have a valid UNITH API key and appropriate permissions.
- Existing head ID: you need the unique identifier (head ID) of the digital human you want to use for video generation. You can obtain this from your UNITH interface dashboard.

The "view modal" in the interface dashboard is available for the "doc_qa" and "oc" operation modes only. Regardless of the operation mode, the URL for accessing a digital human follows this structure:

`chat.unith.ai/orgid/headid?api_key`

Therefore, even if a digital human is created with the "ttt" operation mode, its head ID can be found within this URL.

Please refer to Create a Digital Human before generating a video of a talking digital human.

## Process

The process uses two API endpoints:

- `POST /head/text-to-video` to generate the video from text
- `GET /head/talks/{id}` to retrieve the generated video

## Generate a video

All digital humans, regardless of their operation mode, can be used to generate a video (in MP4 format) from text and voice input. To generate an MP4 video of a digital human speaking, send the following request to the `POST /head/text-to-video` endpoint.

### 1. Generate video from text

- Endpoint: `/head/text-to-video`
- Method: `POST`
- Description: generates a video of the digital human speaking the provided text.

Request headers:

- `Accept: application/json`
- `Authorization: Bearer <yourbearertoken>` (replace `<yourbearertoken>` with your actual bearer token)
- `Content-Type: application/json`

Request body:

```json
{
  "id": "yourheadid",    // (required) the id of the digital human head
  "text": "hello world"  // (required) the text the digital human will speak
}
```

curl example:

```shell
curl -X 'POST' \
  'https://platform-api.unith.ai/head/text-to-video' \
  -H 'Accept: application/json' \
  -H 'Authorization: Bearer <uniquetoken>' \
  -H 'Content-Type: application/json' \
  -d '{ "id": "<exampleheadid>", "text": "hello world" }'
```

Example request body that also sets a voice:

```json
{
  "id": "exampleheadid",
  "text": "hi, my name is john and i'm here to help you learn how to use the unith api",
  "voice": "coco"
}
```

Error handling: the API returns standard HTTP error codes for invalid requests.

- `400 Bad Request`: indicates an issue with the request, such as an invalid head ID or voice name.
- `401 Unauthorized`: indicates an invalid or expired bearer token.

### 2. Retrieve generated video

Once the above request has been made, fetch the videos from the `GET /head/talks/{id}` endpoint.

- Endpoint: `/head/talks/{id}`, where `{id}` is the head ID of the digital human
- Method: `GET`
- Description: retrieves a list of videos generated for a specific digital human head.

Request headers:

- `Accept: application/json`
- `Authorization: Bearer <yourbearertoken>` (replace `<yourbearertoken>` with your actual bearer token)

URL parameters:

- `order` (string, optional): the order in which to return videos. Use "asc" for ascending or "desc" for descending.
- `page` (integer, optional): the page number of the results to return.
- `take` (integer, optional): the number of videos to return per page.

curl example:

```shell
curl -X 'GET' \
  'https://platform-api.unith.ai/head/talks/exampleheadid?order=desc&page=1&take=10' \
  -H 'Accept: application/json' \
  -H 'Authorization: Bearer <yourbearertoken>'
```

Replace `exampleheadid` with your actual head ID and `<yourbearertoken>` with your bearer token. You can also modify the `order`, `page`, and `take` parameters as needed.

Response:

```json
{
  "data": [
    {
      "id": "videoid",
      "createdat": "2025-05-15T07:00:19.827Z",
      "updatedat": "2025-05-15T07:00:23.025Z",
      "voice": "voicename",
      "url": "videourl",
      "text": "hello"
    },
    {
      "id": "videoid2",
      "createdat": "2025-05-15T07:00:53.834Z",
      "updatedat": "2025-05-15T07:00:58.532Z",
      "voice": "voicename",
      "url": "videourl2",
      "text": "world"
    }
  ],
  "meta": {
    "page": "1",
    "take": "10",
    "itemcount": 2,
    "pagecount": 1,
    "haspreviouspage": false,
    "hasnextpage": false
  }
}
```

## Creating a playback digital human

Digital humans deployed with operationmode "ttt", as defined in Configuration Parameters, have the added benefit of giving you a digital human UI with a digital human ready to repeat any text you pass to it. To create a text-to-video digital human, follow the instructions in Create a Digital Human with `"operationmode": "ttt"`, as shown below.

```json
{
  "headvisualid": "yourheadvisualid",
  "name": "avatarepeat",
  "alias": "repeater",
  "languagespeechrecognition": "en-US",
  "langcode": "en-US",
  "ttsprovider": "audiostack",
  "operationmode": "ttt",
  "ocprovider": "playground",
  "ttsvoice": "coco"
}
```

Response:

```json
{
  "id": "<generatedid>",
  "publicurl": "https://chat.unith.ai/<generatedpath>"
}
```