Converting Legacy Digital Humans to Streaming Mode

Overview

Digital humans created in legacy mode can be upgraded to streaming mode for improved performance and reduced latency. Streaming mode delivers audio and video in real time as content is generated, providing a more natural and responsive user experience.

Streaming mode requires a TTS provider that supports audio streaming. Currently supported providers are Microsoft Azure and ElevenLabs.

Prerequisites

Before converting your digital human to streaming mode, ensure you have:

- An active UNITH account with API access
- A valid bearer token for authentication
- Your digital human's head ID (can be obtained via the interface or as described at https://docs.unith.ai/update-a-digital-human)
- A voice from a streaming-compatible TTS provider (Microsoft Azure or ElevenLabs)

Legacy vs streaming mode

Legacy mode characteristics:

- Uses text splitting to deliver responses in chunks
- Pre-generates complete audio before video synthesis
- Accessible via https://chat.unith.ai/{org-id}/{head-id}

Streaming mode characteristics:

- Delivers audio and video in real time as content is generated
- Lower latency for the first response
- Requires a streaming-compatible TTS provider
- Accessible via https://stream.unith.ai/{org-id}/{head-id}
- Text splitting must be disabled

Conversion process

Converting from legacy to streaming mode involves three configuration changes (a streaming-compatible voice, videostreaming enabled, and text splitting disabled, which happens automatically when videostreaming is enabled) and a URL update. Follow these steps in order.

Step 1: Configure a streaming-compatible voice

Select a voice from a TTS provider that supports streaming. Currently supported providers are Microsoft Azure and ElevenLabs.

Endpoint:

    PUT https://platform-api.unith.ai/head/update

Request body:

    {
      "id": "yourheadid",
      "ttsprovider": "elevenlabs",
      "ttsvoice": "voiceid"
    }

curl example:

    curl -X 'PUT' \
      'https://platform-api.unith.ai/head/update' \
      -H 'accept: application/json' \
      -H 'Authorization: Bearer yourbearertoken' \
      -H 'Content-Type: application/json' \
      -d '{ "id": "yourheadid", "ttsprovider": "elevenlabs", "ttsvoice": "rachel" }'

For optimal streaming performance, refer to the voice selection guide at https://docs.unith.ai/voice-selection-guide-tts for recommended voices. ElevenLabs voices using the Flash v2, Flash v2.5, Turbo v2, or Turbo v2.5 models are recommended for streaming.

Step 2: Enable videostreaming

Enabling videostreaming triggers a series of checks and processes, including adjustments to text splitting behaviour. Legacy mode uses text splitting to deliver messages in chunks, whereas streaming mode handles content delivery differently.

Endpoint:

    PUT https://platform-api.unith.ai/head/yourheadid/video-streaming?videostreaming=true

Query parameter: videostreaming=true

curl example:

    curl -X 'PUT' \
      'https://platform-api.unith.ai/head/yourheadid/video-streaming?videostreaming=true' \
      -H 'accept: */*' \
      -H 'Authorization: Bearer yourbearertoken'

Converting to a streaming digital human is only possible if your head visual is in default mode. If you used expressive mode for your digital human, your head visual uses the two loops mode and cannot be converted to streaming. In this case, you'll need to create a new digital human.

Note: you can only convert a streaming digital human back to a legacy digital human if the head visual is using default mode.

To retrieve the mode of a head visual, follow these steps:

    # 1. Get the head visual ID from the head resource
    curl -X 'GET' \
      'https://platform-api.unith.ai/head/yourheadid' \
      -H 'accept: application/json' \
      -H 'Authorization: Bearer yourbearertoken'

    # 2. Get the head visual mode using the head visual ID
    curl -X 'GET' \
      'https://platform-api.unith.ai/head-visual/yourheadvisualid' \
      -H 'accept: application/json' \
      -H 'Authorization: Bearer yourbearertoken'

Text splitting (splitter=true) is incompatible with streaming mode. Setting videostreaming to true automatically disables the text splitter. Re-enabling the text splitter forces a streaming digital human back to legacy mode.

Step 3: Update the access URL

The API returns the new streaming URL. The legacy chat.unith.ai URL will no longer be accessible.

URL format change:

    Legacy:    https://chat.unith.ai/{org-id}/{head-id}
    Streaming: https://stream.unith.ai/{org-id}/{head-id}

After completing the configuration steps, simply replace chat with stream in your digital human's URL to access streaming mode.

Complete conversion example

This example demonstrates the full conversion process from legacy to streaming mode.

    # Step 1: Configure a streaming-compatible voice
    curl -X 'PUT' \
      'https://platform-api.unith.ai/head/update' \
      -H 'accept: application/json' \
      -H 'Authorization: Bearer yourbearertoken' \
      -H 'Content-Type: application/json' \
      -d '{ "id": "yourheadid", "ttsprovider": "ttsprovider", "ttsvoice": "ttsvoice" }'

    # Step 2: Enable the videostreaming parameter
    curl -X 'PUT' \
      'https://platform-api.unith.ai/head/yourheadid/video-streaming?videostreaming=true' \
      -H 'accept: application/json' \
      -H 'Authorization: Bearer yourbearertoken'

    # Step 3: Access your digital human at the streaming URL
    # https://stream.unith.ai/yourorgid/yourheadid

Embed integration considerations

When converting to streaming mode, update your embed configurations accordingly. To learn more, see the embedding guide at https://docs.unith.ai/embedding-streaming-digital-humans-default

Important notes

- TTS provider compatibility: only Microsoft Azure and ElevenLabs support streaming mode. Other providers will return an error if you attempt to enable streaming.
- Text splitter incompatibility: the text splitter (splitter=true) cannot be used in streaming mode. These features are mutually exclusive.
- Voice selection: not all voices from streaming-compatible providers support streaming. Refer to the voice selection guide at https://docs.unith.ai/voice-selection-guide-tts for recommended streaming voices.
- Configuration order: follow the conversion steps in the exact order provided to avoid configuration errors. The platform validates state transitions to prevent invalid configurations.
- URL access: your digital human must be accessed via the correct URL for its mode. Legacy mode uses chat.unith.ai, while streaming mode uses stream.unith.ai.
- API validation: the platform enforces configuration rules at the API level. Invalid state transitions return descriptive error messages to guide proper configuration.
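
If you keep legacy links in scripts or configuration files, the host swap described in Step 3 can be sketched as a small shell helper. This is a minimal illustration, not part of the UNITH tooling: the function name legacy_to_streaming_url is hypothetical, and it assumes your stored links have exactly the legacy form shown in this guide (the chat.unith.ai host followed by the organisation and head IDs). It only rewrites the string; it does not call the API or check that the head has actually been converted.

```shell
# Hypothetical helper (not part of any UNITH tool): rewrite a legacy
# chat URL to its streaming equivalent by swapping the host.
legacy_to_streaming_url() {
  case "$1" in
    https://chat.unith.ai/*)
      # Strip the legacy prefix and prepend the streaming host.
      printf '%s\n' "https://stream.unith.ai/${1#https://chat.unith.ai/}"
      ;;
    *)
      # Anything else is rejected instead of being guessed at.
      printf 'not a legacy URL: %s\n' "$1" >&2
      return 1
      ;;
  esac
}

# Usage, with the placeholder IDs from the guide:
legacy_to_streaming_url 'https://chat.unith.ai/yourorgid/yourheadid'
```

Rejecting unrecognised input, instead of passing it through unchanged, makes it easier to spot links that were already converted or that point somewhere else.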