Streaming Avatars Quickstart

Realtime avatars via the UNITH Platform API: quickstart

This is an early soft launch of our new streaming solution, currently available only to API users. For this alpha release, several features have been temporarily removed (including the welcome message, alias, suggestions, and image and link handling) to focus on the core real-time conversational experience. This release feeds into an upcoming full frontend and UX overhaul, scheduled for completion by the end of 2025.

This guide shows you how to create a streaming digital human (avatar) directly from the UNITH API using your own account. It covers authentication, picking a head visual, choosing a TTS voice, creating the head, and deriving the stream URL.

Prerequisites

- API access and your email address (registered on UNITH)
- Your secret key (from Manage Account → Secret Key)
- Base URL: https://platform-api.unith.ai
- Tools: curl or any HTTP client

1) Authenticate → get a bearer token

Bearer tokens authenticate all API requests and expire after 7 days.

Request

```bash
curl -X POST "https://platform-api.unith.ai/auth/token" \
  -H "Accept: application/json" \
  -H "Content-Type: application/json" \
  -d '{
    "email": "you@company.com",
    "secretKey": "sk_live_..."
  }'
```

Response (one of the shapes below)

```json
{ "token": "eyJhbGciOi..." }
```

or

```json
{ "data": { "bearer": "eyJhbGciOi..." } }
```

Use the returned token in `Authorization: Bearer <token>` for all subsequent requests.

2) Discover head visuals (faces)

List the head visuals your org can use. You can filter and page; a practical pattern is to fetch a larger page (e.g., 50) and do client-side pagination in your UI.

For streaming avatars, you can only use head visuals that were created without gestures, i.e., those that contain only a single silent idle.

Request (50 faces, optional gender filter)

```bash
# gender can be male or female; omit it to fetch all
curl -X GET "https://platform-api.unith.ai/head_visual/face/all?order=ASC&page=1&take=50&gender=female" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer <token>"
```

Response
(excerpt)

```json
{
  "data": [
    {
      "id": "ec4326ba-b06a-474d-bd08-ab96b42f06a3",
      "name": "Aiko",
      "gender": "female",
      "avatar": "https://.../thumb.bmp",
      "videoUrl": "https://.../idle.mp4",
      "permissionType": "private",
      "type": "talk"
    }
    // ...
  ],
  "meta": { "page": 1, "take": 50, "itemCount": 1635, "pageCount": 33, "hasNextPage": true }
}
```

Tip: display `avatar` (or fall back to `posterImage` or `videoUrl`) as the thumbnail in your UI.

3) Choose a TTS voice (provider: elevenlabs)

Fetch the unified voice catalog filtered by provider. Use the `voiceId` (not the display name) when creating the avatar.

Request

```bash
curl -X GET "https://platform-api.unith.ai/voice/all?provider=elevenlabs" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer <token>"
```

Response (excerpt)

```json
{
  "data": {
    "voices": [
      {
        "displayName": "roger eleven v2 flash",
        "voiceId": "cwhrbwxzgahq8tq4fs17 eleven v2 flash",
        "locale": "en-US",
        "language": "English",
        "gender": "male",
        "provider": "elevenlabs"
      }
      // ...
    ]
  }
}
```

Currently, ElevenLabs is the only supported voice provider for streaming avatars.

4) Create a streaming digital human (open conversation)

To create a digital human, follow the same instructions as for any other digital human, with two differences: the voice provider must be `elevenlabs`, and `greetings` (the welcome message) must be an empty string `""`.

You must select:

- `headVisualId` from step 2
- `ttsProvider` and `ttsVoice` (the `voiceId`) from step 3
- `operationMode`, which can be any of UNITH's supported operation modes (see the operation modes documentation)
- `greetings`, which must be an empty string

Request

```bash
curl -X POST "https://platform-api.unith.ai/head/create" \
  -H "Accept: application/json" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <token>" \
  -d '{
    "headVisualId": "ec4326ba-b06a-474d-bd08-ab96b42f06a3",
    "name": "jester",
    "languageSpeechRecognition": "en-US",
    "language": "en-US",
    "operationMode": "oc",
    "ttsProvider": "elevenlabs",
    "ttsVoice": "cwhrbwxzgahq8tq4fs17 eleven v2 flash",
    "greetings": "",
    "promptConfig": { "system_prompt": "You are helpful and concise." }
  }'
```

Response
(excerpt)

```json
{
  "id": "head_12345",
  "publicId": "head_12345",
  "publicUrl": "https://chat.unith.ai/<org_id>/<head_id>?api_key=<org_api_key>"
}
```

For ElevenLabs, if you pass a voice name instead of a `voiceId`, you may get an error. Always send `ttsVoice: <voiceId>`.

5) Disable the splitter (required for streaming)

After the head is created, turn off the splitter so the head can stream conversationally.

```bash
curl -X PUT "https://platform-api.unith.ai/head/<head_id>/splitter?splitter=false" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer <token>"
```

Expect a 200 OK or equivalent confirmation.

6) Retrieve details → build your stream URL

GET the head to obtain `publicUrl` and derive the stream URL.

```bash
curl -X GET "https://platform-api.unith.ai/head/<head_id>" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer <token>"
```

Response (excerpt)

```json
{
  "publicId": "<head_id>",
  "publicUrl": "https://chat.unith.ai/<org_id>/<head_id>?api_key=<org_api_key>"
}
```

Streaming URL format:

```
https://stream.unith.ai/<org_id>/<head_id>?api_key=<org_api_key>
```

Open that link in the browser to interact with your streaming avatar. Keep this URL safe: it includes your org API key in the query string.

Example: minimal JS (fetch) flow

```js
const BASE = 'https://platform-api.unith.ai';

// Thin wrapper around fetch: adds JSON headers, the bearer token, and error handling.
async function api(path, { method = 'GET', token, body } = {}) {
  const res = await fetch(`${BASE}${path}`, {
    method,
    headers: {
      'Accept': 'application/json',
      ...(method !== 'GET' ? { 'Content-Type': 'application/json' } : {}),
      ...(token ? { 'Authorization': `Bearer ${token}` } : {}),
    },
    ...(body ? { body: JSON.stringify(body) } : {}),
  });
  if (!res.ok) throw new Error(`HTTP ${res.status}: ${await res.text()}`);
  return res.json();
}

(async () => {
  // 1) Auth: the token may arrive as { token } or { data: { bearer } }
  const a = await api('/auth/token', {
    method: 'POST',
    body: { email: 'you@company.com', secretKey: 'sk_live_...' },
  });
  const token = a?.token || a?.data?.bearer;

  // 2) Faces
  const faces = await api('/head_visual/face/all?order=ASC&page=1&take=50&gender=female', { token });
  const headVisualId = faces.data[0].id;

  // 3) Voices (elevenlabs)
  const voices = await api('/voice/all?provider=elevenlabs', { token });
  const voiceId = (voices?.data?.voices || [])[0]?.voiceId;

  // 4) Create (no alias: it is temporarily removed in this alpha; greetings must be empty)
  const created = await api('/head/create', {
    method: 'POST',
    token,
    body: {
      headVisualId,
      name: 'my assistant',
      languageSpeechRecognition: 'en-US',
      language: 'en-US',
      operationMode: 'oc',
      ttsProvider: 'elevenlabs',
      ttsVoice: voiceId,
      greetings: '',
      promptConfig: { system_prompt: 'You are helpful and concise.' },
    },
  });
  const headId = created.id || created.publicId;

  // 5) Disable splitter
  await api(`/head/${headId}/splitter?splitter=false`, { method: 'PUT', token });

  // 6) Details → stream URL
  const details = await api(`/head/${headId}`, { token });
  const u = new URL(details.publicUrl);
  const [, org, head] = u.pathname.split('/'); // ['', orgId, headId]
  const apiKey = u.searchParams.get('api_key');
  const streamUrl = `https://stream.unith.ai/${org}/${head}?api_key=${apiKey}`;
  console.log('Open:', streamUrl);
})();
```

Troubleshooting

- 403 Forbidden on faces/voices: your token may belong to a different org or lack permissions. If you have multiple orgs, include `"orgId": "<org_id>"` when creating a head. Ensure your org is active and has the correct subscription.
- 404 "voice not available": you passed a voice name instead of a `voiceId`. Fetch `/voice/all?provider=elevenlabs` and send `ttsVoice: "<voiceId>"`.
- Always getting page 1: use `/head_visual/face/all` (not `/head_visual/all`). If server-side paging is inconsistent, fetch a larger `take` (e.g., 50) and paginate client-side.
- No token returned: the `/auth/token` response shape may be `{ "data": { "bearer": "..." } }`; read both `token` and `data.bearer`. Also double-check your email and secret key.

Security best practices

- Never store or hardcode your secret key in public repos.
- The bearer token expires in 7 days; refresh via
`/auth/token`.
- Treat the stream URL as sensitive (it contains your org API key in the query string).

Optional: other operation modes

- Text-to-video (`operationMode: "ttt"`): generates an MP4 from text.
- Doc QA: uses your uploaded knowledge base (requires an additional document upload step).
- Plugin: connects a custom conversational engine via webhook.
- Voiceflow: integrates a Voiceflow conversation (requires `voiceflowApiKey`).

The core creation call is the same endpoint (`/head/create`), with mode-specific fields.

Reference: endpoints

- `POST /auth/token`: obtain a bearer token
- `GET /head_visual/face/all`: list available head visuals (supports `order`, `page`, `take`, `gender=male|female`)
- `GET /voice/all?provider=elevenlabs`: list voices for a provider
- `POST /head/create`: create a digital human
- `PUT /head/{id}/splitter?splitter=false`: disable the splitter (required for streaming)
- `GET /head/{id}`: retrieve `publicUrl` → derive the streaming URL
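
Three details from the steps above are easy to get wrong in client code: the two possible token response shapes, client-side paging over a large `take`, and deriving the stream URL. The sketch below factors them into small pure functions. The function names (`extractToken`, `paginate`, `deriveStreamUrl`) are hypothetical and not part of the UNITH API. The field names (`token`, `data.bearer`, `publicUrl`, `api_key`) are assumed from the excerpts in this guide and should be checked against real responses.

```javascript
// Hypothetical helpers; none of these names come from the UNITH API.

// Read the bearer token from either /auth/token response shape
// described in this guide: { token } or { data: { bearer } }.
function extractToken(authResponse) {
  const token = authResponse?.token ?? authResponse?.data?.bearer;
  if (typeof token !== 'string' || token.length === 0) {
    throw new Error('No token found in /auth/token response');
  }
  return token;
}

// Slice an already-fetched list into 1-based pages (client-side paging).
// Out-of-range page numbers are clamped to the nearest valid page.
function paginate(items, page, pageSize) {
  const pageCount = Math.max(1, Math.ceil(items.length / pageSize));
  const current = Math.min(Math.max(1, page), pageCount);
  const start = (current - 1) * pageSize;
  return {
    items: items.slice(start, start + pageSize),
    page: current,
    pageCount,
    hasNextPage: current < pageCount,
  };
}

// Derive the stream URL from publicUrl: keep the /<org_id>/<head_id> path
// and the api_key query parameter (assumed name), swap the host.
function deriveStreamUrl(publicUrl) {
  const u = new URL(publicUrl);
  const [, org, head] = u.pathname.split('/');
  const apiKey = u.searchParams.get('api_key');
  if (!org || !head || !apiKey) {
    throw new Error('publicUrl is missing the org id, head id, or api_key');
  }
  return `https://stream.unith.ai/${org}/${head}?api_key=${encodeURIComponent(apiKey)}`;
}
```

In the fetch flow, these would replace the inline expressions: `extractToken(a)` after the auth call, `paginate(faces.data, 1, 10)` when rendering a face picker, and `deriveStreamUrl(details.publicUrl)` in the last step. Failing early with a clear error is a design choice here; it avoids passing `undefined` into later requests.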