Expressive Streaming Digital Humans

Overview

The Two Loops Streaming mode enables more expressive and natural digital human presentations by using separate video segments for the idle and talking states. This advanced mode is designed specifically for streaming digital humans, where the audio duration is not known in advance.

Two Loops mode creates more engaging digital humans by allowing dynamic transitions between idle gestures and expressive talking animations.

How Two Loops Streaming Works

Traditional vs. Two Loops Architecture

Traditional streaming mode:
- A single idle loop plays continuously
- The talking state uses the same loop with a lip-sync overlay
- Limited expressiveness during responses

Two Loops Streaming mode:
- Separate idle loop (0 to cut timestamp)
- Separate talking loop (cut timestamp to end)
- Smooth transitions between states
- More natural and expressive responses

Video Requirements

Duration:
- Maximum video length: 120 seconds

Structure:
- Single continuous recording (no manual cutting required)
- First half: idle state with minimal movement
- Second half: expressive talking state
- Natural transition at the cut timestamp

Creating a Two Loops Head Visual

To learn how to create a head visual via the API, see https://docs.unith.ai/creating-head-visuals

Step 1: Prepare Your Video

Your video should follow these specifications.

Idle state (first half):
- Subject in a neutral pose
- Minimal body and head movement
- Subtle, natural gestures only
- Include one blink in the first 4 seconds
- Include another blink between second 4 and the cut timestamp
- Avoid noticeable movement in the first and last frames of this segment

Talking state (second half):
- More expressive facial expressions
- Natural hand gestures and movements
- Animated, engaged body language
- Subject appears to be actively communicating
- Avoid abrupt movements at segment boundaries

The platform automatically handles looping and inversion for both segments to ensure seamless, non-jarring transitions.

Step 2: Determine the Cut Timestamp

The cut timestamp defines where your video transitions from idle
to talking state.

Example:
- Video duration: 20 seconds
- Idle state: 0-10 seconds
- Talking state: 10-20 seconds
- Cut timestamp: 10

Guidelines:
- The cut timestamp should occur at a natural transition point
- Ensure smooth motion at the cut point
- Typically set at the midpoint of your video for balanced loops
- Measured in seconds from the start of the video

Step 3: Create the Head Visual via API

Endpoint: POST https://platform-api.unith.ai/head_visual/create

Request body:

    {
      "mode": "two_loops_streaming",
      "cut_timestamp": 10
    }

cURL example:

    curl -X 'POST' \
      'https://platform-api.unith.ai/head_visual/create' \
      -H 'accept: application/json' \
      -H 'x-head-video-token-id: yourvideotokenid' \
      -H 'Authorization: Bearer yourbearertoken' \
      -H 'Content-Type: application/json' \
      -d '{
        "mode": "two_loops_streaming",
        "cut_timestamp": 10
      }'

Parameters:

Parameter       Type     Required   Description
mode            string   Yes        Must be "two_loops_streaming"
cut_timestamp   number   Yes        Timestamp in seconds where the idle state transitions to the talking state

Video Production Best Practices

Idle State Guidelines

Movement:
- Keep body and head movements minimal
- Subtle weight shifts are acceptable
- Natural breathing motion is encouraged
- No dramatic gestures or expressions

Blinking:
- Include exactly one blink in the first 4 seconds
- You can include one additional blink between second 4 and the cut timestamp
- Natural blink timing prevents a robotic appearance
- Avoid blinking in the first or last 0.5 seconds of the segment

Talking State Guidelines

Expressiveness:
- Engaged facial expressions
- Dynamic body language
- Subject appears to be actively communicating

Movement range:
- More animated than the idle state
- Natural conversational gestures
- Avoid extreme or distracting movements
- Maintain professionalism appropriate to the use case

Transitions:
- Smooth motion at the cut timestamp boundary
- Avoid abrupt changes at segment start and end
- Natural flow between states

Complete Workflow Example

Step 1: Record Video

Record a 10-second video with the subject:
- 0-5 seconds: subject in a neutral waiting pose (idle)
- 5-10 seconds: subject with engaged, helpful
expressions (talking)
- Include natural blinks at 2 seconds and 7 seconds

Step 2: Post-Production
- Follow our best practices for video recording: https://docs.unith.ai/video-guidelines-for-avatar-creation
- Export as a high-quality video file

Step 3: Upload Video
- Upload the video to the UNITH platform (more information about head visual creation: https://docs.unith.ai/creating-head-visuals)
- Receive the video token ID

Step 4: Create Head Visual

    curl -X 'POST' \
      'https://api.unith.live/head_visual/create' \
      -H 'accept: application/json' \
      -H 'x-head-video-token-id: videotoken' \
      -H 'Authorization: Bearer yourbearertoken' \
      -H 'Content-Type: application/json' \
      -d '{
        "mode": "two_loops_streaming",
        "cut_timestamp": 5
      }'

Step 5: Configure Digital Human
- Associate the head visual with a digital human
- Configure it for streaming mode
- Test idle and talking state transitions

Important Notes

Automatic loop handling: The platform automatically manages looping and transitions. You do not need to manually reverse, blend, or stitch video segments.

Cut timestamp precision: Set the cut timestamp at the exact second where your subject transitions from the idle state to the expressive state. Precision is important for smooth state changes.

Video quality: High-quality source video is essential. Ensure proper lighting, clear edges after keying, and consistent framing throughout the recording.

Blink timing: Strategic blink placement enhances realism. Include blinks as specified to avoid a static, robotic appearance.

Streaming mode requirement: Two Loops Streaming mode only works with streaming digital humans. Ensure your digital human is configured with streaming: true.

Testing: Always test your Two Loops head visual with actual conversations to verify smooth transitions and natural appearance.

Performance: Two Loops mode provides better expressiveness without significant performance impact, as loops are preprocessed during video processing.
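
Before calling the endpoint, it can help to check a planned cut timestamp against the rules in this guide (maximum length of 120 seconds, cut point inside the video) and to see which part of the recording becomes each loop. The following Python sketch does only that and builds the request headers and body without sending anything. The function names `plan_loops` and `build_request` are hypothetical helpers, not part of any UNITH SDK. The underscore spelling of `two_loops_streaming` and `cut_timestamp` and the hyphenated `x-head-video-token-id` header are assumptions about how the names in the examples above are spelled, so verify them against the API reference before relying on them.

```python
import json
from dataclasses import dataclass

MAX_VIDEO_SECONDS = 120  # maximum video length stated in this guide


@dataclass(frozen=True)
class LoopPlan:
    idle: tuple      # (start, end) of the idle loop, in seconds
    talking: tuple   # (start, end) of the talking loop, in seconds


def plan_loops(duration, cut_timestamp):
    """Split a recording into idle and talking segments at the cut timestamp."""
    if not 0 < duration <= MAX_VIDEO_SECONDS:
        raise ValueError(f"duration must be in (0, {MAX_VIDEO_SECONDS}] seconds")
    if not 0 < cut_timestamp < duration:
        raise ValueError("cut timestamp must fall strictly inside the video")
    return LoopPlan(idle=(0, cut_timestamp), talking=(cut_timestamp, duration))


def build_request(video_token_id, bearer_token, cut_timestamp):
    """Return (headers, body) for the create call; nothing is sent here."""
    headers = {
        "accept": "application/json",
        # Assumed header spelling, reconstructed from the cURL example.
        "x-head-video-token-id": video_token_id,
        "Authorization": f"Bearer {bearer_token}",
        "Content-Type": "application/json",
    }
    # Assumed field spellings, reconstructed from the request body example.
    body = json.dumps({"mode": "two_loops_streaming", "cut_timestamp": cut_timestamp})
    return headers, body


if __name__ == "__main__":
    # The 20-second example from Step 2: idle 0-10, talking 10-20.
    plan = plan_loops(duration=20, cut_timestamp=10)
    print("idle loop:", plan.idle, "talking loop:", plan.talking)
    headers, body = build_request("yourvideotokenid", "yourbearertoken", 10)
    print(body)
```

The headers and body returned by `build_request` can be passed to whichever HTTP client you already use. Keeping the validation separate from the network call makes it easy to reject an out-of-range cut timestamp before any video processing starts.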