Semantic Cache for Digital Human Responses

This document explains how to use the UNITH digital human platform's caching mechanisms, including the advanced semantic cache, to deliver faster and more efficient responses.

Overview

The caching layer makes your digital human more responsive by storing and reusing previously generated responses. As your digital human is used more frequently, more queries can be answered instantly from the cache, bypassing the full synthesis pipeline. By default, your digital human retrieves cached responses for exact-match inquiries.

Default Caching: Exact Match

By default, the digital human platform caches exact user inquiries. If two or more users ask exactly the same question (character for character), the system generates the response only once. For all subsequent identical queries, it retrieves the already generated response from the cache. This bypasses the entire speech synthesis and video generation pipeline, so the response is delivered instantly.

Enhancing Responsiveness with Semantic Cache

Exact-match caching is effective for identical queries, but real-world users often phrase the same underlying intent in different ways. The semantic cache extends exact-match caching by returning the same cached response for different user queries that share the same semantic meaning, even if their phrasing is not identical.

How Semantic Cache Works

The semantic cache analyzes the meaning (semantics) of user queries. When a new query comes in, the system compares its meaning to that of previously cached queries. If the semantic distance between the new query and a cached query falls within a configurable threshold, the cached response for that similar query is delivered.

Configuring the Semantic Cache Threshold

You can set a semantic cache threshold to control how similar two queries must be to trigger a cached response. This threshold is a crucial parameter that balances response speed with contextual accuracy.

Lower threshold value (higher precision): Setting a lower threshold (e.g., closer to 0) means the system requires a very high degree of semantic similarity between queries. The cached response is then more likely to be relevant and adequate for the new user query. However, it may lead to fewer cache hits, because queries need to be very close in meaning.

Higher threshold value (more cache hits): Setting a higher threshold (e.g., closer to 1) accepts a broader range of semantic similarity. This will most likely produce more instantaneous responses, because more varied queries trigger cache hits. However, it increases the risk that two matched queries are only loosely similar in meaning, so the cached response may be less relevant or even inadequate for the new query.

In short: lower value = higher precision. You can set values from 0 to 1.

To set the semantic cache threshold, use the /head/update/ endpoint together with your existing head ID:

    curl -X 'PUT' \
      'https://platform-api.unith.ai/head/update' \
      -H 'accept: application/json' \
      -H 'Authorization: Bearer yourbearertoken' \
      -H 'Content-Type: application/json' \
      -d '{
        "id": "yourheadid",
        "semanticthreshold": 0.3
      }'

Here is a breakdown of how different threshold ranges typically correspond to semantic similarity:

Distance 0.0 to 0.01 (nearly identical): Sentences are essentially saying the same thing with minor word variations. Example: "The cat is sleeping" vs. "A cat is asleep".

Distance 0.01 to 0.05 (very similar): Same core meaning with some structural or vocabulary differences. Example: "I love pizza" vs. "Pizza is my favorite food".

Distance 0.05 to 0.1 (moderately similar): Related topics or themes but a different specific focus. Example: "The weather is sunny today" vs. "It's a beautiful day outside".

Distance 0.2 to 0.3 (somewhat similar): Some conceptual overlap but clearly different meanings. Example: "I'm cooking dinner" vs. "The restaurant serves great food".

Distance 0.5 to 0.8 (weakly related): Some semantic connection, perhaps sharing a broad category. Example: "My dog is barking" vs. "I heard music playing".
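
The threshold lookup described under "How Semantic Cache Works" can be illustrated with a minimal Python sketch. This is not the platform's implementation: the `SemanticCache` class, the `embed` helper, and the bag-of-words cosine distance are all hypothetical stand-ins chosen so the example runs without external dependencies. The platform's actual embedding model and distance metric are not described here, so the distances this toy produces will not match the ranges listed above.

```python
import math
from collections import Counter
from typing import List, Optional, Tuple


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model (assumption): word counts."""
    return Counter(text.lower().split())


def cosine_distance(a: Counter, b: Counter) -> float:
    """Return 1 - cosine similarity; values near 0 mean near-identical."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat empty text as unrelated to everything
    return 1.0 - dot / (norm_a * norm_b)


class SemanticCache:
    """Hypothetical cache that returns a stored response when the new
    query is within `threshold` distance of a previously cached query."""

    def __init__(self, threshold: float) -> None:
        if not 0.0 <= threshold <= 1.0:
            raise ValueError("threshold must be between 0 and 1")
        self.threshold = threshold
        self._entries: List[Tuple[Counter, str]] = []

    def store(self, query: str, response: str) -> None:
        self._entries.append((embed(query), response))

    def lookup(self, query: str) -> Optional[str]:
        """Return the closest cached response, or None on a cache miss."""
        vector = embed(query)
        best_distance: Optional[float] = None
        best_response: Optional[str] = None
        for cached_vector, response in self._entries:
            distance = cosine_distance(vector, cached_vector)
            if best_distance is None or distance < best_distance:
                best_distance, best_response = distance, response
        if best_distance is not None and best_distance <= self.threshold:
            return best_response
        return None


if __name__ == "__main__":
    for threshold in (0.05, 0.3):
        cache = SemanticCache(threshold=threshold)
        cache.store("what are your opening hours", "We are open 9 to 5.")
        hit = cache.lookup("what are your opening hours today")
        print(threshold, "->", hit)
```

In this sketch, the same rephrased query misses the cache at the stricter threshold and hits it at the looser one, which mirrors the trade-off between precision and cache hits described above.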
