Prompt Engineering
- Keep prompts concise — longer prompts can harm performance
- For large knowledge bases, use RAG to retrieve only the information relevant to each query
- Filler words and slight variation make agents sound more human-like
- When using function calling, a lower temperature improves accuracy
- To constrain agent behavior, combine internal states (similar to an IVR tree) with different prompts and functions per state
For conversational AI, latency is critical. Chaining multiple LLM calls adds each call's latency to the response time and will hurt the experience.
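The state-based approach from the list above can be sketched roughly as follows. This is a minimal illustration, not UponAI's API: the `AgentState` class, the state names, the prompts, and the function names are all hypothetical. Each state carries its own prompt and its own set of callable functions, and transitions are limited to an explicit allow-list, much like branches in an IVR tree.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    """One conversation state: its own prompt, functions, and allowed transitions."""
    name: str
    prompt: str
    functions: list[str] = field(default_factory=list)
    next_states: list[str] = field(default_factory=list)


# Hypothetical states for an appointment-booking agent (illustrative only).
STATES = {
    "greeting": AgentState(
        name="greeting",
        prompt="Greet the caller and ask how you can help.",
        next_states=["booking"],
    ),
    "booking": AgentState(
        name="booking",
        prompt="Collect a date and time, then book the appointment.",
        functions=["check_availability", "book_appointment"],
        next_states=["wrap_up"],
    ),
    "wrap_up": AgentState(
        name="wrap_up",
        prompt="Confirm the details and end the call politely.",
        functions=["end_call"],
    ),
}


def transition(current: str, target: str) -> AgentState:
    """Move to `target` only if the current state explicitly allows it."""
    if target not in STATES[current].next_states:
        raise ValueError(f"Transition {current!r} -> {target!r} is not allowed")
    return STATES[target]
```

Because each state exposes only a short prompt and a small set of functions, the model sees less irrelevant context at every turn, which also fits the advice to keep prompts concise.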
LLM Selection
Check each provider’s latency and throughput benchmarks. UponAI starts streaming at the first sentence, so what matters most is the time until that first sentence is complete: time to first token + time to generate the first sentence (set by throughput).
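As a rough illustration of that comparison, the sketch below estimates the time until the first sentence is ready from a provider's time to first token and throughput. The function name, the 15-token sentence length, and the benchmark figures are assumptions made up for the example, not measured values.

```python
def time_to_first_sentence(
    ttft_s: float,
    tokens_per_s: float,
    first_sentence_tokens: int = 15,  # assumed length of a short spoken sentence
) -> float:
    """Estimate seconds until the first sentence has been fully generated."""
    return ttft_s + first_sentence_tokens / tokens_per_s


# Hypothetical benchmark figures: (time to first token in s, tokens per second).
providers = {
    "model_a": (0.25, 40.0),
    "model_b": (0.40, 120.0),
}

# Pick the provider with the lowest estimated time to first sentence.
best = min(providers, key=lambda name: time_to_first_sentence(*providers[name]))
```

With these made-up numbers, `model_b` comes out ahead even though its time to first token is slower, because its higher throughput finishes the first sentence sooner. Neither benchmark is enough on its own.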
Response Style
- Keep responses short and concise
- Filler words and controlled variation make agents more human-like
- Aim for responses that fit naturally into a live phone call context