Prompt Engineering

  • Keep prompts concise — longer prompts can harm performance
  • For large knowledge bases, use retrieval-augmented generation (RAG) to pass only the information relevant to each query
  • Filler words and slight variation in phrasing make agents sound more human-like
  • When using function calling, a lower temperature improves accuracy
  • To constrain agent behavior, combine internal states (similar to an IVR tree) with different prompts and functions per state

For conversational AI, latency is critical. Chaining multiple LLM calls before the agent responds adds delay to every turn and hurts the experience.
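
The state-based approach in the last bullet above can be sketched as a small lookup table. This is a minimal illustration, not UponAI's API: the state names, prompts, function names (`check_availability`, `book_appointment`, `transfer_call`, `end_call`), and event labels are all hypothetical. Each state carries its own prompt and its own allowed functions, and the agent can only move along the listed transitions.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    prompt: str                      # prompt used while the agent is in this state
    functions: list[str]             # functions the LLM may call in this state
    transitions: dict[str, str] = field(default_factory=dict)  # event -> next state


# Hypothetical call flow, structured like an IVR tree.
STATES = {
    "greeting": State(
        prompt="Greet the caller and ask what they need.",
        functions=[],
        transitions={"wants_booking": "booking", "wants_support": "support"},
    ),
    "booking": State(
        prompt="Collect a date and time, then book the appointment.",
        functions=["check_availability", "book_appointment"],
        transitions={"done": "wrap_up"},
    ),
    "support": State(
        prompt="Help the caller troubleshoot, or transfer the call.",
        functions=["transfer_call"],
        transitions={"done": "wrap_up"},
    ),
    "wrap_up": State(
        prompt="Confirm next steps and say goodbye.",
        functions=["end_call"],
    ),
}


def next_state(current: str, event: str) -> str:
    """Follow a transition if the current state allows it; otherwise stay put."""
    return STATES[current].transitions.get(event, current)


def llm_config(state_name: str) -> dict:
    """Return the prompt and function list to send to the LLM for this state."""
    state = STATES[state_name]
    return {"prompt": state.prompt, "functions": list(state.functions)}
```

Because `transfer_call` is only exposed in the `support` state, the agent cannot transfer a caller while it is still greeting them or booking an appointment. Each state's prompt also stays short, which fits the advice to keep prompts concise.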

LLM Selection

Check each provider’s latency and throughput benchmarks. UponAI starts streaming once the first sentence is available, so:
time to first token + time to generate the first sentence (its length ÷ throughput) = what matters most
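
As a rough sketch of how that comparison might be done, the snippet below estimates the delay before the first sentence is ready. The model names and figures are made-up placeholders, not benchmarks, and the 15-token first sentence is an assumption to adjust for your own agent.

```python
def time_to_first_sentence(
    ttft_s: float, tokens_per_s: float, first_sentence_tokens: int = 15
) -> float:
    """Seconds until the first sentence is fully generated."""
    return ttft_s + first_sentence_tokens / tokens_per_s


# Hypothetical figures for illustration only.
candidates = {
    "model_a": {"ttft_s": 0.30, "tokens_per_s": 60.0},
    "model_b": {"ttft_s": 0.50, "tokens_per_s": 150.0},
}

estimates = {
    name: time_to_first_sentence(c["ttft_s"], c["tokens_per_s"])
    for name, c in candidates.items()
}
best = min(estimates, key=estimates.get)

for name, seconds in sorted(estimates.items(), key=lambda item: item[1]):
    print(f"{name}: {seconds:.2f} s to first sentence")
```

With these placeholder numbers, `model_a` comes out ahead (about 0.55 s versus 0.60 s) even though `model_b` has much higher throughput: over a short first sentence, time to first token dominates.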

Response Style

  • Keep responses short
  • Filler words and controlled variation make agents more human-like
  • Aim for responses that fit naturally into a live phone call context