Glossary

LLM (Large Language Model)

Also known as: large language model, language model AI

Definition

A Large Language Model (LLM) is the AI system inside an AI receptionist that understands the caller's words, decides what to say, and generates the response in real time. Examples include Anthropic Claude (RingDispatch's choice), OpenAI GPT, Google Gemini, and Meta Llama.

Why it matters

The LLM is the single biggest determinant of how natural the conversation feels and how well the AI handles ambiguous or unexpected inputs. A cheap or older LLM tends to respond slowly, sound robotic, and misunderstand intent more often. A top-tier LLM (Claude 4.x, GPT-4.x as of 2026) handles calls at the edge of the business's scope gracefully, recognizes urgency keywords, and switches languages mid-conversation. The LLM also drives the per-call cost: better models cost more per token, which is why a per-call cost ceiling matters as a guardrail.
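As a rough illustration of a per-call cost ceiling, the sketch below adds up token costs turn by turn and reports when the ceiling is reached. The class name, the per-token prices, and the token counts are all hypothetical, chosen only to make the arithmetic visible; the $0.20 ceiling is borrowed from the per-call figure quoted in this entry. This is not a description of how RingDispatch implements its guardrail.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per million tokens (assumption for illustration;
# real prices differ by model and vendor).
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


@dataclass
class CallBudget:
    """Tracks LLM spend for one phone call against a fixed ceiling."""

    ceiling_usd: float
    spent_usd: float = 0.0

    def record_turn(self, input_tokens: int, output_tokens: int) -> None:
        """Add the cost of one LLM request to the running total."""
        self.spent_usd += (
            input_tokens * INPUT_PRICE_PER_MTOK
            + output_tokens * OUTPUT_PRICE_PER_MTOK
        ) / 1_000_000

    def exhausted(self) -> bool:
        """True once the call has reached or passed its ceiling."""
        return self.spent_usd >= self.ceiling_usd


# Example: a call where every turn uses 2,000 input and 150 output tokens.
budget = CallBudget(ceiling_usd=0.20)
for _ in range(10):
    budget.record_turn(input_tokens=2_000, output_tokens=150)
print(f"spent ${budget.spent_usd:.4f}, exhausted={budget.exhausted()}")
```

What happens once the ceiling is hit (for example, handing the call to a human or wrapping up politely) is a product decision this sketch leaves open.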

How it works

When a caller speaks, a speech-to-text engine transcribes their audio. The transcript, together with the business's configured knowledge (services, pricing, hours, team), is sent to the LLM as a prompt. The LLM streams back a response, which a text-to-speech engine converts to speech and plays to the caller. This loop repeats 5-15 times during a typical 2-minute call. RingDispatch uses Anthropic Claude with prompt caching to keep latency under 800 ms per turn and cost per call under $0.20.
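The loop described above can be sketched as a single function that handles one turn. Everything here is an assumption made for illustration: the function names, the prompt layout, and the choice to synthesize speech chunk by chunk as the reply streams in. The three engines are passed in as plain callables so that no particular vendor API is implied.

```python
from typing import Callable, Iterable, Iterator


def build_prompt(knowledge: dict[str, str], history: list[str], transcript: str) -> str:
    """Combine the business's configured knowledge, prior turns, and the new transcript."""
    facts = "\n".join(f"{key}: {value}" for key, value in knowledge.items())
    past = "\n".join(history)
    return (
        f"Business knowledge:\n{facts}\n\n"
        f"Conversation so far:\n{past}\n\n"
        f"Caller: {transcript}"
    )


def handle_turn(
    audio: bytes,
    knowledge: dict[str, str],
    history: list[str],
    transcribe: Callable[[bytes], str],             # speech-to-text engine (stand-in)
    stream_reply: Callable[[str], Iterable[str]],   # LLM streaming call (stand-in)
    synthesize: Callable[[str], bytes],             # text-to-speech engine (stand-in)
) -> Iterator[bytes]:
    """Run one caller turn and yield audio chunks as the reply streams in."""
    transcript = transcribe(audio)
    prompt = build_prompt(knowledge, history, transcript)
    reply_parts: list[str] = []
    for text_chunk in stream_reply(prompt):
        reply_parts.append(text_chunk)
        yield synthesize(text_chunk)
    # Record the turn so the next prompt includes it.
    history.append(f"Caller: {transcript}")
    history.append(f"Assistant: {''.join(reply_parts)}")


# Usage with trivial stand-ins for the three engines.
history: list[str] = []
chunks = list(
    handle_turn(
        b"When do you open?",
        {"hours": "9-5"},
        history,
        transcribe=lambda audio: audio.decode(),
        stream_reply=lambda prompt: ["We open ", "at 9."],
        synthesize=lambda text: text.encode(),
    )
)
```

Yielding audio per streamed chunk is one way to keep the caller from waiting for the full reply; a production system would more likely buffer to sentence boundaries before synthesizing.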

Examples

Caller says 'my dog ate chocolate and is shaking' — the LLM recognizes urgency, books the next available vet emergency slot, and pages on-call staff.
Caller switches mid-sentence from English to Spanish — the LLM detects the language shift and continues in Spanish without a transfer.
Caller asks about a service the business doesn't offer — the LLM politely declines and suggests an alternative.
