import { NextLink, Callout } from '../views/docs/prose';

## Architecture Overview

Every Guava call involves two systems running simultaneously: Guava's hosted **Dialog System** and your **Expert** — a small, long-running service that connects to Guava's API and steers the conversation.

<svg viewBox="0 0 760 285" xmlns="http://www.w3.org/2000/svg" className="w-full my-8">
  <defs>
    <marker id="ahd" markerWidth="8" markerHeight="7" refX="7" refY="3.5" orient="auto">
      <polygon points="0 0, 8 3.5, 0 7" fill="rgba(255,255,255,0.25)"/>
    </marker>
  </defs>

  
  <rect x="144" y="47" width="256" height="130" rx="14" fill="#1e1f21" stroke="rgba(255,255,255,0.08)" strokeWidth="1.5"/>
  <text x="272" y="65" textAnchor="middle" fill="#555558" fontSize="9" fontFamily="ui-monospace, monospace" letterSpacing="2">GUAVA CLOUD</text>

  
  <rect x="160" y="72" width="224" height="82" rx="8" fill="#272728" stroke="rgba(26,147,254,0.3)" strokeWidth="1.5"/>
  <text x="272" y="110" textAnchor="middle" fill="#1a93fe" fontSize="14" fontFamily="ui-monospace, monospace" fontWeight="700">Dialog System</text>
  <text x="272" y="132" textAnchor="middle" fill="#555558" fontSize="10" fontFamily="ui-monospace, monospace">audio · STT · LLM · TTS</text>

  
  <circle cx="65" cy="115" r="32" fill="#272728" stroke="rgba(255,255,255,0.1)" strokeWidth="1.5"/>
  
  <path d="M 51,113 C 51,103 58,97 65,97 C 72,97 79,103 79,113" fill="none" stroke="#dadada" strokeWidth="1.8" strokeLinecap="round"/>
  <rect x="47" y="113" width="8" height="12" rx="4" fill="#272728" stroke="#dadada" strokeWidth="1.5"/>
  <rect x="75" y="113" width="8" height="12" rx="4" fill="#272728" stroke="#dadada" strokeWidth="1.5"/>
  <text x="65" y="168" textAnchor="middle" fill="#acacac" fontSize="11" fontFamily="ui-monospace, monospace">Caller</text>

  
  <line x1="104" y1="115" x2="137" y2="115" stroke="#acacac" strokeWidth="1.5"/>
  <polygon points="97,115 104,112 104,118" fill="#acacac"/>
  <polygon points="144,115 137,112 137,118" fill="#acacac"/>
  <text x="120" y="108" textAnchor="middle" fill="#555558" fontSize="9" fontFamily="ui-monospace, monospace">audio</text>

  
  <line x1="407" y1="115" x2="458" y2="115" stroke="#acacac" strokeWidth="1.5"/>
  <polygon points="400,115 407,112 407,118" fill="#acacac"/>
  <polygon points="465,115 458,112 458,118" fill="#acacac"/>
  <text x="432" y="108" textAnchor="middle" fill="#555558" fontSize="9" fontFamily="ui-monospace, monospace">WebSocket</text>

  
  <rect x="465" y="72" width="186" height="82" rx="8" fill="#272728" stroke="rgba(255,255,255,0.12)" strokeWidth="1.5"/>
  <text x="558" y="108" textAnchor="middle" fill="#dadada" fontSize="13" fontFamily="ui-monospace, monospace" fontWeight="700">Your Expert</text>
  <text x="558" y="130" textAnchor="middle" fill="#555558" fontSize="10" fontFamily="ui-monospace, monospace">Python · TypeScript · ...</text>

  
  <line x1="558" y1="154" x2="465" y2="207" stroke="rgba(255,255,255,0.2)" strokeWidth="1.2" strokeDasharray="5 3" markerEnd="url(#ahd)"/>
  <line x1="558" y1="154" x2="650" y2="207" stroke="rgba(255,255,255,0.2)" strokeWidth="1.2" strokeDasharray="5 3" markerEnd="url(#ahd)"/>

  
  <rect x="385" y="210" width="160" height="56" rx="8" fill="#1e1f21" stroke="rgba(255,255,255,0.1)" strokeWidth="1"/>
  <text x="465" y="234" textAnchor="middle" fill="#dadada" fontSize="11" fontFamily="ui-monospace, monospace" fontWeight="600">Your Infrastructure</text>
  <text x="465" y="252" textAnchor="middle" fill="#555558" fontSize="9.5" fontFamily="ui-monospace, monospace">local or self-hosted</text>

  
  <text x="559" y="242" textAnchor="middle" fill="#555558" fontSize="9" fontFamily="ui-monospace, monospace">or</text>

  
  <rect x="573" y="210" width="152" height="56" rx="8" fill="#1e1f21" stroke="rgba(26,147,254,0.35)" strokeWidth="1.5"/>
  <text x="649" y="234" textAnchor="middle" fill="#dadada" fontSize="11" fontFamily="ui-monospace, monospace" fontWeight="600">Guava Hosting</text>
  <text x="649" y="252" textAnchor="middle" fill="#555558" fontSize="9.5" fontFamily="ui-monospace, monospace">managed by Guava</text>
</svg>

### The Dialog System

The Dialog System is Guava's managed service running in the cloud. It handles everything time-sensitive during a call: receiving the caller's audio, running speech-to-text, querying the language model, synthesizing the response, and streaming it back to the caller.

Because the entire pipeline runs as a single integrated system rather than a chain of off-the-shelf APIs, the Dialog System keeps response latency low and speech natural. To the caller, the exchange feels like a real conversation, not a chatbot reading from a script.

### Your Expert

Your Expert is the code you write. Using the [Guava SDK](/docs/agent) (Python or TypeScript), it connects to the Dialog System over a persistent WebSocket and steers the agent in real time — setting its persona, sending mid-call instructions, responding to events like [`on_question`](/docs/on-question) or [`on_action`](/docs/on-action-request-execute), and issuing commands like [`transfer`](/docs/transfer) or [`hangup`](/docs/hangup).
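
The event-and-command loop described above can be pictured with a small, self-contained sketch. It does not use the Guava SDK: `ToyExpert`, `handle_event`, the event dictionaries, and the command dictionaries are all hypothetical names chosen for illustration, and only the event and command names (`on_question`, `on_action`, `transfer`, `hangup`) come from this page. See the linked reference pages for the real signatures.

```python
# Illustrative only: this is NOT the Guava SDK. ToyExpert, handle_event,
# and the shape of the event/command dictionaries are assumptions.
from typing import Callable

Command = dict  # e.g. {"type": "hangup"}


class ToyExpert:
    """Maps incoming event names to handlers that return commands."""

    def __init__(self) -> None:
        self._handlers: dict[str, Callable[[dict], list[Command]]] = {}

    def on(self, event_name: str):
        """Decorator that registers a handler for one event type."""
        def register(fn):
            self._handlers[event_name] = fn
            return fn
        return register

    def handle_event(self, event: dict) -> list[Command]:
        """Run the matching handler; unknown events produce no commands."""
        handler = self._handlers.get(event["type"])
        return handler(event) if handler else []


expert = ToyExpert()


@expert.on("on_question")
def answer(event: dict) -> list[Command]:
    # Escalate to a person when the caller asks for one.
    if "human" in event["text"].lower():
        return [{"type": "transfer", "to": "support-queue"}]
    return [{"type": "instruct", "text": "Answer from the store-hours FAQ."}]


@expert.on("on_action")
def act(event: dict) -> list[Command]:
    # End the call only for the (hypothetical) "end_call" action.
    return [{"type": "hangup"}] if event["action"] == "end_call" else []


commands = expert.handle_event(
    {"type": "on_question", "text": "Can I talk to a human?"}
)
```

In this toy model, each handler is a plain function that receives an event and returns zero or more commands, which keeps the steering logic easy to unit-test without a live call.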

Because your Expert is ordinary code, it can call your CRM, query a database, hit an external API, or chain into another specialized AI sub-agent. For the most common patterns (intent detection, document Q&A, vector search), Guava ships a [helper library](/docs/intent-helpers) so you don't have to build them from scratch.

<Callout>
  Your Expert is not in the latency-critical path. The Dialog System handles all real-time audio processing independently — your Expert can spend time on complex reasoning, external API calls, or chaining multiple AI models without the caller ever noticing a pause.
</Callout>
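
To see why slow Expert work does not stall the call, consider a toy model in which the real-time audio path and the Expert run as independent tasks. This is a simulation built on Python's `asyncio`, not Guava's implementation: `dialog_loop`, `slow_expert_lookup`, and the 10 ms and 50 ms timings are assumptions made for the sketch.

```python
# Toy simulation only: models the idea that audio keeps flowing while
# the Expert is busy. Function names and timings are hypothetical.
import asyncio


async def dialog_loop(frames: list[str], stop: asyncio.Event) -> None:
    """Stand-in for the real-time audio path: emits a frame every 10 ms."""
    n = 0
    while not stop.is_set():
        frames.append(f"frame-{n}")
        n += 1
        await asyncio.sleep(0.01)


async def slow_expert_lookup() -> str:
    """Stand-in for a slow CRM query or model call made by the Expert."""
    await asyncio.sleep(0.05)
    return "lookup-result"


async def simulate() -> tuple[int, str]:
    frames: list[str] = []
    stop = asyncio.Event()
    audio = asyncio.create_task(dialog_loop(frames, stop))

    # The Expert's slow work runs while the audio task keeps ticking.
    result = await slow_expert_lookup()
    frames_during_lookup = len(frames)

    stop.set()
    await audio
    return frames_during_lookup, result


frames_during_lookup, result = asyncio.run(simulate())
```

In this model the audio task keeps emitting frames for the full duration of the lookup, which is the property the callout describes.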

During development, your Expert runs on your local machine, and Guava routes calls to it directly. You can rapidly iterate by changing the code and restarting the process — no public web server or ngrok required.

### Deployment

When it's time to move to production, you'll want your Expert deployed in a highly available configuration, running continuously and ready to handle calls at any time. Because Guava Experts make only outbound connections, it's easy to run an Expert behind a NAT or firewall.

We recommend running multiple instances of the same Expert. Guava round-robins new calls across connected Experts, giving you horizontal scaling and redundancy by default. If an Expert instance dies mid-call, Guava attempts to hand the call off to another active instance. For that handoff to work, keep in-memory state to a minimum and design your Expert to be stateless where possible.
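
The sketch below illustrates both ideas in miniature: round-robin assignment of new calls, and per-call state kept in a shared store so that any instance can continue a call. It is a model under stated assumptions, not Guava's routing code or SDK: `ExternalStore` stands in for something like Redis or a database, and `ExpertInstance` and `handle_turn` are hypothetical names.

```python
# Illustrative model only. ExternalStore, ExpertInstance, and handle_turn
# are hypothetical; real routing is performed by Guava, not by your code.
import itertools
import json


class ExternalStore:
    """Stand-in for a shared store (e.g. Redis or a database)."""

    def __init__(self) -> None:
        self._rows: dict[str, str] = {}

    def load(self, call_id: str) -> dict:
        return json.loads(self._rows.get(call_id, "{}"))

    def save(self, call_id: str, state: dict) -> None:
        self._rows[call_id] = json.dumps(state)


class ExpertInstance:
    """Keeps no per-call state in memory; everything lives in the store."""

    def __init__(self, name: str, store: ExternalStore) -> None:
        self.name = name
        self.store = store

    def handle_turn(self, call_id: str, utterance: str) -> dict:
        state = self.store.load(call_id)
        state["turns"] = state.get("turns", 0) + 1
        state["last_utterance"] = utterance
        self.store.save(call_id, state)
        return state


store = ExternalStore()
instances = [ExpertInstance("expert-a", store), ExpertInstance("expert-b", store)]

# Round-robin: each new call goes to the next instance in the cycle.
next_instance = itertools.cycle(instances)
assignments = {cid: next(next_instance) for cid in ["call-1", "call-2", "call-3"]}

# expert-a handles the first turn of call-1, then a different instance
# picks the call up and still sees the earlier turn.
assignments["call-1"].handle_turn("call-1", "I'd like to book a table.")
state = instances[1].handle_turn("call-1", "For four people, please.")
```

Because the turn counter lives in the shared store rather than on `expert-a`, the second instance resumes the call with the earlier turn intact.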

You have two options for deploying your Expert:

- **Your Infrastructure**: deploy to your own servers, VMs, or a serverless compute platform. You control the environment.
- **Guava Hosting**: push your Expert with a single [`guava deploy`](/docs/cli-reference) command and Guava manages the rest.

See the [Deployment guide](/docs/deployment) for a full walkthrough of both options.

### What to read next

The [Quickstart](/docs/quickstart) walks you through a complete working example in minutes. Once you're comfortable with the basics, the [SDK Reference](/docs/runner) covers every callback and call command in detail. If you want to see a real-world use case before diving into reference docs, the [example walkthroughs](/docs/inbound-rag-example) show full Expert implementations for common scenarios.

<NextLink section="quickstart" label="Quickstart" />
