Documentation

Automated Testing for Guava Agents

This guide focuses on automated testing. For manual testing, use agent.call_local() to place a test call using your local audio device, or agent.chat() to interact with your agent via text in the terminal.

Guava Agent tests are defined in code. You can write them using your existing testing framework, such as pytest or Python's built-in unittest module. This lets you test your Agent end-to-end alongside your Expert's code, using patterns you're already familiar with.

Test individual Agent handlers with `MockCall`

The most basic way to test Guava Agents is to test individual handlers like on_action_request or on_question. You can do so by constructing a guava.testing.MockCall object and passing it in directly to the handler.

test_agent.py

import unittest
from guava.testing import MockCall

# Import a handler from the help desk example.
from guava.examples.help_desk import on_action_request

class TestHandlers(unittest.TestCase):
    def test_routes_to_sales(self):
        # We can directly invoke that handler with a mocked Call object.
        actions = on_action_request(MockCall(), "I want to buy a new sofa")
        self.assertEqual("sales", actions[0].key)

Run a roleplay session and analyze the result

agent.test_roleplay(prompt="...") runs your agent against a separate LLM that improvises a conversation based on a prompt. The session runs to completion and returns a TestSession object you can assert against. As in a real call, your registered handlers will be invoked by the Agent.

The returned session object contains the transcript, useful helpers that can be asserted against, and a session.evaluate function that evaluates the conversation against a rubric.

test_roleplay.py

import unittest

# Import the agent from the help desk example.
from guava.examples.help_desk import agent

class TestHelpDeskAgent(unittest.TestCase):
    def test_purchase_roleplay(self):
        session = agent.test_roleplay("You are a caller who wants to buy a new dining table.")

        # Use session.get_transcript() to retrieve the session's transcript.
        print(session.get_transcript())

        # Use session.evaluate() to assess the Agent's performance against a rubric.
        session.evaluate(
            # These criteria must all be true.
            pass_criteria=["The agent identified itself as working for Clearfield Home & Living."],
            # These criteria must all be false.
            fail_criteria=["The agent directly offered to search the inventory."],
        )

        # You can assert some values directly on the session object.
        self.assertIn("sales", session.executed_actions)
        self.assertEqual("bot-transfer", session.termination_reason)

Patch Agent handlers before testing

In some cases, you may not want every handler to run during a test. Use agent.patch() to create a clone of the Agent where you can override callbacks without modifying the original Agent.

test_patched_agent.py

import unittest
import guava

# Import the agent from the help desk example.
from guava.examples.help_desk import agent

class TestHelpDeskAgent(unittest.TestCase):
    def test_sales_closed(self):
        # Create a clone of the agent and patch the on_action handler.
        patched = agent.patch()

        @patched.on_action("sales")
        def patched_sales(call: guava.Call):
            call.hangup(
                "Tell the caller that the sales department is closed and that "
                "they should call back tomorrow between 9am and 5pm."
            )

        # Run the roleplay test with our patched agent.
        session = patched.test_roleplay("You are a caller who wants to buy a new dining table.")
        session.evaluate(["The agent informed the caller of the business hours from 9am to 5pm."])

For any function that isn't an Agent handler, you can patch it using Python's builtin patching system.

Run a session with full control

agent.test() gives you a live test session where you have full control over timing and caller utterances. Use this when you need precise control over how the conversation unfolds.

test_help_desk.py

import unittest

# Import the agent from the help desk example.
from guava.examples.help_desk import agent

class TestHelpDeskAgent(unittest.TestCase):
    def test_purchase_routes_to_sales(self):
        with agent.test() as session:
            # Wait until the agent has finished its opening turn.
            session.wait_for_turn()

            # Inject a caller utterance and let the agent respond.
            session.say("Hi, I'm looking to make a new purchase.")

            # Wait until the session ends (transfer, hangup, etc.).
            session.wait_for_end()

        # The TestSession here is the same type as those returned from roleplay sessions.
        # Assert against any of its values, read the transcript, or use 'session.evaluate(...)'
        self.assertIn("sales", session.executed_actions)
        self.assertEqual("bot-transfer", session.termination_reason)

Questions? hi@goguava.ai

import unittest
from guava.testing import MockCall

# Import a handler from the help desk example.
from guava.examples.help_desk import on_action_request

class TestHandlers(unittest.TestCase):
    def test_routes_to_sales(self):
        # We can directly invoke that handler with a mocked Call object.
        actions = on_action_request(MockCall(), "I want to buy a new sofa")
        self.assertEqual("sales", actions[0].key)

import unittest

# Import the agent from the help desk example.
from guava.examples.help_desk import agent

class TestHelpDeskAgent(unittest.TestCase):
    def test_purchase_roleplay(self):
        session = agent.test_roleplay("You are a caller who wants to buy a new dining table.")

        # Use session.get_transcript() to retrieve the session's transcript.
        print(session.get_transcript())

        # Use session.evaluate() to assess the Agent's performance against a rubric.
        session.evaluate(
            # These criteria must all be true.
            pass_criteria=["The agent identified itself as working for Clearfield Home & Living."],
            # These criteria must all be false.
            fail_criteria=["The agent directly offered to search the inventory."],
        )

        # You can assert some values directly on the session object.
        self.assertIn("sales", session.executed_actions)
        self.assertEqual("bot-transfer", session.termination_reason)

import unittest
import guava

# Import the agent from the help desk example.
from guava.examples.help_desk import agent

class TestHelpDeskAgent(unittest.TestCase):
    def test_sales_closed(self):
        # Create a clone of the agent and patch the on_action handler.
        patched = agent.patch()

        @patched.on_action("sales")
        def patched_sales(call: guava.Call):
            call.hangup(
                "Tell the caller that the sales department is closed and that "
                "they should call back tomorrow between 9am and 5pm."
            )

        # Run the roleplay test with our patched agent.
        session = patched.test_roleplay("You are a caller who wants to buy a new dining table.")
        session.evaluate(["The agent informed the caller of the business hours from 9am to 5pm."])

import unittest

# Import the agent from the help desk example.
from guava.examples.help_desk import agent

class TestHelpDeskAgent(unittest.TestCase):
    def test_purchase_routes_to_sales(self):
        with agent.test() as session:
            # Wait until the agent has finished its opening turn.
            session.wait_for_turn()

            # Inject a caller utterance and let the agent respond.
            session.say("Hi, I'm looking to make a new purchase.")

            # Wait until the session ends (transfer, hangup, etc.).
            session.wait_for_end()

        # The TestSession here is the same type as those returned from roleplay sessions.
        # Assert against any of its values, read the transcript, or use 'session.evaluate(...)'
        self.assertIn("sales", session.executed_actions)
        self.assertEqual("bot-transfer", session.termination_reason)

Automated Testing for Guava Agents

Test individual Agent handlers with MockCall

Run a roleplay session and analyze the result

Patch Agent handlers before testing

Run a session with full control

Test individual Agent handlers with `MockCall`