
# How to use LlamaIndex FunctionAgent to send real emails

Build a LlamaIndex FunctionAgent that sends actual emails, not dummy print statements. Covers tool definition, error handling, and production patterns.

9 min read
Samuel Chenard, Co-founder

Every LlamaIndex email tutorial ends the same way: a dummy function that returns "Successfully sent mail to {to}" and never touches a mail server. That's fine for learning the agent loop. It's useless the moment you need a real email to land in a real inbox.

LlamaIndex FunctionAgent lets you register Python functions as callable tools, wire them to an LLM that supports function calling (GPT-4o, Claude 3, Mistral Large), and let the agent decide when to invoke them. The agent reasons about which tool to use, fills in the arguments, and handles the return value. When one of those tools is send_email, you have an autonomous agent that can compose and deliver messages without human intervention.

The gap between the tutorial version and a production version is bigger than it looks. This post walks through building a FunctionAgent email tool that actually works: real delivery, error handling, retry logic, and observability.

## How to use LlamaIndex FunctionAgent to send email

LlamaIndex FunctionAgent lets you define a Python function as a tool, register it with an agent backed by any function-calling LLM, and let the agent invoke it autonomously. To send email, you define an async send_email tool that calls a transactional email API, then pass it to FunctionAgent alongside your LLM. The agent decides when to call it based on the conversation.

```python
from llama_index.core.agent.workflow import FunctionAgent
from llama_index.llms.openai import OpenAI
import httpx

async def send_email(to: str, subject: str, body: str) -> str:
    """Send an email to the given address with subject and body."""
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            "https://api.lobstermail.ai/v1/inboxes/my-inbox/send",
            headers={"Authorization": "Bearer lm_sk_live_xxx"},
            json={"to": to, "subject": subject, "body": body},
        )
        resp.raise_for_status()
    return f"Email sent to {to}"

agent = FunctionAgent(
    tools=[send_email],
    llm=OpenAI(model="gpt-4o-mini"),
    system_prompt="You are a helpful assistant that can send emails.",
)

response = await agent.run("Send a welcome email to hello@example.com")
```

That's the skeleton. Now let's talk about what breaks when you run this for real.

## The dummy function problem

Open any LlamaIndex agent tutorial and you'll find something like this:

```python
async def send_message(to: str, content: str) -> str:
    """Dummy function to simulate sending an email."""
    return f"Successfully sent mail to {to}"
```

This teaches you the FunctionAgent pattern but hides every hard problem. There's no network call, no authentication, no error path, and no way to know if your email actually arrived. The Ragas evaluation docs even build an entire test suite around functions like this, measuring whether the agent called the right tool with the right arguments. That's valuable for agent logic. It tells you nothing about email delivery.

The moment you swap in a real email API, three new failure modes appear: network errors, authentication failures, and rate limits. Your tool function needs to handle all of them.
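Before adding retries, it helps to make those failure modes explicit. A minimal sketch (this helper and its category names are illustrative, not part of LlamaIndex or any email SDK) buckets API status codes so the tool can decide whether a retry is worth attempting:

```python
def classify_send_failure(status_code: int) -> str:
    """Bucket an HTTP status from the email API into a failure mode.
    Transient buckets are worth retrying; the others are not."""
    if status_code in (401, 403):
        return "auth"          # bad or expired API token: retrying won't help
    if status_code == 429:
        return "rate_limit"    # back off, then retry
    if 500 <= status_code < 600:
        return "server_error"  # transient: retry with backoff
    return "client_error"      # e.g. 400 invalid recipient: fix the request
```

Network errors (connection resets, timeouts) surface as exceptions before any status code exists, so they belong in the tool's `except` clauses rather than here.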

## FunctionAgent vs ReActAgent for email tasks

LlamaIndex offers two main agent types. FunctionAgent uses the LLM's native function calling API (OpenAI's tools parameter, Anthropic's tool use, etc.) to select and invoke tools in a single pass. ReActAgent uses a reasoning loop where the LLM generates thought-action-observation steps in plain text, then parses the action to call a tool.

For email tasks, FunctionAgent is the better choice. Function calling is more reliable for structured actions where you need exact parameter types: email addresses, subject lines, body content. ReActAgent can hallucinate malformed tool calls during the text-parsing step, which is annoying when the action is "send an email to the wrong address."

FunctionAgent also tends to be faster. The LLM returns structured tool calls directly instead of generating reasoning text that needs parsing. For a send-email action where latency matters, that difference adds up.

```python
from llama_index.core.agent.workflow import FunctionAgent, ReActAgent

# FunctionAgent: uses native tool calling, more reliable for structured I/O
email_agent = FunctionAgent(
    tools=[send_email],
    llm=OpenAI(model="gpt-4o-mini"),
)

# ReActAgent: uses text-based reasoning loop, flexible but less predictable
react_agent = ReActAgent(
    tools=[send_email],
    llm=OpenAI(model="gpt-4o-mini"),
)
```


Both work. FunctionAgent gives you fewer surprises when emails are involved.

## Error handling and retries

Here's where most agent email setups fall apart. The LLM calls your `send_email` tool. The API returns a 429 (rate limited) or a 500 (server error). What happens next?

By default, the exception propagates back to the agent as a tool error. The LLM sees the error message and might try again, or it might apologize to the user and give up. Neither response is ideal. You want deterministic retry logic inside the tool itself, not LLM-driven "maybe I'll try again" behavior.

```python
import asyncio
import httpx
import logging

logger = logging.getLogger("email_tool")

async def send_email(to: str, subject: str, body: str) -> str:
    """Send an email. Retries up to 3 times on transient failures."""
    max_retries = 3
    for attempt in range(max_retries):
        try:
            async with httpx.AsyncClient(timeout=10.0) as client:
                resp = await client.post(
                    "https://api.lobstermail.ai/v1/inboxes/my-inbox/send",
                    headers={"Authorization": f"Bearer {API_TOKEN}"},
                    json={"to": to, "subject": subject, "body": body},
                )
                resp.raise_for_status()
                logger.info(f"Sent email to {to} | subject={subject}")
                return f"Email sent to {to}"
        except httpx.HTTPStatusError as e:
            if e.response.status_code == 429 and attempt < max_retries - 1:
                wait = 2 ** attempt
                logger.warning(f"Rate limited, retrying in {wait}s")
                await asyncio.sleep(wait)
                continue
            logger.error(f"Failed to send email: {e}")
            raise
        except httpx.TimeoutException:
            if attempt < max_retries - 1:
                await asyncio.sleep(1)
                continue
            raise
    return "Failed to send after retries"
```

The logger.info line on every successful send gives you an audit trail. When your agent fires send_email 200 times in a workflow, you need to know exactly what was sent, to whom, and when.

## Multi-step email workflows

Single sends are straightforward. The interesting case is a workflow where the agent sends an email, waits for a reply, then follows up. LlamaIndex's Workflow class lets you orchestrate this as a state machine.

```python
import asyncio

from llama_index.core.workflow import (
    Event,
    StartEvent,
    StopEvent,
    Workflow,
    step,
)

class EmailSent(Event):
    to: str
    subject: str

class ReplyReceived(Event):
    body: str

class EmailWorkflow(Workflow):
    @step
    async def send_initial(self, ev: StartEvent) -> EmailSent:
        await send_email(
            to="contact@example.com",
            subject="Quick question about your API",
            body="Hi, I'd like to learn more about your pricing.",
        )
        return EmailSent(to="contact@example.com", subject="Quick question")

    @step
    async def wait_for_reply(self, ev: EmailSent) -> ReplyReceived:
        # Poll the inbox for a reply. check_inbox is your own helper
        # that queries your inbox provider's API.
        for _ in range(30):
            emails = await check_inbox(filter_from=ev.to)
            if emails:
                return ReplyReceived(body=emails[0].body)
            await asyncio.sleep(60)
        return ReplyReceived(body="No reply received")

    @step
    async def follow_up(self, ev: ReplyReceived) -> StopEvent:
        if "No reply" in ev.body:
            await send_email(
                to="contact@example.com",
                subject="Following up",
                body="Just checking in on my previous message.",
            )
        return StopEvent(result="done")
```

This is where the gap between dummy functions and real infrastructure becomes obvious. You need an actual inbox to poll for replies. You need delivery guarantees so the initial email lands. You need rate awareness so the follow-up doesn't get throttled.

## Which LLMs work with FunctionAgent

Not every LLM supports function calling. Here's what works with LlamaIndex FunctionAgent as of early 2026:

| LLM provider | Models | Notes |
|---|---|---|
| OpenAI | GPT-4o, GPT-4o-mini, GPT-4 Turbo | Most reliable function calling |
| Anthropic | Claude 3.5, Claude 3 Opus/Sonnet/Haiku | Solid tool use support |
| Mistral | Mistral Large, Mistral Medium | Works well for structured tools |
| Ollama | Llama 3, Mixtral (with tool support) | Local, variable quality |

GPT-4o-mini is the sweet spot for email tasks. It's fast, cheap, and handles structured tool calls reliably. You don't need a frontier model to fill in `to`, `subject`, and `body` fields.

## From tutorial to production

The jump from a dummy `send_message` function to production email involves five concrete changes:

1. **Replace the dummy with an API call.** Use an HTTP client to hit a transactional email API. Raw SMTP connections from agent code are fragile and don't handle bounce processing.

2. **Add retry logic inside the tool.** Don't let the LLM decide whether to retry. Exponential backoff for 429s and 5xx errors, with a hard cap of 3 attempts.

3. **Log every send.** Timestamp, recipient, subject, status code. When something goes wrong at 2 AM, logs are all you have.

4. **Handle rate limits at the infrastructure level.** If your agent can trigger 500 sends in a loop, you need send-rate caps that exist outside the agent's control. LobsterMail enforces per-inbox rate limits on the API side, which means a runaway agent can't burn your domain reputation before you notice.

5. **Use a real inbox for replies.** If your workflow involves receiving email (verification codes, replies, notifications), you need actual inbox infrastructure, not mock data.

If you want your agent to handle email without building all this plumbing yourself, <InlineGetStarted>set up LobsterMail in one click</InlineGetStarted>. Your agent gets a real inbox, send/receive capabilities, and built-in rate limiting through a single SDK call.

## Where to go from here

Start with the FunctionAgent + real API call pattern shown above. Get a single email flowing end-to-end before you add workflows or multi-step sequences. Test with a real recipient address (your own) and verify the email actually arrives in the inbox, not spam.

Once single sends work reliably, build up to the Workflow pattern for sequences that involve waiting for replies. Keep your tool functions thin: they should call an API and return a result, not contain business logic. Let the agent (or the Workflow orchestration) handle decisions.

The LlamaIndex agent framework is good at reasoning about when to use tools. The part it can't solve for you is reliable email delivery. That's an infrastructure problem, and it needs infrastructure solutions.

<FAQ>
  <FAQItem question="What is LlamaIndex FunctionAgent and how does it differ from ReActAgent?">
    FunctionAgent uses the LLM's native function calling API to select and invoke tools in a single structured call. ReActAgent uses a text-based reasoning loop (thought → action → observation) and parses tool calls from generated text. FunctionAgent is more reliable for structured actions like sending emails because it avoids text-parsing errors.
  </FAQItem>
  <FAQItem question="How do I define an email-sending tool for a LlamaIndex FunctionAgent?">
    Write an async Python function with typed parameters (`to: str`, `subject: str`, `body: str`) and a docstring. Pass it to `FunctionAgent(tools=[send_email], llm=your_llm)`. The agent will call it when the conversation requires sending an email.
  </FAQItem>
  <FAQItem question="Which LLMs are compatible with LlamaIndex FunctionAgent?">
    OpenAI (GPT-4o, GPT-4o-mini, GPT-4 Turbo), Anthropic (Claude 3 family), Mistral (Large, Medium), and some local models via Ollama. The LLM must support native function/tool calling in its API.
  </FAQItem>
  <FAQItem question="How do I import FunctionAgent from llama_index?">
    Use `from llama_index.core.agent.workflow import FunctionAgent`. This is the current import path as of LlamaIndex 0.11+. Older versions may use different paths.
  </FAQItem>
  <FAQItem question="Can I use LlamaIndex FunctionAgent with GPT-4o-mini for email tasks?">
    Yes, and it's a good choice. GPT-4o-mini handles structured tool calls reliably, runs faster than full GPT-4o, and costs less. Email tool calls have simple parameter structures that don't require a frontier model.
  </FAQItem>
  <FAQItem question="What happens when a LlamaIndex agent tool raises an exception mid-workflow?">
    The exception is returned to the LLM as a tool error message. The LLM may attempt to retry, apologize, or try a different approach. For email tools, you should handle retries inside the tool function itself rather than relying on the LLM's judgment.
  </FAQItem>
  <FAQItem question="How do I add retry logic to an email tool used inside a LlamaIndex FunctionAgent?">
    Wrap the HTTP call in a loop with exponential backoff. Retry on 429 (rate limit) and 5xx (server error) status codes. Cap retries at 3 attempts. This keeps retry behavior deterministic instead of LLM-driven.
  </FAQItem>
  <FAQItem question="How do I handle email rate limits when a LlamaIndex agent sends at scale?">
    Use an email API that enforces rate limits server-side so a runaway agent can't exceed them. Inside your tool, catch 429 responses and back off. LobsterMail enforces per-inbox send limits at the API level, which protects your domain reputation automatically.
  </FAQItem>
  <FAQItem question="How do I log every email sent by a LlamaIndex FunctionAgent for audit purposes?">
    Add a `logger.info()` call inside your `send_email` tool that records the timestamp, recipient, subject, and response status code. Use Python's built-in `logging` module and write to a persistent log file or logging service.
  </FAQItem>
  <FAQItem question="Can a LlamaIndex FunctionAgent handle multi-step email sequences like send, wait, and follow up?">
    Yes, using LlamaIndex's Workflow class. Define each step (send, poll for reply, follow up) as a `@step` method that emits events. The workflow engine handles state transitions between steps while the agent handles reasoning within each step.
  </FAQItem>
  <FAQItem question="What is the difference between a dummy email function in tutorials and a production email integration?">
    A dummy function returns a hardcoded string with no side effects. A production integration makes an HTTP call to a mail API, handles authentication, retries on failures, logs every send, and deals with bounce notifications. The agent logic is the same; the tool implementation is completely different.
  </FAQItem>
  <FAQItem question="How do I connect a LlamaIndex FunctionAgent to a transactional email API like LobsterMail?">
    Replace the dummy function body with an `httpx` POST request to the email API's send endpoint. Include your API token in the Authorization header and pass `to`, `subject`, and `body` as JSON. The function signature and agent registration stay the same.
  </FAQItem>
  <FAQItem question="Is async support required for email tools used inside LlamaIndex FunctionAgent workflows?">
    It's strongly recommended. LlamaIndex workflows run async by default, and blocking HTTP calls inside a sync tool function will stall the entire event loop. Use `httpx.AsyncClient` or `aiohttp` for the API calls in your email tool.
  </FAQItem>
  <FAQItem question="How do I pass dynamic recipient addresses to an email tool at agent runtime?">
    The LLM extracts recipient addresses from the user's message and passes them as the `to` parameter when calling the tool. Define `to: str` as a parameter in your function signature with a clear docstring, and the function-calling LLM handles argument extraction automatically.
  </FAQItem>
  <FAQItem question="How do I evaluate a FunctionAgent that sends emails without actually sending them?">
    Use a mock transport or test API key that accepts requests but doesn't deliver. LobsterMail test tokens (prefixed `lm_sk_test_`) accept API calls without sending real email. Ragas metrics can then evaluate whether the agent called the tool with correct arguments.
  </FAQItem>
</FAQ>
