How to send email from a LlamaIndex agent

Learn how to send email from a LlamaIndex agent using GmailToolSpec, then see why production agents need dedicated inboxes instead of personal Gmail.

April 22, 2026 · 11 min read

Samuel Chenard, Co-founder

LlamaIndex agents can draft and send emails. The standard approach uses Google's Gmail API through the GmailToolSpec tool, which gives your agent access to your personal inbox. It works for prototypes. It falls apart in production for reasons that aren't obvious until you're debugging why Gmail suspended your account at 2am.

This guide walks through both paths: the Gmail tool approach for getting started, and agent-first email infrastructure for when your agents need their own inboxes.

How to send an email from a LlamaIndex agent (step-by-step)

Before you write a single line of Python, you need a Google Cloud project and OAuth credentials. Here's the full sequence:

1. Install the LlamaIndex Gmail integration: pip install llama-index-tools-google.
2. Create a Google Cloud project and enable the Gmail API.
3. Generate OAuth 2.0 credentials and download the credentials.json file.
4. Instantiate GmailToolSpec and load the tool list.
5. Create an OpenAIAgent (or ReActAgent) with the Gmail tools.
6. Prompt the agent to send an email with a recipient, subject, and body.
7. Verify delivery in the recipient's inbox and check your Gmail "Sent" folder.

Here's what that looks like in code:

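from llama_index.tools.google import GmailToolSpec
from llama_index.agent.openai import OpenAIAgent

# Uses credentials.json and triggers the OAuth consent flow on first run.
tool_spec = GmailToolSpec()
tools = tool_spec.to_tool_list()

# OpenAIAgent reads the OpenAI key from the OPENAI_API_KEY environment variable.
# verbose=True logs every tool call the agent makes.
agent = OpenAIAgent.from_tools(tools, verbose=True)

response = agent.chat("Send an email to colleague@example.com with subject 'Weekly sync notes' and body 'Attached are the notes from today's sync.'")
print(response)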

Under the hood, the agent calls create_draft and then send_draft. The GmailToolSpec exposes several tools: load_data (read emails), search_messages, create_draft, update_draft, send_draft, and get_draft. Notice there's no direct send_email tool: the agent creates a draft first, then sends it.

If you set verbose=True on the agent, you'll see each tool call logged to the console. This is extremely helpful during development because you can verify exactly which Gmail operations the agent is performing and in what order. Without verbose logging, a failed draft creation can silently break the entire email flow, and you won't see any error unless you inspect the return value manually.

What is the GmailToolSpec in LlamaIndex?

GmailToolSpec is a LlamaIndex tool specification that wraps the Gmail API. It handles OAuth token management and exposes Gmail operations as callable tools that any LlamaIndex agent can use. You need credentials.json from a Google Cloud project with the Gmail API enabled and the following OAuth scopes: gmail.compose, gmail.readonly, and gmail.send.

The first time you run it, a browser window opens for OAuth consent. After that, a token.json file stores the refresh token locally. This is fine for development on your own machine. It's a problem when your agent runs on a server, in a container, or as part of a multi-agent system where nobody is around to click "Allow."

One detail that trips people up: the OAuth consent screen for unverified apps shows a scary warning to users. Google requires you to go through a verification process if you want to remove that warning, and verification can take weeks. For a personal project where you're the only user, you can click through the warning. For anything involving other users or automated deployments, this becomes a real blocker.

The GmailToolSpec also doesn't handle token refresh failures gracefully. If the refresh token is revoked or the credentials file is deleted, the tool will throw an unhandled exception. Your agent won't retry, won't notify you, and won't attempt to re-authenticate. It just stops working. You'll need to wrap the tool calls in your own error handling to catch these cases.
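One way to add that error handling is a small wrapper around each tool function. The sketch below is an assumption about how you might structure it, not part of LlamaIndex: guard_tool and on_failure are hypothetical names, and the wrapper catches every exception on purpose so that auth, network, and API failures all come back to the agent as a readable string.

```python
import functools
import logging
from typing import Callable, Optional

logger = logging.getLogger("agent.email")


def guard_tool(
    fn: Callable[..., str],
    on_failure: Optional[Callable[[Exception], None]] = None,
) -> Callable[..., str]:
    """Wrap a tool function so exceptions become a readable result string.

    guard_tool and on_failure are hypothetical names used for illustration.
    """

    @functools.wraps(fn)
    def wrapped(*args, **kwargs) -> str:
        try:
            return fn(*args, **kwargs)
        except Exception as exc:  # broad on purpose: auth, network, API errors
            logger.error("email tool %s failed: %s", fn.__name__, exc)
            if on_failure is not None:
                on_failure(exc)  # e.g. notify whoever operates the agent
            return f"ERROR: {fn.__name__} failed ({type(exc).__name__}: {exc})"

    return wrapped
```

The wrapped function keeps its name and docstring, so it can be registered with FunctionTool.from_defaults like any other function. Returning an error string rather than raising lets the agent see that the send failed instead of stopping silently.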

create_draft vs. send_draft

LlamaIndex's Gmail integration splits email sending into two steps. create_draft builds the email and saves it in your Gmail drafts folder. send_draft takes a draft ID and actually delivers it. This two-step approach exists because Google's API treats drafts and sent messages as separate operations.

In practice, your agent will chain these calls automatically when you ask it to "send an email." But the distinction matters for error handling. If create_draft succeeds but send_draft fails (rate limit, network issue, revoked token), you'll have orphaned drafts piling up in your Gmail account. Your agent won't know about them unless you build cleanup logic.

Here's a practical example of what can go wrong. Say your agent processes a queue of 50 customer emails overnight. It creates drafts for all 50, then tries to send them one by one. After email 38 goes out, Gmail starts returning 429 rate-limit errors. Emails 39 through 50 are now sitting as unsent drafts. Unless you wrote logic to track which drafts were sent and retry the rest, those 12 customers never get a response. Worse, if the agent runs again the next morning, it might create a second set of drafts for those same 12 customers, and you end up with duplicate unsent drafts mixed into your personal Gmail.

You can mitigate this with a wrapper function that tracks draft IDs and their send status, but at that point you're building email infrastructure on top of a personal email client, which is exactly the wrong direction.
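For illustration, here is a minimal sketch of such a tracking wrapper. DraftLedger is a hypothetical class, and the create_draft and send_draft callables are injected stand-ins whose signatures are assumptions, not the GmailToolSpec signatures. The ledger lives in memory only; a real deployment would need to persist it so a rerun the next morning can see it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DraftLedger:
    """Tracks which drafts were created and which were actually sent.

    Hypothetical helper: the two callables are injected, and their
    signatures here are assumptions made for the sketch.
    """

    create_draft: Callable[[str, str, str], str]  # (to, subject, body) -> draft ID
    send_draft: Callable[[str], None]             # raises on failure
    status: Dict[str, str] = field(default_factory=dict)     # key -> "drafted" | "sent"
    draft_ids: Dict[str, str] = field(default_factory=dict)  # key -> draft ID

    def queue(self, key: str, to: str, subject: str, body: str) -> None:
        """Create a draft once per key, so reruns don't duplicate drafts."""
        if key in self.status:
            return
        self.draft_ids[key] = self.create_draft(to, subject, body)
        self.status[key] = "drafted"

    def flush(self) -> List[str]:
        """Try to send every unsent draft; return the keys that still failed."""
        failed = []
        for key, state in self.status.items():
            if state == "sent":
                continue
            try:
                self.send_draft(self.draft_ids[key])
                self.status[key] = "sent"
            except Exception:
                failed.append(key)  # stays "drafted" and is retried on next flush
        return failed
```

Keying each draft by something stable, such as a customer or ticket ID, is what prevents the duplicate drafts described above: a second queue call with the same key is a no-op.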

The Gmail ceiling

Gmail works fine when one agent sends a handful of emails during development. The problems start when you move toward anything resembling production:

Rate limits hit fast. Gmail allows about 500 emails per day for consumer accounts and 2,000 for Google Workspace. Your agent has no way to know it's approaching the limit until it gets a 429 error. At that point, the email is lost unless you've built retry logic.

OAuth tokens expire. The refresh token works until the user revokes access, changes their password, or the Google Cloud project's consent screen goes unverified for too long. When the token breaks, your agent goes silent. No emails sent, no errors surfaced to the end user.

Every agent shares your inbox. If you run three LlamaIndex agents that all use your Gmail, their conversations are mixed together in one inbox. Agent A's customer replies land next to Agent B's vendor follow-ups. There's no isolation, no way to route replies to the right agent, and no audit trail showing which agent sent what.

Credential security is messy. Hardcoding a Gmail app password in a script is a genuine security risk. Storing credentials.json and token.json on disk means anyone with access to that filesystem can impersonate your email. For a local prototype, maybe acceptable. For a deployed agent, not great.

Deliverability degrades unpredictably. Gmail monitors sending patterns closely. If your agent starts sending more emails than usual, or if recipients mark a few messages as spam, Google can throttle your entire account. This doesn't just affect your agent; it affects your personal email too. You might find that your regular emails to colleagues start landing in their spam folders because your agent's automated messages triggered Google's abuse detection.

No programmatic monitoring. Gmail doesn't offer webhook notifications for delivery failures. If an email bounces, the bounce notification arrives as a new email in your inbox. Your agent would need to continuously poll for bounce messages and parse them, which is fragile and unreliable. In a production system, you need delivery status callbacks, not inbox polling.

These aren't theoretical problems. The LlamaIndex GitHub repo has example projects that store credentials as plaintext files. The README acknowledges this is for demonstration purposes, but plenty of people copy-paste demo code into production.
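The rate-limit item above mentions retry logic. A minimal sketch of exponential backoff with jitter could look like the following. RateLimited is a hypothetical exception that your own send function would raise when the API answers with HTTP 429, and send_with_backoff is an illustrative name, not a LlamaIndex or Gmail API.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical marker the caller raises when the API answers HTTP 429."""


def send_with_backoff(
    send: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call send(), retrying on RateLimited with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # give up; the caller should record the message as unsent
            # Delay doubles each attempt; jitter avoids synchronized retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise AssertionError("unreachable")
```

Backoff of this kind helps with short bursts. If the account has hit its daily quota, waiting a few seconds will not help, so the final exception is the signal to park the message and try again later.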

What agent-first email looks like

The alternative is giving each agent its own email address from the start. No OAuth dance, no shared inbox, no credential files on disk.

With LobsterMail, the agent creates its own inbox at runtime:

// Assumes lm is an already-initialized LobsterMail client (setup not shown here).
const inbox = await lm.createInbox();

await inbox.send({
  to: ['recipient@example.com'],
  subject: 'Weekly sync notes',
  body: { text: 'Here are the notes from today.' },
});

That's it. No credentials.json, no token refresh, no Google Cloud project. The agent gets a dedicated address like inbox-a7x9@lobstermail.ai, and every email it sends or receives is isolated to that inbox. If you have five agents, you have five inboxes. Replies route back to the correct agent automatically.

This matters for LlamaIndex deployments specifically because LlamaIndex agents are often part of larger pipelines. A ReAct agent might read an email, process the content through a RAG pipeline, generate a response, and send a reply. If the email infrastructure breaks mid-pipeline (expired token, rate limit), the entire chain fails. Dedicated inboxes remove that failure point.

The isolation also solves a compliance question that comes up in enterprise deployments. When an agent sends email from your personal Gmail, that email is legally "from you." If the agent hallucinates a commitment, sends incorrect information, or replies to the wrong person, that message carries your name and your email address. With a dedicated agent inbox, the agent's identity is separate from yours. You can still monitor what it sends, but there's a clear boundary between human-sent and agent-sent communication.

Connecting LobsterMail to a LlamaIndex agent

LlamaIndex's tool system is flexible enough to wrap any API. You can create a custom FunctionTool that calls LobsterMail instead of Gmail:

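import os

import requests
from llama_index.core.tools import FunctionTool

# Assumed environment variable names; set them to your own inbox ID and API key.
INBOX_ID = os.environ["LOBSTERMAIL_INBOX_ID"]
API_KEY = os.environ["LOBSTERMAIL_API_KEY"]

def send_email(to: str, subject: str, body: str) -> str:
    """Send an email via LobsterMail."""
    response = requests.post(
        f"https://api.lobstermail.ai/v1/inboxes/{INBOX_ID}/messages",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "to": [to],
            "subject": subject,
            "body": {"text": body},
        },
        timeout=30,
    )
    # Surface HTTP errors (4xx/5xx) instead of failing later on a missing "id" key.
    response.raise_for_status()
    return f"Email sent: {response.json()['id']}"

# Wrap the function so a LlamaIndex agent can call it as a tool.
email_tool = FunctionTool.from_defaults(fn=send_email)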

Pass email_tool to your agent the same way you'd pass the Gmail tools. The agent doesn't care which email provider is behind the function. It just calls the tool when it needs to send a message.

You can also build a receive_email tool that checks the inbox for new messages. LobsterMail's REST API lets you list messages in an inbox, fetch individual messages, and filter by read/unread status. Wrapping these endpoints as FunctionTool instances gives your agent full bidirectional email capabilities without any Gmail dependency.

For teams running multiple agents, you can create a factory function that provisions a new inbox and returns a complete set of email tools (send, receive, list, delete) scoped to that inbox. Each agent gets its own tool set, its own email address, and its own message history. There's no cross-contamination between agents.
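A sketch of that factory idea follows, with several assumptions stated up front. It takes an inbox ID that has already been provisioned, because the provisioning endpoint isn't shown in this article. Only the send path matches the earlier example; the list and delete paths are guesses at a conventional REST layout. make_email_tools and the injected transport callable are hypothetical names chosen so the HTTP layer can be swapped out.

```python
from typing import Any, Callable, Dict, Optional

# Base URL taken from the send example; the list/delete paths are assumptions.
API_BASE = "https://api.lobstermail.ai/v1"

# transport(method, url, json_body) -> decoded JSON response (hypothetical shape)
Transport = Callable[[str, str, Optional[dict]], Any]


def make_email_tools(inbox_id: str, transport: Transport) -> Dict[str, Callable[..., Any]]:
    """Return plain functions scoped to one inbox through a closure."""
    messages_url = f"{API_BASE}/inboxes/{inbox_id}/messages"

    def send_email(to: str, subject: str, body: str) -> str:
        """Send an email from this agent's inbox."""
        payload = {"to": [to], "subject": subject, "body": {"text": body}}
        result = transport("POST", messages_url, payload)
        return f"Email sent: {result['id']}"

    def list_emails() -> Any:
        """List messages in this agent's inbox (assumed endpoint)."""
        return transport("GET", messages_url, None)

    def delete_email(message_id: str) -> Any:
        """Delete one message from this agent's inbox (assumed endpoint)."""
        return transport("DELETE", f"{messages_url}/{message_id}", None)

    return {
        "send_email": send_email,
        "list_emails": list_emails,
        "delete_email": delete_email,
    }
```

Each returned function can then be wrapped with FunctionTool.from_defaults. Because the inbox ID is captured in the closure, an agent has no parameter through which it could address another agent's inbox.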

The difference shows up in production. No OAuth tokens to rotate. No sending quota shared with your personal account (LobsterMail's Builder plan at $9/mo allows 5,000 emails per month with up to 500 per day). No shared inbox confusion. And if an agent misbehaves, you revoke that one inbox without affecting anything else.

Handling replies and threading

One area where the Gmail approach and the dedicated inbox approach diverge sharply is reply handling. With GmailToolSpec, incoming replies arrive in your personal inbox mixed with everything else. Your agent would need to poll search_messages periodically, filter for replies to messages it sent, and parse the response. This works, but it's polling-based, slow, and noisy.

With LobsterMail, inbound messages trigger webhooks. You configure a URL, and every time someone replies to your agent's email, LobsterMail sends a POST request to your server with the full message payload. Your LlamaIndex agent can process the reply immediately without polling. This is especially useful for agents that need to maintain conversational context across multiple email exchanges.
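On the receiving side, a webhook endpoint can be as small as the standard-library sketch below. The payload shape is an assumption (a JSON object), and parse_inbound, make_handler, and serve are hypothetical names. A production endpoint would also verify that the request really came from the provider, using whatever mechanism the provider documents.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Dict


def parse_inbound(raw: bytes) -> Dict:
    """Decode a webhook body; raise ValueError if it is not a JSON object."""
    payload = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    return payload


def make_handler(on_message: Callable[[Dict], None]):
    """Build a request handler that forwards each inbound payload to on_message."""

    class InboundHandler(BaseHTTPRequestHandler):
        def do_POST(self) -> None:
            length = int(self.headers.get("Content-Length", 0))
            try:
                payload = parse_inbound(self.rfile.read(length))
            except ValueError:
                self.send_response(400)  # reject malformed bodies
                self.end_headers()
                return
            on_message(payload)  # e.g. hand the message to the agent
            self.send_response(204)  # acknowledge; no response body needed
            self.end_headers()

        def log_message(self, *args) -> None:
            pass  # silence the default per-request logging

    return InboundHandler


def serve(on_message: Callable[[Dict], None], port: int = 8080) -> None:
    """Block and serve webhooks on the given port."""
    HTTPServer(("0.0.0.0", port), make_handler(on_message)).serve_forever()
```

Calling serve with a callback that queues the payload for the agent keeps the HTTP response fast, which matters if the sender retries slow webhooks.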

Threading is handled through inReplyTo and threadId fields. When your agent replies to a message, it includes the original message's ID, and the reply is automatically grouped into the same thread. The recipient sees a normal email conversation in their inbox, not a series of disconnected messages from a bot.
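As a rough illustration of how those fields might be used, the sketch below builds a reply payload and groups messages by thread. The inReplyTo and threadId names come from the description above; the id, from, and subject keys on the inbound message are assumptions about the payload shape, and build_reply and group_by_thread are hypothetical helpers.

```python
from collections import defaultdict
from typing import Any, Dict, List


def build_reply(original: Dict[str, Any], text: str) -> Dict[str, Any]:
    """Build a send payload that threads under the original message.

    Assumes the inbound message exposes "id", "from", and "subject" keys.
    """
    subject = original.get("subject", "")
    if not subject.lower().startswith("re:"):
        subject = f"Re: {subject}"
    return {
        "to": [original["from"]],
        "subject": subject,
        "body": {"text": text},
        "inReplyTo": original["id"],  # links the reply to the original message
    }


def group_by_thread(messages: List[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]:
    """Group messages by threadId so the agent can rebuild conversation context."""
    threads: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for message in messages:
        threads[message["threadId"]].append(message)
    return dict(threads)
```

Grouping by threadId before prompting gives the agent the whole exchange in order, rather than only the latest reply.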

When to use which approach

Use GmailToolSpec when you're prototyping locally, building a personal assistant agent that sends email as you, or working through a LlamaIndex tutorial. It's the fastest way to get an agent sending email.

Switch to dedicated agent inboxes when you're deploying agents that run unattended, managing multiple agents that each need their own email identity, or sending more than a few emails per day. The setup time is roughly the same (a few lines of code either way), but the operational headaches are vastly different.

Here's a quick comparison to help you decide:

Factor	GmailToolSpec	Dedicated agent inbox
Setup time	10 minutes + Google Cloud project	5 minutes, no cloud console needed
Daily send limit	500 (consumer) or 2,000 (Workspace)	500/day on Builder plan
OAuth required	Yes, with token refresh	No
Agent isolation	None, shared inbox	Full, one inbox per agent
Reply handling	Polling via search_messages	Webhooks, real-time
Credential storage	Files on disk	API key in environment variable
Cost	Free (Gmail), but Workspace is $6+/user/mo	Free tier available; Builder at $9/mo

For most developers who are past the prototyping phase and moving toward deployed agents, dedicated inboxes save hours of debugging and operational maintenance. The initial setup is comparable, but you avoid building the token management, retry logic, and inbox routing that Gmail-based agents eventually require.

If you want your LlamaIndex agent to have its own inbox without the Gmail dependency, LobsterMail's setup instructions can be pasted to your agent, and it handles the rest.

Frequently asked questions

How do I install the LlamaIndex Gmail tool and what dependencies are required?

Run pip install llama-index-tools-google. This pulls in the Google API client libraries and the LlamaIndex tool interface. You also need a credentials.json file from a Google Cloud project with the Gmail API enabled.

What OAuth scopes does the LlamaIndex GmailToolSpec need to send email?

GmailToolSpec requires gmail.compose, gmail.readonly, and gmail.send scopes. These are requested during the OAuth consent flow the first time you run the tool.

How is create_draft different from send_draft in the LlamaIndex Gmail tool?

create_draft saves an email to your Gmail drafts folder and returns a draft ID. send_draft takes that ID and delivers the message. The agent chains both calls when you ask it to send an email, but they're separate API operations.

Can I use a service account instead of a personal Gmail to send email from a LlamaIndex agent?

Yes, but only with Google Workspace. Service accounts can impersonate Workspace users via domain-wide delegation. Consumer Gmail accounts don't support service account access for sending.

What happens when a LlamaIndex agent hits Gmail's daily sending limit?

The Gmail API returns a 429 rate-limit error. The agent's tool call fails, and unless you've built retry logic with backoff, the email is dropped. Consumer accounts are limited to around 500 emails per day.

Is it safe to hardcode my Gmail app password in a LlamaIndex script?

No. Anyone with access to your codebase or deployment environment can read and use those credentials. Use environment variables or a secrets manager. Better yet, use an email provider that doesn't require personal credentials.

How can I make a LlamaIndex agent send email without using Gmail?

Wrap any email API (SMTP, transactional provider, or LobsterMail) in a LlamaIndex FunctionTool. The agent calls it the same way it calls Gmail tools. See the code example in this article.

Can a LlamaIndex agent automatically reply to incoming emails?

With Gmail, the agent can poll for new messages using search_messages, then compose a reply. With LobsterMail, inbound emails trigger webhooks so the agent reacts in real time without polling.

What is an agent-first email inbox?

An inbox created by and for an AI agent, without human signup or OAuth. The agent provisions its own email address at runtime and owns it independently. LobsterMail is built around this concept.

How do I give each AI agent its own dedicated email address?

With LobsterMail, each call to createInbox() returns a unique email address. Each agent gets its own isolated inbox, so conversations and replies never cross between agents.

How do I parse inbound email replies so a LlamaIndex agent can continue a thread?

LobsterMail supports threading via inReplyTo and threadId fields. When replying, pass the original message's ID in the inReplyTo parameter, and the reply is automatically linked to the conversation thread.

Can LlamaIndex agents send emails with attachments?

The GmailToolSpec doesn't expose attachment support directly. With LobsterMail, you can include up to 10 base64-encoded attachments per message using the attachments parameter in the send call.

What are the deliverability risks of sending automated emails from Gmail?

Google monitors sending patterns. Sudden spikes, high bounce rates, or spam reports from recipients can trigger account suspension. Gmail wasn't designed for automated agent sending, and Google's abuse detection doesn't distinguish between a bot and a compromised account.

Is LobsterMail free to use?

Yes. The free tier includes receiving emails and up to 1,000 emails per month with no credit card required. Sending requires Tier 1 verification through X (Twitter) or a payment method.