Gmail API 429 errors and the Retry-After header: what your agent needs to know

Gmail API 429 rate limit errors kill agent workflows. Here's what the Retry-After header means, why agents hit it, and how to handle it.

10 min read
Ian Bussières, CTO & Co-founder

Your agent just tried to check its inbox. Gmail responded with HTTP 429: Too Many Requests. The workflow stalled. The verification code it needed expired three minutes later, and the signup flow it was running is now dead.

If you've wired your agent to Gmail's API, you've probably seen this. The 429 status code means your agent exceeded Google's rate limits, and the Retry-After header (when Google bothers to include it) tells you how many seconds to wait before trying again. Sounds simple. In practice, it's one of the most frustrating failure modes in agent email infrastructure.

Let's break down exactly what's happening, why agents trigger this more than human users, and what you can do about it.

What the 429 actually means

HTTP 429 is defined in RFC 6585. It means the client has sent too many requests in a given time window. Gmail's API enforces per-user and per-project quotas, and when your agent exceeds either one, Google returns a 429 with an optional Retry-After header.

The header value is either a number of seconds or an HTTP-date:

HTTP/1.1 429 Too Many Requests
Retry-After: 30

That means "wait 30 seconds before sending another request." Your agent should parse this and pause accordingly. If the header is missing (which happens more often than you'd expect with Google's APIs), you're left guessing.
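Here's a minimal sketch of that parsing step, covering both header forms. The name `parseRetryAfter` and the injectable `now` parameter are illustrative choices, not part of any Google SDK. It returns `null` when the header is missing or unreadable so the caller knows to fall back to backoff.

```typescript
// Hypothetical helper: convert a Retry-After header value into milliseconds.
// Returns null when the header is missing or cannot be parsed.
function parseRetryAfter(value: string | null, now: number = Date.now()): number | null {
  if (value === null) return null;
  const trimmed = value.trim();
  if (trimmed === '') return null;

  // Form 1: delta-seconds, e.g. "30"
  if (/^\d+$/.test(trimmed)) {
    return parseInt(trimmed, 10) * 1000;
  }

  // Form 2: HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
  const date = Date.parse(trimmed);
  if (isNaN(date)) return null;

  // A date in the past means "retry now", so never return a negative wait.
  return Math.max(0, date - now);
}
```

With a fetch-style response, you'd call it as `parseRetryAfter(res.headers.get('Retry-After'))` and treat a `null` result as "no guidance from Google".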

Google's documentation recommends exponential backoff with jitter: wait 1 second, then 2, then 4, adding randomness to each interval. The idea is to prevent thundering herd problems when many clients retry simultaneously.

Why agents hit Gmail rate limits constantly

A human checks email a few times an hour. An agent running an automated workflow might poll every 5 seconds, checking for a verification code or a reply it needs to process. That polling frequency burns through Gmail's quotas fast.

Google allocates roughly 250 quota units per user per second for the Gmail API, but different operations cost different amounts. A messages.list call costs 5 units. A messages.get costs 5 units. A messages.send costs 100 units. If your agent is listing messages, reading them, and sending replies in a tight loop, it can exhaust its quota in seconds.

Here's the part that makes this especially painful for agents: the quota system is per-project, not just per-user. If you have multiple agents sharing the same Google Cloud project (which is common when one developer builds several agents), they all compete for the same rate limit pool. Agent A polling for invoices can starve Agent B waiting for a verification code.

The Gmail API also has daily sending limits. Free Gmail accounts can send about 500 messages per day. Google Workspace accounts get 2,000. Exceed those, and you won't just get a 429. Google may temporarily lock the account's sending capability entirely, with no clear timeline for restoration.

The quota math, spelled out

Let's walk through a realistic scenario. Say your agent needs to monitor an inbox for incoming verification emails and respond to them automatically. A simple implementation might:

  1. Call messages.list every 5 seconds to check for new mail (5 quota units per call)
  2. Call messages.get for each new message to read the body (5 quota units per call)
  3. Call messages.send to reply or forward the code (100 quota units per call)

At 5-second intervals, that's 12 list calls per minute, costing 60 quota units. If your agent finds one new message per minute and reads it, that's another 5 units. Sending a reply adds 100. So a single agent doing light work consumes roughly 165 quota units per minute. That sounds manageable until you remember the limit is enforced per second, not averaged over the minute. Three replies sent in the same second cost 300 units, which is already over a 250-unit budget. A sudden flurry of incoming messages can push your agent well beyond the per-second limit, triggering a 429 even though your average usage looks reasonable.

Now multiply this by three or four agents on the same project. Each one polling, reading, and occasionally sending. You can see how the quota pool drains fast. And once one agent triggers the limit, every agent on the project gets blocked.
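The arithmetic above can be sketched as a small calculator. The unit costs are the ones quoted in this article, so check Google's current documentation before relying on them. The `QUOTA_COST` table and `unitsPerMinute` function are illustrative names, not part of the Gmail API.

```typescript
// Quota unit costs as quoted in this article (an assumption to verify against Google's docs).
const QUOTA_COST = {
  'messages.list': 5,
  'messages.get': 5,
  'messages.send': 100,
} as const;

type GmailMethod = keyof typeof QUOTA_COST;

// Hypothetical helper: total quota units for the calls an agent makes in one minute.
function unitsPerMinute(callsPerMinute: Partial<Record<GmailMethod, number>>): number {
  let total = 0;
  for (const method of Object.keys(QUOTA_COST) as GmailMethod[]) {
    total += (callsPerMinute[method] ?? 0) * QUOTA_COST[method];
  }
  return total;
}

// One agent: polling every 5 seconds (12 lists), plus one read and one send per minute.
const singleAgent = unitsPerMinute({
  'messages.list': 12,
  'messages.get': 1,
  'messages.send': 1,
});
// 12 * 5 + 1 * 5 + 1 * 100 = 165

// Four agents on the same project draw from the same pool.
const fourAgents = 4 * singleAgent; // 660
```

These are per-minute averages. The per-second spikes are what actually trip the limit, so treat the totals as a floor on your exposure, not a guarantee of headroom.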

The real problem isn't the retry; it's the architecture

Exponential backoff works fine for occasional rate limit hits. But when your agent's core workflow depends on polling Gmail every few seconds, backoff means the agent goes blind. It can't see incoming email during the wait period. For time-sensitive workflows like email verification, two-factor codes, or customer support triage, even a 30-second delay can break things.

Some developers try to work around this with batch requests. Gmail's API supports batching up to 100 calls in a single HTTP request, but each inner call still counts against your quota. Batching reduces HTTP overhead, not quota consumption.

Others try multiple Google Cloud projects to get separate quota pools. This works, technically, but Google's Terms of Service prohibit creating multiple projects specifically to circumvent rate limits. It's the kind of thing that works until it doesn't, and when Google notices, it can suspend all the projects at once.

The deeper issue is that Gmail's API was designed for apps that augment human email use. Read a message here, send a reply there, sync a folder in the background. It was not designed for autonomous agents that need to self-provision inboxes, poll continuously, and react to incoming mail in real time. You're fighting the architecture.

Handling 429s properly (if you're sticking with Gmail)

If you need to keep using the Gmail API, here's how to handle rate limits without breaking your agent's workflow.

Parse the Retry-After header first. Don't assume a fixed backoff. If Google tells you to wait 60 seconds, wait 60 seconds. If the header is absent, start with a 1-second delay and double it on each consecutive 429, up to a cap of 64 seconds.

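async function withRetry(fn: () => Promise<Response>, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const res = await fn();
    if (res.status !== 429) return res;

    // Default: exponential backoff (1s, 2s, 4s, ...) capped at 64 seconds.
    let wait = Math.min(1000 * Math.pow(2, attempt), 64000);

    // Retry-After may be a number of seconds or an HTTP-date. Use it when
    // it parses; otherwise keep the backoff value computed above.
    const retryAfter = res.headers.get('Retry-After');
    if (retryAfter) {
      const seconds = Number(retryAfter);
      const fromHeader = Number.isFinite(seconds)
        ? seconds * 1000
        : Date.parse(retryAfter) - Date.now();
      if (Number.isFinite(fromHeader)) wait = Math.max(0, fromHeader);
    }

    // Jitter keeps multiple agents from retrying in lockstep.
    const jitter = Math.random() * 1000;
    await new Promise(r => setTimeout(r, wait + jitter));
  }
  throw new Error('Gmail API rate limit exceeded after max retries');
}
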
**Reduce polling frequency.** Instead of checking every 5 seconds, use Gmail's push notifications via [Pub/Sub](https://developers.google.com/gmail/api/guides/push). This lets Google notify your agent when new mail arrives, eliminating the need to poll. Setup requires a Google Cloud Pub/Sub topic, a verified webhook endpoint, and a `watch()` call that expires every 7 days and must be renewed. It's not trivial, but it drops your API usage dramatically.
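Because the watch expires, your agent needs a renewal check somewhere. This is a minimal sketch, assuming you store the `expiration` value from the last `watch()` response, which is an epoch timestamp in milliseconds delivered as a string. The `shouldRenewWatch` name and the one-day safety margin are illustrative choices.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: decide whether the stored watch needs renewing.
// `expiration` is the epoch-millisecond string saved from the last watch() response.
function shouldRenewWatch(
  expiration: string | null,
  now: number = Date.now(),
  marginMs: number = DAY_MS,
): boolean {
  if (expiration === null) return true; // never watched, or state was lost
  const expiresAt = Number(expiration);
  if (!isFinite(expiresAt)) return true; // unreadable value: renew to be safe
  return expiresAt - now <= marginMs;
}
```

Run the check on a schedule and call `watch()` again whenever it returns `true`. Renewing early is cheap; a lapsed watch means your agent silently stops hearing about new mail.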

**Separate quota pools per agent.** Give each genuinely independent agent its own Google Cloud project with its own OAuth credentials. This isolates rate limits so one agent can't starve another. Don't split a single agent across several projects to multiply its quota; that is the circumvention Google's terms prohibit.

**Cache aggressively.** Store message metadata locally and fetch only what changed by passing your last stored `historyId` to `history.list` as `startHistoryId`. This avoids re-fetching messages your agent has already processed.
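The local side of that cache can be very small. Here's a rough sketch of the bookkeeping, assuming an in-memory store. The `SyncState` class and its method names are hypothetical, and a real agent would persist this state between runs.

```typescript
// Hypothetical in-memory sync state; persist it somewhere durable in production.
class SyncState {
  private seen = new Set<string>();
  lastHistoryId: string | null = null;

  // Return only the message IDs that have not been processed yet, and remember them.
  takeNew(messageIds: string[]): string[] {
    const fresh: string[] = [];
    for (const id of messageIds) {
      if (!this.seen.has(id)) {
        this.seen.add(id);
        fresh.push(id);
      }
    }
    return fresh;
  }

  // Record the checkpoint to use as the starting point of the next incremental sync.
  checkpoint(historyId: string): void {
    this.lastHistoryId = historyId;
  }
}
```

Only the IDs returned by `takeNew` need a `messages.get` call, so each message costs you its 5 units once.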

**Implement circuit breakers.** Beyond simple retry logic, add a circuit breaker pattern. If your agent receives three 429s in a row, stop all Gmail calls for a cooldown period (say, two minutes) rather than continuing to hammer the API. This protects your quota from cascading failures and gives Google's rate limiter time to reset.
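A circuit breaker along those lines fits in a few dozen lines. This is a sketch, not a production library: the `CircuitBreaker` class, its method names, and the defaults (three consecutive 429s, two-minute cooldown) are illustrative and simply mirror the numbers in the paragraph above.

```typescript
// Minimal circuit breaker sketch. Names and defaults are illustrative.
class CircuitBreaker {
  private consecutive429s = 0;
  private openUntil = 0;
  private threshold: number;
  private cooldownMs: number;

  constructor(threshold: number = 3, cooldownMs: number = 2 * 60 * 1000) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
  }

  // True when calls are allowed; false while the breaker is open.
  canCall(now: number = Date.now()): boolean {
    return now >= this.openUntil;
  }

  // Report the status of each Gmail response so the breaker can track streaks.
  record(status: number, now: number = Date.now()): void {
    if (status !== 429) {
      this.consecutive429s = 0;
      return;
    }
    this.consecutive429s += 1;
    if (this.consecutive429s >= this.threshold) {
      this.openUntil = now + this.cooldownMs;
      this.consecutive429s = 0; // start a fresh count after the cooldown
    }
  }
}
```

Check `canCall()` before every Gmail request and pass each response status to `record()`. If several agents share a project, sharing one breaker between them keeps a single agent's streak from being invisible to the others.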

Common mistakes that make 429s worse

Beyond the architectural issues, there are several specific anti-patterns that make rate limit problems worse for agents.

Retrying without jitter. If your agent retries at exact exponential intervals (1s, 2s, 4s), and you have multiple agents doing the same thing, they'll all retry at the same moments. This creates synchronized bursts that trigger more 429s. Always add random jitter to your retry delays.
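To make the difference concrete, here's a small sketch of a delay calculator that adds jitter on top of the exponential schedule. The function name, parameters, and defaults are illustrative. The `random` argument is injectable only so you can test the behavior deterministically.

```typescript
// Hypothetical helper: capped exponential backoff plus additive random jitter.
function backoffWithJitter(
  attempt: number,
  baseMs: number = 1000,
  capMs: number = 64000,
  jitterMs: number = 1000,
  random: () => number = Math.random,
): number {
  // 1s, 2s, 4s, ... up to the cap
  const exponential = Math.min(capMs, baseMs * Math.pow(2, attempt));
  // Add up to jitterMs of randomness so agents drift apart instead of retrying together.
  return exponential + random() * jitterMs;
}
```

Two agents calling `backoffWithJitter(2)` will both wait at least 4 seconds, but they will almost never wake up in the same millisecond.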

Ignoring partial success. When your agent batches multiple operations and gets a 429 partway through, some operations may have succeeded. If you blindly retry the entire batch, you'll waste quota re-doing work that already completed. Track which operations succeeded and only retry the ones that failed.
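One way to track that is to classify each inner result before deciding what to resend. This is a sketch under assumptions: the `BatchResult` shape is hypothetical (you'd build it while parsing the batch response), and treating 429 and 5xx as retryable is a common convention, not a rule taken from Google's documentation.

```typescript
// Hypothetical shape for one inner result of a batch; field names are illustrative.
interface BatchResult {
  id: string;     // your own identifier for the operation
  status: number; // HTTP status of that inner response
}

// Split a batch into finished operations, ones worth retrying, and permanent failures.
function splitBatch(results: BatchResult[]): { done: string[]; retry: string[]; failed: string[] } {
  const done: string[] = [];
  const retry: string[] = [];
  const failed: string[] = [];
  for (const r of results) {
    if (r.status >= 200 && r.status < 300) {
      done.push(r.id);
    } else if (r.status === 429 || r.status >= 500) {
      retry.push(r.id);
    } else {
      failed.push(r.id); // other 4xx: retrying will not help
    }
  }
  return { done, retry, failed };
}
```

Only the `retry` list goes back through your backoff logic. Everything in `done` has already been paid for.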

Logging too aggressively on errors. Some agent frameworks log every HTTP response body for debugging. With 429 responses, this can create a cascade where the logging itself triggers more API calls (for example, sending error reports via email through the same Gmail API). Make sure your error handling path doesn't depend on the same rate-limited resource.

Not monitoring quota usage proactively. Google provides quota dashboards in the Cloud Console. If your agent is regularly consuming more than 70% of its available quota, you should receive an alert before you hit the ceiling, not after. Set up monitoring to catch rising usage before it becomes a problem.
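The Cloud Console is the source of truth, but a local estimate can warn you sooner. Here's a rough sketch of a client-side tracker that sums the units your agent believes it has spent in a rolling window and flags the 70% mark. The `QuotaMonitor` class is hypothetical, and the limit you pass in is your own assumption about the quota.

```typescript
// Hypothetical usage tracker: sums quota units spent in a rolling window.
class QuotaMonitor {
  private events: { at: number; units: number }[] = [];
  private limit: number;
  private windowMs: number;
  private alertRatio: number;

  constructor(limit: number, windowMs: number = 1000, alertRatio: number = 0.7) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.alertRatio = alertRatio;
  }

  // Call this each time the agent makes a Gmail request.
  record(units: number, now: number = Date.now()): void {
    this.events.push({ at: now, units });
  }

  // Units spent inside the current window; older events are discarded.
  usage(now: number = Date.now()): number {
    this.events = this.events.filter((e) => now - e.at < this.windowMs);
    return this.events.reduce((sum, e) => sum + e.units, 0);
  }

  // True once estimated usage reaches the alert threshold.
  shouldAlert(now: number = Date.now()): boolean {
    return this.usage(now) >= this.limit * this.alertRatio;
  }
}
```

This only sees calls made by the process it runs in, so agents sharing a project would need to report into a shared monitor for the number to mean anything.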

When the workaround becomes the whole job

I want to be honest about something. If you're spending more time engineering around Gmail's rate limits than building your agent's actual functionality, the architecture is telling you something.

Gmail's API requires OAuth consent screens, token refresh logic, scope approvals, and Cloud project configuration before your agent can send a single email. Layer rate limit handling on top of that and you've built a significant chunk of infrastructure just to give your agent an inbox.

This is the problem LobsterMail was built to solve. Your agent doesn't need a Gmail account. It needs an email address it can provision and use without human setup, OAuth flows, or rate limit gymnastics.

With LobsterMail, your agent creates its own inbox in one call:

import { LobsterMail } from '@lobsterkit/lobstermail';

const lm = await LobsterMail.create();
const inbox = await lm.createSmartInbox({ name: 'My Agent' });
const emails = await inbox.receive();

No API keys to configure. No OAuth tokens to refresh. No polling quotas to manage. No Retry-After headers to parse. The SDK handles account creation automatically, and inboxes are ready the moment your agent needs them.

The free tier gives you 1,000 emails per month with no credit card required. If your agent needs more volume, the Builder tier at $9/month covers most production workloads, and you skip the rate limit dance entirely.

When Gmail is still the right choice

To be fair, there are cases where Gmail makes sense. If your agent needs to send email from a specific person's existing Gmail address (a founder's personal email, a shared team inbox), you need the Gmail API. LobsterMail gives agents their own addresses at @lobstermail.ai or on your custom domain, but it doesn't act as a proxy for existing Gmail accounts.

If you're operating within Gmail's ecosystem and your volume is low enough that rate limits are rare, the retry logic above will keep things running. The calculus changes when your agent needs to operate autonomously, at scale, without a human refreshing OAuth tokens every time Google decides to revoke them.

For everything else, the simplest fix for a 429 error is not to retry harder. It's to stop depending on an API that wasn't built for your use case.


Frequently asked questions

What does HTTP 429 mean in the Gmail API?

It means your application has exceeded Gmail's rate limits. The server is telling your agent to slow down and retry after a specified delay.

Does Gmail always include a Retry-After header with 429 responses?

No. Google's APIs sometimes omit the Retry-After header on 429 responses. When it's missing, you should implement exponential backoff starting at 1 second.

What are Gmail API rate limits for free accounts?

Free Gmail accounts get roughly 250 quota units per user per second, with daily sending limits of about 500 messages. Google Workspace accounts get higher limits (2,000 sends per day) but still face per-second quotas.

Can I use multiple Google Cloud projects to avoid Gmail rate limits?

Technically yes, each project gets its own quota pool. But Google's Terms of Service prohibit creating multiple projects specifically to circumvent rate limits. If detected, Google may suspend all projects.

How do I implement exponential backoff for Gmail API 429 errors?

Start with a 1-second wait, double it on each consecutive 429 (2s, 4s, 8s, 16s), and add random jitter to prevent synchronized retries. Cap the maximum wait at 64 seconds. Always check the Retry-After header first.

What's the difference between per-user and per-project rate limits in Gmail?

Per-user limits cap how many requests a single authenticated user can make. Per-project limits cap total requests across all users in your Google Cloud project. Agents sharing a project compete for the same per-project pool.

Does Gmail Pub/Sub eliminate rate limit issues?

It eliminates polling-related rate limits since Google pushes notifications to you instead. But you still consume quota when fetching message content after receiving a notification, and the watch() subscription expires every 7 days.

Can LobsterMail replace Gmail for agent email?

Yes, for agents that need their own email addresses. LobsterMail lets agents self-provision inboxes without OAuth, API keys, or rate limit management. It won't work if your agent needs to send from an existing personal Gmail address.

Is LobsterMail free?

The free tier is $0/month with 1,000 emails included and no credit card required. The Builder tier at $9/month offers higher volume for production agents.

How does LobsterMail handle rate limiting compared to Gmail?

LobsterMail's SDK manages connection handling and retries internally. Agents don't need to implement backoff logic or parse Retry-After headers. The infrastructure is designed for agent polling patterns, not human email usage.

What happens if my agent exceeds Gmail's daily sending limit?

Google may temporarily lock your account's ability to send email. The lockout duration varies and isn't always communicated clearly. During lockout, your agent can still read email but cannot send any messages.

Can I use the Gmail API Retry-After header value as an exact timer?

You should treat it as a minimum wait time, not exact. Adding random jitter (a few hundred milliseconds to a second) helps prevent all your retries from hitting Google's servers at the same moment.
