Structured Data Extraction

Use AI to extract contacts, dates, amounts, scheduling, and actions from emails.

Last updated 2026-03-29

LobsterMail uses AI to extract structured data from email content and PDF attachments. This turns unstructured email text into machine-readable JSON with contacts, dates, monetary amounts, scheduling information, and action items.

Extraction Types

| Category | What's extracted | Example |
| --- | --- | --- |
| Contacts | Name, email, phone, role, organization | { name: "Jane Doe", email: "jane@acme.com", role: "Account Manager" } |
| Dates | ISO 8601 dates with labels | { value: "2025-03-15", label: "Invoice due date", isEstimate: false } |
| Amounts | Monetary values with currency | { value: 149.99, currency: "USD", label: "Monthly subscription" } |
| Scheduling | Events, meetings, appointments | { eventType: "meeting", startTime: "...", location: "Zoom", attendees: [...] } |
| Actions | Tasks, links, deadlines | { type: "verify", description: "Confirm your email", url: "https://..." } |
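
As a rough illustration of how these categories might be consumed, the sketch below reduces a sample payload to a few fields an agent could act on. The item fields come from the examples in the table; the top-level keys (`contacts`, `dates`, `amounts`, `actions`) and the helper `summarize_extraction` are assumptions made for this example, not a documented response schema.

```python
from collections import defaultdict

# Sample payload. Item fields mirror the table above; the top-level
# key names are an assumption about the response shape.
SAMPLE = {
    "contacts": [{"name": "Jane Doe", "email": "jane@acme.com", "role": "Account Manager"}],
    "dates": [{"value": "2025-03-15", "label": "Invoice due date", "isEstimate": False}],
    "amounts": [{"value": 149.99, "currency": "USD", "label": "Monthly subscription"}],
    "actions": [{"type": "verify", "description": "Confirm your email", "url": "https://..."}],
}


def summarize_extraction(data: dict) -> dict:
    """Reduce an extraction payload to a few fields an agent might act on."""
    totals = defaultdict(float)
    for amount in data.get("amounts", []):
        totals[amount["currency"]] += amount["value"]
    return {
        "contact_emails": [c["email"] for c in data.get("contacts", []) if c.get("email")],
        # Keep only dates that are not flagged as estimates.
        "firm_dates": [d["value"] for d in data.get("dates", []) if not d.get("isEstimate")],
        "totals_by_currency": dict(totals),
        "action_types": [a["type"] for a in data.get("actions", [])],
    }


print(summarize_extraction(SAMPLE))
```

Reading every category with `.get(..., [])` keeps the helper tolerant of emails where a category is absent.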

On-Demand Extraction

Trigger extraction for a specific email (Tier 1+):

# Trigger extraction
curl -X POST https://api.lobstermail.ai/v1/inboxes/{inboxId}/emails/{emailId}/extract \
  -H "Authorization: Bearer $TOKEN"

# Check result (may be pending/processing initially)
curl https://api.lobstermail.ai/v1/inboxes/{inboxId}/emails/{emailId}/extraction \
  -H "Authorization: Bearer $TOKEN"

The extraction runs asynchronously. Poll the GET endpoint until status is completed or failed.
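
The polling step could be sketched as below, using only the Python standard library. The endpoint path and the `completed`/`failed` statuses come from this page; the top-level `status` field in the JSON body, the helper names, and the timeout and backoff values are assumptions chosen for illustration.

```python
import json
import time
import urllib.request

API_BASE = "https://api.lobstermail.ai/v1"
TERMINAL_STATUSES = {"completed", "failed"}


def fetch_extraction(inbox_id: str, email_id: str, token: str) -> dict:
    """GET the extraction record and decode the JSON body."""
    url = f"{API_BASE}/inboxes/{inbox_id}/emails/{email_id}/extraction"
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def wait_for_extraction(fetch, timeout_s=60.0, interval_s=1.0,
                        max_interval_s=8.0, sleep=time.sleep) -> dict:
    """Call fetch() until the record reaches a terminal status or time runs out.

    `fetch` is any zero-argument callable returning the decoded record.
    Assumes the record exposes a top-level "status" field.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        record = fetch()
        if record.get("status") in TERMINAL_STATUSES:
            return record
        if time.monotonic() + interval_s > deadline:
            raise TimeoutError(f"extraction still {record.get('status')!r} after {timeout_s}s")
        sleep(interval_s)
        interval_s = min(interval_s * 2, max_interval_s)  # simple exponential backoff


# Usage (placeholder IDs, not real values):
# record = wait_for_extraction(lambda: fetch_extraction("INBOX_ID", "EMAIL_ID", token))
```

Passing `fetch` and `sleep` in as callables keeps the loop independent of the HTTP client, so it can be exercised without network access.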

Auto-Extraction

Enable automatic extraction on every inbound email for an inbox (Tier 2+: Builder, Pro, or Scale):

curl -X PATCH https://api.lobstermail.ai/v1/inboxes/{inboxId} \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"autoExtract": true}'

When enabled, extraction triggers automatically alongside the security scan on every inbound email. If the account falls below Tier 2, auto-extraction silently stops.

Attachment Support

PDF attachments are parsed automatically: their text is extracted and passed to the AI model alongside the email body as part of the extraction context.

Limitations:

  • Only PDF attachments are supported in V1 (images and other formats are ignored)
  • Encrypted or image-only PDFs cannot be parsed
  • Total content is truncated to 10,000 characters for the AI prompt
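
To estimate whether a given email would hit the 10,000-character limit, something like the following could be run client-side. Only the limit itself comes from this page; the body-first ordering, the blank-line separator, and the helper `build_extraction_context` are assumptions, since the server-side assembly of the prompt is not described here.

```python
PROMPT_CHAR_LIMIT = 10_000  # limit stated in the list above


def build_extraction_context(body: str, pdf_texts: list[str],
                             limit: int = PROMPT_CHAR_LIMIT) -> tuple[str, int]:
    """Join body and PDF text, then cut to `limit` characters.

    Returns (context, dropped_chars). The body-first ordering and the
    blank-line separator are assumptions for illustration only.
    """
    combined = "\n\n".join([body, *pdf_texts])
    return combined[:limit], max(0, len(combined) - limit)


context, dropped = build_extraction_context("Invoice attached.", ["x" * 12_000])
print(len(context), dropped)
```

A non-zero `dropped` value suggests that text near the end of a long attachment may never reach the model.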

Tier Requirements

| Feature | Minimum Tier |
| --- | --- |
| On-demand extraction | Tier 1 (Free Verified) |
| Auto-extraction | Tier 2 (Builder) |

Idempotency

Calling the extract endpoint multiple times for the same email returns the existing extraction record. Each email can have at most one extraction.
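
One practical consequence is that a client can retry the trigger after a network failure without risking duplicate extractions. A minimal sketch of that pattern, assuming a hypothetical `post` callable that issues the POST to the extract endpoint and returns the decoded record:

```python
import time


def trigger_with_retry(post, attempts: int = 3, backoff_s: float = 0.5,
                       sleep=time.sleep) -> dict:
    """Call post() until it succeeds, retrying on connection-level errors.

    `post` is a zero-argument callable that issues the POST .../extract
    request and returns the decoded extraction record. Retrying is safe
    only because the endpoint returns the existing record on repeat calls.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return post()
        except (ConnectionError, TimeoutError):
            if attempt == attempts:
                raise  # out of retries: surface the last error
            sleep(backoff_s * attempt)  # linear backoff between tries
```

The exception types and retry counts here are illustrative; an HTTP client may raise its own error classes (for example `urllib.error.URLError`), which would need to be added to the `except` clause.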