AI Document Parser using Parse

What this guide covers

The Parse mode of AI Document Parser is the dashboard setup that turns a JSON Document Schema into AI-powered structured PDF extraction. You create an Analyzer, pick the Parse type, and describe each field you want with a fieldName, fieldType, and fieldDescription. The AI reads the document semantically, so no drawn capture areas or regex are required. Once saved, the same Analyzer Id runs from the REST API, Make, Zapier, Power Automate, and n8n.

Authenticating Your Setup

AI Document Parser analyzers are created in the PDF4me developer dashboard. Sign in with your account, then create or copy an API key for the AI Parser API calls that reference the analyzer you build here.

Dashboard URLs you will use in this guide:

Analyzer list (Steps 1–2): https://developer.pdf4me.com/dashboard/#/ai-document-parser/
Analyzer detail (Steps 3–5): https://developer.pdf4me.com/ai-document-parser/?id=<analyzer-guid>

Important Facts You Should Not Miss

Parse extracts data; Classify routes documents

Pick Parse when you need structured JSON output with one value per defined field. Pick Classify when you only need a category label. The two analyzer types share the same Add modal but produce different output shapes.

fieldDescription drives accuracy

The AI relies on the fieldDescription string to find each value in the document. Be specific and include alternate names ("Sales Order Number, sometimes shown as SO No."). Vague descriptions reduce extraction quality.

Analyzer Id is the stable production reference

The string you type in the Add row (for example purchase_order_parser) is what every API call and automation module uses. Pick a clear naming convention up front; changing it later means rewiring every workflow that points to it.

Step 1: Open AI Document Parser in the dashboard

Sign in at dev.pdf4me.com.
From the dashboard sidebar, click AI Document Parser.
The list page shows every existing Analyzer with three columns: Analyzer Id, Analyzer Type (Parse or Classify), and Actions.
Click the blue + Add button to start a new Analyzer.

[Figure: AI Document Parser list view. Click + Add to create a new Analyzer.]

Step 2: Pick analyzer type Parse and name it

A new row appears with three controls:

Analyzer Id input: type any clear identifier you will remember, for example purchase order parser, Invoice Parser v1, or vendor_statement_parser. There is no naming format restriction: snake_case, camelCase, kebab-case, and plain words with spaces all work the same.
Analyzer Type dropdown: pick Parse (this guide) or Classify (see the Classify guide).
Save / Cancel buttons: Save creates the Analyzer; Cancel discards the row.

[Figure: Add row. Type an Analyzer Id, pick Parse, then Save.]

Naming tip: include the document family in the name (invoice parser, purchase_order_parser, Shipping Note Parser). The Analyzer Id is what downstream automations call; a clear name pays off every time you wire it into a new platform. Any casing or separator style works.

Step 3: Open the Analyzer and add a Schema

Click the new row to open the detail page. The URL pattern is:

https://developer.pdf4me.com/ai-document-parser/?id=<your-analyzer-guid>

The dashboard issues a GUID per Analyzer the first time you open it. Bookmark this URL to jump straight back to the same Analyzer next time.

The detail page shows:

Parse Info (left): shows the Analyzer Id you typed in Step 2. This panel is read-only.
Schemas (right): empty by default. Click the + button in the top-right to add a Schema.
Save Changes (top-left): persists any edits you make on this page.
Back: returns to the Analyzer list.

[Figure: Analyzer detail. Click the blue + button to add a Document Schema.]

Step 4: Define the Document Schema JSON

After clicking +, an empty Schema card opens. Paste a JSON object with two top-level keys:

description: a one-sentence summary of what the schema extracts. The AI uses this as overall context.
fields: an array of field definitions. Each field has:
- fieldName: machine-readable name (no spaces), used as the JSON key in the response.
- fieldType: string, number, date, or table. Drives parsing and validation.
- fieldDescription: natural-language hint the AI uses to find this field. Be specific.

[Figure: Document Schema editor with quick-fill buttons for Invoice and Purchase Order templates.]

Example: Purchase Order schema

{
  "description": "Extracting data from purchase orders and sales orders.",
  "fields": [
    {
      "fieldName": "SalesOrderNumber",
      "fieldType": "string",
      "fieldDescription": "Sales Order Number"
    },
    {
      "fieldName": "AgentName",
      "fieldType": "string",
      "fieldDescription": "Name of the agent, mostly given as Agent Name."
    },
    {
      "fieldName": "DeptCode",
      "fieldType": "string",
      "fieldDescription": "Department or cost-centre code printed on the order."
    },
    {
      "fieldName": "OrderDate",
      "fieldType": "date",
      "fieldDescription": "Date the order was placed. Accept formats like DD/MM/YYYY, MM-DD-YYYY, or written out as 5 June 2026."
    },
    {
      "fieldName": "TotalAmount",
      "fieldType": "number",
      "fieldDescription": "Grand total of the order in the document's currency, after taxes."
    }
  ]
}

Example: Invoice schema (use the Invoice quick-fill button)

{
  "description": "Extracting data from supplier invoices.",
  "fields": [
    {
      "fieldName": "InvoiceNumber",
      "fieldType": "string",
      "fieldDescription": "Unique invoice identifier, sometimes shown as Invoice No. or INV."
    },
    {
      "fieldName": "InvoiceDate",
      "fieldType": "date",
      "fieldDescription": "Date the invoice was issued."
    },
    {
      "fieldName": "DueDate",
      "fieldType": "date",
      "fieldDescription": "Date payment is due, sometimes shown as Payment Due or Net Due."
    },
    {
      "fieldName": "VendorName",
      "fieldType": "string",
      "fieldDescription": "Company name of the supplier or vendor sending the invoice."
    },
    {
      "fieldName": "TotalAmount",
      "fieldType": "number",
      "fieldDescription": "Grand total in the invoice currency, including taxes."
    }
  ]
}

Quick-fill buttons (Invoice, Purchase Order) sit at the bottom of the Schema card. Use them as a starting point only: click one to load a typical schema for that document family, then rename, trim, or extend the fields to match your real documents. The presets are scaffolding, not final schemas.

Schema with a table field (nested rows)

Use fieldType: "table" when you need to extract repeated rows such as invoice line items or purchase-order line items. Each table field carries its own nested fields array describing the columns.

{
  "description": "Invoice data extractor",
  "fields": [
    {
      "fieldName": "invoiceNumber",
      "fieldType": "string",
      "fieldDescription": "Invoice number / bill number / receipt number"
    },
    {
      "fieldName": "invoiceDate",
      "fieldType": "date",
      "fieldDescription": "Look for labels: 'Invoice Date', 'Bill Date', 'Date', 'Dated', 'Issue Date', 'Doc Date'. If 4 digit year not found then consider 2 digit year at the end of extracted date.",
      "fieldMethod": "generate"
    },
    {
      "fieldName": "lineItems",
      "fieldType": "table",
      "fieldDescription": "All product / service rows from the invoice table. Be careful, sometimes a row can be part of the next item like when description goes over one line, but it's of a single item.",
      "fields": [
        {
          "fieldName": "itemNumber",
          "fieldType": "string",
          "fieldDescription": "Product number, product id number or product code"
        },
        {
          "fieldName": "hsnCode",
          "fieldType": "string",
          "fieldDescription": "HSN / SAC code (4 to 8 digit)"
        }
      ]
    }
  ]
}
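
If you maintain several schemas, you may prefer to generate the JSON rather than hand-edit it. The sketch below builds a table schema like the one above and prints JSON you could paste into the Schema card. The helper names field and table_field are hypothetical and not part of any PDF4me library; only the attribute names come from this guide.

```python
import json

def field(name, field_type, description, method=None):
    """Return one field definition using the attribute names from this guide."""
    entry = {
        "fieldName": name,
        "fieldType": field_type,
        "fieldDescription": description,
    }
    if method is not None:
        entry["fieldMethod"] = method  # omitted entirely when not given
    return entry

def table_field(name, description, columns):
    """Return a table field whose nested fields array describes the columns."""
    entry = field(name, "table", description)
    entry["fields"] = list(columns)
    return entry

schema = {
    "description": "Invoice data extractor",
    "fields": [
        field("invoiceNumber", "string", "Invoice number / bill number / receipt number"),
        table_field(
            "lineItems",
            "All product / service rows from the invoice table.",
            [
                field("itemNumber", "string", "Product number, product id number or product code"),
                field("hsnCode", "string", "HSN / SAC code (4 to 8 digit)"),
            ],
        ),
    ],
}

schema_json = json.dumps(schema, indent=2)
print(schema_json)
```

Keeping the column definitions in one list makes it easy to reuse the same line-item columns across an invoice schema and a purchase-order schema.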

Field attributes

Attribute	Required?	What it does
`fieldName`	Required	The name of the field and how it will appear in the response JSON.
`fieldType`	Required	The type of data to extract. One of string, number, date, or table.
`fieldDescription`	Required	Natural-language description of what needs to be extracted and where to find it. Include alternate labels and example formats so the AI matches correctly.
`fieldMethod`	Optional (default extract)	How the AI fills the value. extract takes the value verbatim from the document. generate tells the AI to derive or normalise it (useful for dates, computed totals, or cleaned-up IDs). Omit for default extract behaviour.
`fields`	Required when fieldType is table	Nested array describing the columns of the table. Each entry takes the same attributes as a top-level field (fieldName, fieldType, fieldDescription, fieldMethod). Cannot itself be table.
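
The rules in this table can be checked locally before you paste a schema into the dashboard. The following is a minimal sketch of such a check. The function validate_fields is a hypothetical helper written for this guide, and the dashboard may apply different or additional validation of its own.

```python
ALLOWED_TYPES = {"string", "number", "date", "table"}
ALLOWED_METHODS = {"extract", "generate"}
REQUIRED_KEYS = ("fieldName", "fieldType", "fieldDescription")

def validate_fields(fields, nested=False, path="fields"):
    """Return a list of problems found in a fields array (empty list means OK)."""
    problems = []
    for index, entry in enumerate(fields):
        where = f"{path}[{index}]"
        # The three required attributes must be present and non-empty.
        for key in REQUIRED_KEYS:
            if not entry.get(key):
                problems.append(f"{where}: missing {key}")
        field_type = entry.get("fieldType")
        if field_type and field_type not in ALLOWED_TYPES:
            problems.append(f"{where}: unknown fieldType {field_type!r}")
        # fieldMethod is optional and defaults to extract.
        method = entry.get("fieldMethod", "extract")
        if method not in ALLOWED_METHODS:
            problems.append(f"{where}: unknown fieldMethod {method!r}")
        if field_type == "table":
            if nested:
                problems.append(f"{where}: a table column cannot itself be a table")
            elif not entry.get("fields"):
                problems.append(f"{where}: table needs a nested fields array")
            else:
                problems.extend(
                    validate_fields(entry["fields"], nested=True, path=f"{where}.fields")
                )
    return problems
```

Call validate_fields(schema["fields"]) on a loaded schema; an empty list means the schema follows the attribute rules listed above.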

fieldType reference

`fieldType`	Best for	Sample fieldDescription
`string`	Names, identifiers, codes, free text	Customer name as printed on the invoice header.
`number`	Amounts, quantities, tax rates, counts	Grand total of the order in the document currency, including taxes.
`date`	Dates, due dates, issue dates, timestamps	Date the invoice was issued, accept DD/MM/YYYY and 5 June 2026 formats.
`table`	Repeated rows (line items, addresses, transactions)	All product / service rows from the invoice table. Carries a nested fields array describing the columns.
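
This guide does not state the exact format in which date values come back, so a downstream step may want to normalise them. The sketch below assumes the value arrives as a string in one of the formats named in the sample descriptions (DD/MM/YYYY, MM-DD-YYYY, or 5 June 2026) and converts it to ISO 8601. The function to_iso_date is a hypothetical helper, and the month-name format relies on English month names in the default locale.

```python
from datetime import datetime

# Formats mentioned in the sample field descriptions in this guide.
CANDIDATE_FORMATS = ("%d/%m/%Y", "%m-%d-%Y", "%d %B %Y")

def to_iso_date(raw):
    """Return YYYY-MM-DD for a recognised date string, or None if nothing matches."""
    text = raw.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next candidate format
    return None
```

Returning None instead of raising lets the workflow flag unparseable dates for manual review rather than failing the whole document.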

Step 5: Save Changes and use the Analyzer

Click Save Changes at the top-left to persist the schema. The Analyzer is now active and can be referenced from any platform by its Analyzer Id.

Use the Analyzer in API or automation calls

Once saved, the same Analyzer runs anywhere by reference. You do not need to recreate the schema on each platform.

Field	Source	Purpose
`AnalyzerId`	The string you typed in Step 2	Stable identifier the AI Parser uses to locate your schema.
`docName`	Source PDF filename	Used for tracking and error messages.
`docContent`	Source PDF encoded as Base64	The document to extract from.
`async`	`false` for synchronous, `true` for polling	Controls response delivery.

Example REST request body:

{
  "docName": "purchase_order.pdf",
  "docContent": "BASE64_ENCODED_PDF_CONTENT",
  "AnalyzerId": "purchase_order_parser",
  "async": false
}

The response contains one field per item you defined in fields. Route that JSON into any downstream system: Google Sheets, Airtable, a database, Excel, or a webhook.
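
The request body shown above could be assembled from a PDF on disk roughly as follows. This is a sketch only: build_parse_request is a hypothetical helper, and because this guide does not give the endpoint URL or the authentication header, the example stops at producing the JSON payload.

```python
import base64
import json

def build_parse_request(pdf_bytes, doc_name, analyzer_id, run_async=False):
    """Build the four request fields listed in the table above."""
    return {
        "docName": doc_name,
        # docContent is the PDF encoded as Base64 text.
        "docContent": base64.b64encode(pdf_bytes).decode("ascii"),
        "AnalyzerId": analyzer_id,
        "async": run_async,
    }

# Placeholder bytes stand in for a real file read with open(path, "rb").read().
body = build_parse_request(
    b"%PDF-1.4 sample bytes", "purchase_order.pdf", "purchase_order_parser"
)
payload = json.dumps(body)
```

Send payload with whichever HTTP client you already use, adding the endpoint and API key from your PDF4me account.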

Common workflows

Typical AI Parser patterns: how a saved Analyzer moves from dashboard to production.

Purchase order inbox to ERP

A new purchase order PDF arrives in a monitored mailbox or upload folder.
Make, Zapier, Power Automate, or n8n calls the AI Parser with AnalyzerId: purchase_order_parser.
The returned JSON (SalesOrderNumber, AgentName, OrderDate, TotalAmount) is mapped into your ERP order-creation API.
A confirmation email goes back to the customer using the parsed order number.

Invoice inbox to accounting spreadsheet

An invoice PDF arrives via webhook, watched folder, or shared inbox.
The AI Parser is called with AnalyzerId: invoice_parser (Invoice quick-fill schema).
The structured response (InvoiceNumber, TotalAmount, DueDate) is appended as a row in Google Sheets or Excel for the accounting team.
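
The last step of this workflow can be sketched in a few lines. The example assumes the parsed response is a flat JSON object keyed by fieldName, which this guide implies but does not show in full, and it writes CSV instead of calling Google Sheets or Excel. The function name and the sample values are made up for illustration.

```python
import csv
import io

COLUMNS = ["InvoiceNumber", "TotalAmount", "DueDate"]

def append_invoice_row(parsed, out_file):
    """Write one CSV row, leaving blank cells for fields the parser did not return."""
    writer = csv.writer(out_file)
    writer.writerow([parsed.get(column, "") for column in COLUMNS])

# An in-memory buffer stands in for a file opened in append mode.
buffer = io.StringIO()
append_invoice_row(
    {"InvoiceNumber": "INV-001", "TotalAmount": 1250.5, "DueDate": "2026-06-30"},
    buffer,
)
append_invoice_row({"InvoiceNumber": "INV-002"}, buffer)
```

Fixing the column order in COLUMNS keeps the spreadsheet stable even if the schema later gains fields.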

Classify-then-Parse router

A single watched folder receives mixed documents (invoices, purchase orders, shipping notes).
A Classify Analyzer routes each file to the right category label.
Based on the label, the workflow calls the matching Parse Analyzer (invoice_parser, purchase_order_parser, shipping_note_parser) and writes the structured output to the right destination.
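
The routing step in this pattern amounts to a lookup from category label to Parse Analyzer Id. A minimal sketch follows. The label strings are hypothetical and depend on the categories you define in your own Classify Analyzer; the Analyzer Ids are the example names used in this guide.

```python
# Hypothetical Classify labels mapped to the Parse Analyzer Ids named above.
PARSE_ANALYZERS = {
    "invoice": "invoice_parser",
    "purchase_order": "purchase_order_parser",
    "shipping_note": "shipping_note_parser",
}

def analyzer_for(label):
    """Return the Parse Analyzer Id for a Classify label, or None if unmapped."""
    return PARSE_ANALYZERS.get(label.strip().lower())
```

Treat a None result as a signal to send the file to a manual-review folder instead of guessing at an Analyzer.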

Schema best practices

Use descriptive fieldDescription strings. Mention alternate names you see in real documents ("Sales Order Number, also shown as SO No., Order Ref, or PO Ref").
Pick fieldType carefully. date and number give the engine parsing hints; string is the fallback when shape is unpredictable.
Keep fieldName machine-readable (camelCase or PascalCase, no spaces). It appears as a JSON key in the response.
Start from the Invoice or Purchase Order quick-fill, then trim or add fields. The presets are good baselines.
Test against at least three real samples before pointing production traffic at the Analyzer, including edge cases like missing optional fields, documents that continue onto a second page, and OCR-derived text.
Version Analyzer Ids when making breaking schema changes (invoice_parser_v1, invoice_parser_v2) so live automations can migrate at their own pace.
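
When testing against real samples, it helps to list which schema fields came back empty for each document. The sketch below assumes the response is a flat JSON object keyed by fieldName; missing_fields is a hypothetical helper, not part of the PDF4me API.

```python
def missing_fields(schema, response):
    """Return top-level fieldNames that are absent or empty in the response."""
    expected = [entry["fieldName"] for entry in schema["fields"]]
    # A legitimate value of 0 is kept; only None, "" and [] count as empty.
    return [name for name in expected if response.get(name) in (None, "", [])]
```

Run it over each sample's response and review any field that is missing in most documents; a more specific fieldDescription is usually the first thing to try.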

Related actions

AI Document Parser using Classify

Same dashboard, multiple Schemas per Analyzer. Use when one Analyzer must route between document variants (Client ABC vs Client XYZ).

AI - Document Parser in Make

Use your saved Parse Analyzer Id from a Make scenario. Pair it with any data-source and storage modules to run AI extraction in visual workflows.

AI - Document Parser in Power Automate

Run the same Analyzer from a Power Automate flow. Connects naturally to SharePoint, Excel, Dataverse, and Outlook.

AI Document Parser in n8n

Self-hosted or cloud n8n workflows. The AI Analyzer Id dropdown loads your saved Analyzers directly from your account.

Frequently Asked Questions

What is the difference between Parse and Classify in AI Document Parser?

Parse extracts structured field values from documents using a Document Schema. The response is a JSON object with one value per field you defined. Classify routes a document into one of several categories you defined; the response is a label string. Use Parse when you need data, Classify when you need a routing decision.

How is AI Document Parser different from the older Parse Document templates?

The older Parse Document setup uses Regex Expression or JavaScript Expression applied to drawn capture areas on a sample PDF. AI Document Parser uses a JSON Document Schema with natural-language field descriptions; the AI engine reads the document semantically rather than by fixed positions. No sample PDF is required when defining an AI Analyzer.

What does fieldDescription do?

It is the natural-language hint the AI uses to find each field in the document. Be specific and include alternate names ("Sales Order Number, sometimes shown as SO No."). Descriptions are the single biggest factor in extraction accuracy.

Can I add more than one Schema to the same Analyzer?

Yes. The detail page supports multiple Schemas via the + button. Use this when one Analyzer should handle two related document families (purchase orders and sales orders) with slightly different field sets.

Does the Analyzer Id change after Save Changes?

No. The Analyzer Id is the string you typed in the Add row in Step 2. It stays stable across edits and is the value every API call and automation module uses. Pick a clear naming convention up front.

Do I need to upload a sample PDF when setting up the Analyzer?

No. Unlike the older Parse Document templates, the AI Analyzer does not need a sample for setup. You define a Document Schema in JSON; the AI engine applies it to whatever document you send at runtime.

How do I call the Analyzer from the REST API?

Send docName, docContent (the PDF as Base64), AnalyzerId (the string you typed in Step 2), and async (false for immediate response, true for polling). The response contains one JSON property per field you defined in the Document Schema.

Can the AI Parser handle scanned PDFs?

Yes, when the PDF has been OCR-processed first. Run the source file through the PDF4me OCR endpoint before sending it to the AI Parser. The schema then extracts from the OCR text layer the same way it would from a native PDF.
