AI Document Parser in n8n

What this node does

PDF4me AI Document Parser extracts structured data from a document by running a saved AI Analyzer from your PDF4me account against the file. Unlike the fixed parsers (AI-Invoice Parser, AI-Process Contract), you pick the AI Analyzer Id from a dropdown of Analyzers configured in the PDF4me dashboard, so the same node handles invoices, purchase orders, receipts, custom forms, lab reports, internal templates, or anything else you can describe in a schema. Use a Parse Analyzer for a single schema or a Classify Analyzer for many document variants in one call. The node returns dynamic JSON keyed by your schema plus a _metadata block; it does not return a binary file.

Authenticating Your API Request

Every PDF4me node in n8n requires a valid Credential to connect with. Create or select one that holds your PDF4me API key so the workflow can authenticate AI extraction requests securely. The same credential also powers the AI Analyzer Id dropdown via the GetAnalyzerId API.


Important Facts You Should Not Miss

You set up the Analyzer in the dashboard; the node only references it

Define schema, fields, and prompts once in the PDF4me AI Document Parser dashboard (Parse for one schema, Classify for many variants). The n8n node loads your Analyzers into the AI Analyzer Id dropdown and runs whichever you pick. Editing the schema later does not require touching the workflow.

Output is dynamic JSON, no binary file

Top-level keys match the field names you defined on the Analyzer (string, number, date, or table rows). A _metadata object is always appended with success, message, processingTimestamp, sourceFileName, aiAnalyzerId, and operation. This operation has no Output Binary Field Name parameter.

Parse vs Classify is decided by the Analyzer, not the node

The same node supports two Analyzer modes. A Parse Analyzer returns the fields from its single schema. A Classify Analyzer routes the document to the best-matching Schema and returns its Classification Name plus the matched Schema's extracted fields. To change behaviour, select a different AI Analyzer Id.

Figure: Full node panel, Extract → AI Document Parser with Binary Data input. The AI Analyzer Id dropdown is unset and highlighted as required; pick an AI Analyzer Id before running.

Parameters

Required in the n8n UI: Credential to connect with, Input Data Type, Document Name, AI Analyzer Id. Conditional required fields (Input Binary Field, Base64 Document Content, or Document URL) appear only when the matching Input Data Type is chosen. There is no Output Binary Field Name and no Advanced Options / Custom Profiles on this operation.

Parameter	Required	What it does	Example
Input Data Type	Yes	How the document is supplied. Binary Data reads from a previous node (default, most common). Base64 String accepts an encoded document. URL downloads the document from a public link.	`Binary Data`
Input Binary Field	Conditional	Name of the binary property on the incoming n8n item that holds the document. Required when Input Data Type is Binary Data. Defaults to data. If the name is wrong, the error reports which property was missing.	`data`
Base64 Document Content	Conditional	Base64-encoded document. Required when Input Data Type is Base64 String. Data-URL style prefixes (text before the comma) are stripped automatically.	`JVBERi0xLjQK...`
Document URL	Conditional	Publicly reachable HTTPS URL to the document. Required when Input Data Type is URL. Must be a valid full URL.	`https://example.com/document.pdf`
Document Name	Yes	Filename used by PDF4me for processing. Defaults to document.pdf. With Binary Data, the file is uploaded under its original filename, but an explicit Document Name value takes precedence in the API call when set. Use a real extension (.pdf, .png, .jpg, etc.) so PDF4me handles the file correctly.	`document.pdf`
AI Analyzer Id	Yes	The saved Analyzer that controls extraction. Dropdown populated from your PDF4me account via the GetAnalyzerId API. Sent to PDF4me as customisationNote. Pick a Parse Analyzer for a single schema or a Classify Analyzer for many document variants.	`aaaa133`

Input Data Type Options

Pick how the document enters the node. The follow-up fields change with this choice.

Figure: Input Data Type dropdown with three options: Binary Data (default; use document file from previous node), Base64 String (provide document content as base64 encoded string), and URL (provide URL to document file).

Binary Data (default): file from previous node

Reads the document directly from a prior node such as Google Drive Download, HTTP Request, Read Binary File, or an email-attachment node. Set Input Binary Field to the binary field name (default data). The uploaded filename is used unless you override Document Name.

Base64 String: encoded document inline

Paste or map a base64 string into Base64 Document Content. Data-URL style prefixes (text before the comma) are stripped automatically. Document Name with the correct extension drives format detection.
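To make the Base64 String input more concrete, here is a minimal Python sketch that encodes raw file bytes and drops a data-URL style prefix (everything up to the first comma), mirroring the behaviour described above. The function names are illustrative only; they are not part of the PDF4me node or of n8n.

```python
import base64


def to_base64(document_bytes: bytes) -> str:
    """Encode raw document bytes as a base64 string."""
    return base64.b64encode(document_bytes).decode("ascii")


def strip_data_url_prefix(value: str) -> str:
    """Drop any text before the first comma, e.g. 'data:application/pdf;base64,'.

    The base64 alphabet contains no commas, so a plain base64 string
    passes through unchanged.
    """
    if "," in value:
        return value.split(",", 1)[1]
    return value


pdf_header = b"%PDF-1.4\n"
encoded = to_base64(pdf_header)
print(encoded)  # JVBERi0xLjQK
print(strip_data_url_prefix("data:application/pdf;base64," + encoded))  # JVBERi0xLjQK
```

The node is described as stripping the prefix itself, so cleaning the value beforehand is optional; the sketch mainly shows what a valid input looks like.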

URL: download from a public link

Provide a public HTTPS URL in Document URL. PDF4me fetches the document directly. The filename for processing is taken from the URL path when possible, or from Document Name.

The AI Analyzer Id dropdown is populated from your PDF4me account via the GetAnalyzerId API when you open the field. Each option is an Analyzer you saved in the PDF4me dashboard. Pick the one whose schema matches the document type you are processing.

Figure: AI Analyzer Id dropdown loaded from your PDF4me account, showing three example Analyzers: aaaa133, TestTenantProject, and TestTenant.

Parse Analyzer

One Document Schema. The node returns exactly the fields defined in that schema. Best when every document you process shares the same layout family. See the Parse setup guide for schema rules, including fieldType: "table" for nested rows and fieldMethod (extract, the default, or generate to derive or normalise a value).

Classify Analyzer

Many Schemas, one per document variant. The AI routes the incoming document to the best-matching Schema using each Schema's Classification Prompt, then returns the matched Classification Name plus its fields. See the Classify setup guide.

Empty dropdown? The node shows "GetAnalyzerId returned no analyzer options" when your account has no Analyzers or when the credential cannot reach the API. Open the AI Document Parser dashboard and add at least one Analyzer, then verify the API key in your n8n credential.

Output Fields

A successful run returns one n8n item with JSON only, no binary file. The top-level keys match your Analyzer schema; a _metadata object is always appended.

Field	Type	What it contains
Top-level extracted fields	Dynamic	One key per field defined in your Analyzer schema (string, number, date, or nested table rows). Names match fieldName exactly.
`_metadata.success`	Boolean	true on a successful parse.
`_metadata.message`	String	"Document parsed successfully using AI Document Parser".
`_metadata.processingTimestamp`	String	ISO timestamp of the parse.
`_metadata.sourceFileName`	String	Document name used for processing.
`_metadata.aiAnalyzerId`	String	The Analyzer Id you selected (sent as customisationNote to the API).
`_metadata.operation`	String	"aiDocumentParser".
`rawContent` (fallback only)	String	If the API returns plain text instead of JSON and parsing fails, the node wraps the response as { "rawContent": "<api response string>" } at the top level.
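As an illustration of the output shape in the table above, the following sketch separates the schema-driven fields from the _metadata block. The sample item is hypothetical: the _metadata keys come from the table, while invoiceNumber, totalAmount, lineItems, and all values are made up for the example.

```python
from typing import Any


def split_output(item_json: dict[str, Any]) -> tuple[dict[str, Any], dict[str, Any]]:
    """Return (extracted_fields, metadata) from one output item's JSON."""
    metadata = item_json.get("_metadata", {})
    fields = {key: value for key, value in item_json.items() if key != "_metadata"}
    return fields, metadata


# Hypothetical output for an invoice Analyzer; values are invented.
sample = {
    "invoiceNumber": "INV-001",
    "totalAmount": 125.5,
    "lineItems": [{"description": "Widget", "quantity": 2}],
    "_metadata": {
        "success": True,
        "message": "Document parsed successfully using AI Document Parser",
        "processingTimestamp": "2024-01-01T00:00:00.000Z",
        "sourceFileName": "document.pdf",
        "aiAnalyzerId": "aaaa133",
        "operation": "aiDocumentParser",
    },
}

fields, metadata = split_output(sample)
print(sorted(fields))         # ['invoiceNumber', 'lineItems', 'totalAmount']
print(metadata["operation"])  # aiDocumentParser
```

Keeping the two parts separate makes it easier to write the extracted fields to a sheet or database while logging the metadata elsewhere.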

Quick Setup

Build your Analyzer first. Open the AI Document Parser dashboard, click + Add, type a clear Analyzer Id (any casing or separator works), pick Parse or Classify, save the row, then add a Document Schema describing the fields you want extracted. See the Parse setup guide or Classify setup guide for full schema rules, including fieldType: "table" for nested rows and fieldMethod (extract or generate).
In your n8n workflow, click + and search for PDF4me. Set Resource to Extract and Extract Operations to AI Document Parser.
In Credential to connect with, select your PDF4me credential or paste an API key.
Set Input Data Type. Binary Data (default) reads from a previous node and is most common.
Fill the matching input field (Input Binary Field, Base64 Document Content, or Document URL).
Set Document Name. Override the default document.pdf with the real filename, including the right extension.
Pick an AI Analyzer Id from the dropdown (loaded from your account).
Execute the node. The output item carries the parsed fields at the top level plus _metadata. Route into Set, Code, Google Sheets, Airtable, a database, an email, or any downstream node.

Typical Setups

Workflow Examples: common n8n workflow patterns using AI Document Parser.

Email attachment to Google Sheets

Email Trigger (IMAP) fires on a new message with a PDF attachment.
AI Document Parser runs with Input Data Type Binary Data, Document Name set to the attachment filename, AI Analyzer Id set to your invoice_parser Analyzer.
Google Sheets Append writes invoiceNumber, vendorName, totalAmount, dueDate into the AP tracker.
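The mapping step in this pattern could look roughly like the sketch below, which turns one parsed item into a row in a fixed column order. It assumes your Analyzer schema defines the four field names mentioned above; the sample values and the helper name to_sheet_row are hypothetical.

```python
from typing import Any

# Column order for the AP tracker; field names follow the example above.
COLUMNS = ["invoiceNumber", "vendorName", "totalAmount", "dueDate"]


def to_sheet_row(item_json: dict[str, Any]) -> list[Any]:
    """Build one row in COLUMNS order, using "" for fields the parse did not return."""
    return [item_json.get(column, "") for column in COLUMNS]


# Hypothetical parsed item with dueDate missing.
parsed = {
    "invoiceNumber": "INV-001",
    "vendorName": "Acme",
    "totalAmount": 125.5,
    "_metadata": {"success": True},
}
print(to_sheet_row(parsed))  # ['INV-001', 'Acme', 125.5, '']
```

Filling missing fields with an empty string keeps the columns aligned when a document lacks one of the values.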

Multi-vendor inbox with Classify

Watched folder receives mixed PDFs from Client ABC, Client XYZ, and a long tail.
AI Document Parser is set to a Classify Analyzer Id (one Schema per vendor).
A Switch node branches on the matched Classification Name returned in the response, sending each result to the matching downstream table.
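The branching logic can be sketched as a simple lookup. Note the assumptions: this page does not state which JSON key carries the Classification Name, so classificationName below is a placeholder to replace with the key you see in your own output, and the route names are hypothetical.

```python
from typing import Any

# Hypothetical mapping from Classification Name to a downstream destination.
ROUTES = {"Client ABC": "abc_table", "Client XYZ": "xyz_table"}
DEFAULT_ROUTE = "manual_review"


def pick_route(item_json: dict[str, Any], classification_key: str = "classificationName") -> str:
    """Choose a destination from the matched Classification Name.

    classification_key is an assumed key name; check your node output
    for the actual one.
    """
    name = item_json.get(classification_key)
    return ROUTES.get(name, DEFAULT_ROUTE)


print(pick_route({"classificationName": "Client ABC"}))  # abc_table
print(pick_route({"classificationName": "Unknown Co"}))  # manual_review
```

A default route is worth having so that long-tail documents land somewhere visible instead of being dropped.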

Hosted document by URL

Webhook receives a document link from your portal.
AI Document Parser runs with Input Data Type URL and your custom form Analyzer Id.
Output JSON is mapped into a Dataverse / Airtable / Postgres write to create a new record.

Scan, auto-crop, then parse

Scanner output lands in a folder with wide borders.
AI Auto Crop Document (Resource AI) trims the borders.
AI Document Parser runs on the cropped binary with the matching Analyzer Id.
Parsed JSON flows downstream as usual.

Validate then enrich

AI Document Parser returns the structured fields.
A Code node validates required fields and applies business rules.
If valid, an HTTP Request enriches with vendor data from your CRM; if not, an alert goes to Slack for manual review.
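A validation step of this kind might look like the sketch below. The required field list and the single business rule (totalAmount must be a non-negative number) are examples chosen for illustration, not rules imposed by PDF4me.

```python
from typing import Any


def validate(fields: dict[str, Any], required: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the item is valid."""
    problems = []
    for name in required:
        # Treat absent, null, empty-string, and empty-table values as missing.
        if fields.get(name) in (None, "", []):
            problems.append(f"missing {name}")

    # Example business rule: totalAmount, when present, must be a number >= 0.
    total = fields.get("totalAmount")
    if total is not None:
        is_number = isinstance(total, (int, float)) and not isinstance(total, bool)
        if not is_number or total < 0:
            problems.append("totalAmount must be a non-negative number")
    return problems


print(validate({"invoiceNumber": "INV-001", "totalAmount": 10}, ["invoiceNumber", "totalAmount"]))
# []
print(validate({"totalAmount": -5}, ["invoiceNumber"]))
# ['missing invoiceNumber', 'totalAmount must be a non-negative number']
```

Returning a list of problems rather than a boolean gives the Slack alert something specific to report.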

Practical Tips

Set up the Analyzer in PDF4me first

The node does not define extraction rules. It runs whatever your selected Analyzer specifies. Build it in the dashboard before configuring the node.

Use Document Name with the real extension

PDF4me uses the extension for format detection. Even on Binary Data, an explicit Document Name with the correct extension is safest.

No Output Binary Field Name and no Advanced Options

Unlike AI Auto Crop Document, this operation has neither. The output is JSON only; there is nothing to write back as a binary file.

Plain-text fallback wraps as rawContent

If the API returns plain text instead of JSON and parsing fails, the node wraps the response as { "rawContent": "<string>" } so the workflow does not break.
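The fallback described here can be pictured with the following sketch. It is an illustration of the documented behaviour, not the node's actual source code, and both function names are hypothetical.

```python
import json
from typing import Any


def parse_or_wrap(response_text: str) -> Any:
    """Parse the response as JSON, or wrap the raw text if parsing fails."""
    try:
        return json.loads(response_text)
    except json.JSONDecodeError:
        return {"rawContent": response_text}


def is_fallback(item_json: dict[str, Any]) -> bool:
    """True when the item looks like the plain-text fallback.

    Assumes your Analyzer schema does not itself define a field
    named rawContent.
    """
    return "rawContent" in item_json


print(parse_or_wrap('{"invoiceNumber": "INV-001"}'))  # {'invoiceNumber': 'INV-001'}
print(parse_or_wrap("Service unavailable"))           # {'rawContent': 'Service unavailable'}
```

A check like is_fallback in a downstream step lets you divert unexpected responses to a review branch instead of writing them to your data store.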

Combine with AI Auto Crop Document for scans

Scans with wide borders parse better after cropping. Chain AI Auto Crop Document (Resource AI) before this node.

Use AI-Invoice Parser / AI-Process Contract for built-in schemas

When you only need standard invoice or contract fields, the dedicated nodes are simpler. AI Document Parser shines when you have a custom schema.

Troubleshooting

"AI Analyzer Id is required"

Choose an Analyzer from the dropdown before running. The field cannot stay empty.

"No binary data found in property '…'"

Align Input Binary Field with the previous node (often data). The error reports which property was missing.

"Document content is required"

The base64 string is empty or the file is missing. Provide content in the field that matches the selected Input Data Type.

"Invalid URL format"

Check that Document URL is a full, valid URL (scheme + host + path).
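If URLs arrive from an upstream system, a quick pre-check can catch malformed values before the node runs. The sketch below uses Python's standard urllib.parse and is only an approximation; the node's own validation rules are not documented here and may differ.

```python
from urllib.parse import urlparse


def looks_like_full_url(value: str) -> bool:
    """Rough check that a value has an http(s) scheme and a host.

    This page recommends HTTPS; http is accepted here only to keep
    the check about format rather than policy.
    """
    parsed = urlparse(value.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


print(looks_like_full_url("https://example.com/document.pdf"))  # True
print(looks_like_full_url("example.com/document.pdf"))          # False
```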

"GetAnalyzerId returned no analyzer options"

No Analyzers on the account, or a credentials issue. Create an Analyzer in the PDF4me dashboard or verify the API key in the n8n credential.

Cheat Sheet

Field	Value
Resource	`Extract`
Operation	`AI Document Parser`
Input Data Type	`Binary Data`
Input Binary Field	`data`
Document Name	`form.pdf` (with the real extension)
AI Analyzer Id	(pick from the dropdown loaded from your account)
Credentials	`PDF4me API credential`

Frequently Asked Questions

Where does the AI Analyzer Id list come from?

The dropdown is populated from your PDF4me account via the GetAnalyzerId API when you open the field. Each option is an Analyzer you created in the PDF4me AI Document Parser dashboard. The node sends the selected value as customisationNote to the AiDocumentParser API.

What does "GetAnalyzerId returned no analyzer options" mean?

The account has no Analyzers configured, or the credentials cannot reach the GetAnalyzerId endpoint. Open the PDF4me AI Document Parser dashboard and create at least one Analyzer (Parse for a single schema, Classify for multiple document variants), or verify the API key in your n8n credential.

How is this different from AI-Invoice Parser, AI-Process Contract, and AI-Process Bank Cheque?

Those nodes use fixed schemas tuned to those document types. AI Document Parser is schema-driven: it runs whatever Analyzer you select, so the same node handles invoices, purchase orders, receipts, custom forms, lab reports, anything you describe in a schema. Pick AI Document Parser whenever you need a custom layout or document type.

What shape does the output have?

Top-level keys match the field names in your Analyzer schema (for example customerName, invoiceNumber, totalAmount, lineItems). Nested objects and table arrays follow the same shape you defined. A _metadata object is always appended with success, message, processingTimestamp, sourceFileName, aiAnalyzerId, and operation. No binary file is returned.

Does this work on scanned PDFs?

Yes. The AI engine handles OCR internally, so scanned PDFs and image formats work the same as native PDFs. If borders are large, chain AI Auto Crop Document (Resource AI) before this node for cleaner extraction.

Can I change the Analyzer schema without rebuilding the workflow?

Yes. The workflow only references the Analyzer Id. Edit the schema in the PDF4me dashboard and the next run picks up the new shape automatically. Output keys follow your latest schema, so any downstream Set / Code / Sheets nodes that assume specific keys may need updating after a breaking schema change.
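One way to notice a breaking schema change early is to compare the keys a downstream step expects with the keys the node actually returned. The sketch below is a hypothetical guard, with made-up field names, that you could adapt before any step that assumes specific keys.

```python
from typing import Any


def schema_drift(item_json: dict[str, Any], expected_keys: list[str]) -> dict[str, list[str]]:
    """Report expected keys that are missing and returned keys that are unexpected."""
    actual = set(item_json) - {"_metadata"}
    expected = set(expected_keys)
    return {
        "missing": sorted(expected - actual),
        "unexpected": sorted(actual - expected),
    }


# Hypothetical case: the schema field vendorName was renamed to vendor.
item = {"invoiceNumber": "INV-001", "vendor": "Acme", "_metadata": {"success": True}}
print(schema_drift(item, ["invoiceNumber", "vendorName"]))
# {'missing': ['vendorName'], 'unexpected': ['vendor']}
```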

Can I switch between Parse and Classify Analyzers without rewiring the node?

Yes. Both Analyzer types are listed in the AI Analyzer Id dropdown. Pick a Classify Analyzer when you want one node to handle several document variants; pick a Parse Analyzer for a single fixed schema. The output shape adapts to whichever you choose.

What happens when the API returns plain text instead of JSON?

The node tries to parse it; if parsing fails, the response is wrapped as { "rawContent": "<api response string>" } at the top level so the workflow does not break. Useful for inspecting unexpected responses during development.

AI Document Parser using Parse (setup guide)

Build the Parse Analyzer this node references. One schema, returns extracted fields per your shape, including table fields and fieldMethod (extract / generate).

AI Document Parser using Classify (setup guide)

Build a Classify Analyzer with many Schemas. The node returns the matched Classification Name plus its fields.

Parse Document (regex / JavaScript)

Older template-based extraction using Regex or JavaScript expressions on drawn capture areas. Use when you want pixel-perfect control over a fixed layout instead of AI.