
Parse Document using n8n action

PDF4me Parse Document extracts structured data from PDF documents using AI-powered parsing templates in n8n automation workflows. The action accepts PDFs supplied by n8n triggers as binary data, base64 strings, or public URLs. It uses a predefined Parse ID to capture specific data fields, validate the extracted information, and return structured data in JSON, XML, or CSV format. Typical uses include invoice data extraction, form processing, document digitization, and other template-based parsing workflows that need accurate field recognition with customizable extraction rules.

Setup

Add the PDF4me "Parse Document" node to your n8n workflow and configure the required parameters. For initial setup instructions, see our n8n Integration Guide.

Prerequisites:

  • PDF4me API credentials
  • n8n workflow access
  • Parse ID configuration (see below)

Configuration:

  1. Add PDF4me node to workflow
  2. Select "Parse Document" action
  3. Configure input parameters (see below)

Parameters

Complete list of parameters for the Parse Document action. Configure these parameters to control document parsing.

Important: Parameters marked with three asterisks (***) are required and must be provided for the action to function correctly.

| Parameter | Type | Description | Example |
| --- | --- | --- | --- |
| Input Data Type*** | String | PDF Input Format Selection • Choose the format of your PDF document input • PDF4me supports multiple input types • Options: Binary Data, Base64 String, or URL | Binary Data |
| Input Binary Field | Binary | Binary PDF File Input (Required if Binary Data) • Reference PDF file from previous n8n node or file upload • PDF4me processes binary PDF files with automatic format detection • Required when Input Data Type is "Binary Data" | {{ $binary.data }} |
| Base64 PDF Content | String | Base64 Encoded PDF Input (Required if Base64 String) • Provide PDF content as base64 encoded string • PDF4me automatically decodes and processes the PDF content • Required when Input Data Type is "Base64 String" | JVBERi0xLjQ... |
| PDF URL | String | Public PDF URL Input (Required if URL) • Provide a public/open permission URL to the PDF file • PDF4me downloads and processes the file from URL • Required when Input Data Type is "URL" | https://abc.com/document.pdf |
| Document Name*** | String | Input Filename • Specify the name of the input PDF file • Used for format detection and processing optimization • Must include .pdf extension | document.pdf |
| Parse ID*** | String | Parsing Template Identifier • Unique identifier for the parsing configuration template • References predefined parsing template that defines data fields to extract • Must be a valid Parse ID from your PDF4me account | 8761-4321-4321-4321-cba |
| Output Format*** | String | Output Data Format • Choose the format for the extracted data output • Options: JSON (structured), XML (hierarchical), CSV (tabular) • JSON is recommended for most automation workflows | JSON |
| Output Binary Field Name*** | String | Binary Data Mapping • Define the variable name for accessing generated parsed data • Used in subsequent workflow actions • Essential for workflow data flow | data |
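
The conditional requirements in the table above (one input field per Input Data Type, a .pdf extension on Document Name, one of three output formats) can be checked before the node runs. The following is a minimal sketch in Python, assuming the parameters are collected in a plain dictionary keyed by the display names from the table. The helper names `pdf_to_base64` and `validate_parse_params` and the dictionary layout are illustrative assumptions, not part of the PDF4me node.

```python
import base64
from typing import List

# Input field that must be filled in for each Input Data Type (from the table).
INPUT_FIELD_BY_TYPE = {
    "Binary Data": "Input Binary Field",
    "Base64 String": "Base64 PDF Content",
    "URL": "PDF URL",
}

# Parameters marked *** in the table are always required.
ALWAYS_REQUIRED = [
    "Input Data Type",
    "Document Name",
    "Parse ID",
    "Output Format",
    "Output Binary Field Name",
]

OUTPUT_FORMATS = {"JSON", "XML", "CSV"}


def pdf_to_base64(pdf_bytes: bytes) -> str:
    """Encode raw PDF bytes as a string suitable for Base64 PDF Content."""
    return base64.b64encode(pdf_bytes).decode("ascii")


def validate_parse_params(params: dict) -> List[str]:
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    for name in ALWAYS_REQUIRED:
        if not params.get(name):
            problems.append(f"missing required parameter: {name}")

    input_type = params.get("Input Data Type")
    if input_type:
        field = INPUT_FIELD_BY_TYPE.get(input_type)
        if field is None:
            problems.append(f"unknown Input Data Type: {input_type}")
        elif not params.get(field):
            problems.append(f"{field} is required when Input Data Type is {input_type}")

    doc_name = params.get("Document Name", "")
    if doc_name and not doc_name.lower().endswith(".pdf"):
        problems.append("Document Name must include the .pdf extension")

    output_format = params.get("Output Format")
    if output_format and output_format not in OUTPUT_FORMATS:
        problems.append(f"unknown Output Format: {output_format}")
    return problems
```

A PDF file starts with the bytes `%PDF-`, so its base64 form starts with `JVBERi0`, which is why the Base64 PDF Content example in the table begins with `JVBERi0xLjQ`. This check only confirms that the parameter set is complete; whether the Parse ID actually exists is verified by PDF4me when the action runs.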

Advanced Options

The following parameters are available in the Advanced Options section and are optional:

| Parameter | Type | Description | Example |
| --- | --- | --- | --- |
| Custom Profiles | String | Custom Configuration Profiles • Set additional options using custom profiles • JSON-like format containing predefined parameters • Enables advanced parsing processing settings • Optional for specialized requirements | { "outputDataFormat": "json", "includeMetadata": true } |
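
Because Custom Profiles is passed as a string, a quoting mistake is easy to make when typing it by hand. The sketch below builds the string from a dictionary and checks that a given string parses as a JSON object. It assumes the field accepts strict JSON (the table only says "JSON-like"), and the two keys are taken from the example above; which keys PDF4me recognizes is not specified here. The function names are illustrative.

```python
import json


def build_custom_profiles(options: dict) -> str:
    """Serialize options into a JSON string for the Custom Profiles field."""
    if not isinstance(options, dict):
        raise TypeError("Custom Profiles must be built from a dictionary")
    return json.dumps(options)


def check_custom_profiles(text: str) -> dict:
    """Parse a Custom Profiles string; raise ValueError if it is not a JSON object."""
    parsed = json.loads(text)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(parsed, dict):
        raise ValueError("Custom Profiles must be a JSON object")
    return parsed


# Keys copied from the example in the table above.
profiles = build_custom_profiles({"outputDataFormat": "json", "includeMetadata": True})
```

Note that Python's `True` is written as `true` in the resulting string, matching the JSON form shown in the table.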

Parse ID Configuration

Understanding Parse IDs

Parse IDs are unique identifiers that reference predefined parsing templates in your PDF4me account. Each Parse ID contains specific configuration for:

  • Field Recognition: Which data fields to extract from documents
  • Data Validation: Rules for validating extracted data
  • Output Formatting: How to structure the extracted data
  • Error Handling: How to handle parsing errors and exceptions

Getting Parse IDs

  1. Access PDF4me Dashboard: Log into your PDF4me account
  2. Navigate to Parse Templates: Go to the parsing configuration section
  3. Create or Select Template: Choose an existing template or create a new one
  4. Copy Parse ID: Copy the unique identifier for use in your n8n workflow

Common Parse Templates

| Template Type | Use Case | Common Fields |
| --- | --- | --- |
| Invoice Parsing | Extract invoice data | Invoice number, date, amount, vendor, line items |
| Contract Parsing | Extract contract terms | Parties, dates, terms, clauses, signatures |
| Form Parsing | Extract form data | Personal information, selections, responses |
| Receipt Parsing | Extract receipt data | Date, merchant, items, total, payment method |
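
When one workflow handles several of these document types, the Parse ID can be chosen per document before the Parse Document node runs. The following is a minimal sketch, assuming each template type has its own Parse ID. The dictionary values are placeholders, not real identifiers, and the function name `parse_id_for` is hypothetical.

```python
# Placeholder values: replace each with a Parse ID copied from your PDF4me account.
PARSE_ID_BY_TEMPLATE = {
    "invoice": "<invoice-parse-id>",
    "contract": "<contract-parse-id>",
    "form": "<form-parse-id>",
    "receipt": "<receipt-parse-id>",
}


def parse_id_for(template_type: str) -> str:
    """Look up the Parse ID configured for a template type (case-insensitive)."""
    key = template_type.strip().lower()
    if key not in PARSE_ID_BY_TEMPLATE:
        raise KeyError(f"no Parse ID configured for template type: {template_type!r}")
    return PARSE_ID_BY_TEMPLATE[key]
```

Failing on an unknown type, rather than falling back to a default template, avoids parsing a document with fields defined for a different layout.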

Output

Output Parameters

| Parameter | Type | Description | Example |
| --- | --- | --- | --- |
| fileName | String | PDF4me generated filename - The complete filename of the successfully processed document with proper extension and timestamp. PDF4me ensures unique naming and validates file format compliance for seamless integration with downstream processes. | parsed_document_1756999697398.json |
| mimeType | String | PDF4me MIME type identifier - The standardized MIME type for the extracted data file, typically application/json for JSON format, application/xml for XML format, or text/csv for CSV format. This ensures proper file handling and recognition across all systems and applications. | application/json |
| fileSize | Number | PDF4me file size in bytes - The exact size of the extracted data file in bytes, provided for storage planning, bandwidth optimization, and file transfer monitoring. Essential for enterprise document management and workflow automation. | 3421 |
| success | Boolean | PDF4me parsing status indicator - Boolean flag indicating the success or failure of the document parsing process. Returns true for successful parsing and false for any errors, enabling robust error handling in automated workflows. | true |
| message | String | PDF4me parsing status message - Descriptive message indicating the result of the document parsing process. Provides clear status messages for successful parsing and detailed error information for troubleshooting purposes. | Document parsed successfully |
| docName | String | PDF4me original document name reference - The original filename of the input PDF file that was processed. This reference is maintained for audit trails, debugging purposes, and tracking the source of extracted data in enterprise workflows. | document.pdf |

n8n Action Response

The PDF4me Parse Document action returns a response containing the output parameters listed above.

JSON Response Format

The raw JSON response from the API:

{
  "fileName": "parsed_document_1756999697398.json",
  "mimeType": "application/json",
  "fileSize": 3421,
  "success": true,
  "message": "Document parsed successfully",
  "docName": "document.pdf"
}
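
A downstream step usually needs to branch on `success` before it touches the parsed file. The sketch below shows one way to do that in Python, assuming the response arrives as the JSON text shown above. The `ParseFailed` exception, the `summarize_response` function, and the shape of the returned summary are illustrative choices; only the field names and MIME types come from the Output Parameters table.

```python
import json

# MIME types listed in the Output Parameters table, mapped to output formats.
FORMAT_BY_MIME = {
    "application/json": "JSON",
    "application/xml": "XML",
    "text/csv": "CSV",
}


class ParseFailed(Exception):
    """Raised when the response reports success = false."""


def summarize_response(raw: str) -> dict:
    """Check the success flag and return the fields a later step is likely to need."""
    response = json.loads(raw)
    if not response.get("success"):
        # The message field carries the error detail for troubleshooting.
        raise ParseFailed(response.get("message", "parsing failed"))
    return {
        "source": response["docName"],
        "output_file": response["fileName"],
        "format": FORMAT_BY_MIME.get(response["mimeType"], "unknown"),
        "size_bytes": response["fileSize"],
    }
```

Raising an exception on failure keeps the error path separate from the normal path, so a failed parse cannot be mistaken for an empty result.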

Use Cases

Invoice Processing and Accounts Payable

  • Automatically extract invoice data for accounting systems
  • Process vendor invoices and payment information
  • Integrate with ERP systems for automated data entry

Contract Management and Legal Processing

  • Extract key terms and clauses from legal documents
  • Process contract renewals and compliance monitoring
  • Automate legal document review and analysis

Form Processing and Data Entry

  • Extract data from application forms and surveys
  • Process customer onboarding documents
  • Automate data entry for business processes
