Parse Document - Template Extraction API
PDF4me Parse Document parses PDF files against a template and returns the extracted values as structured data. The API receives the PDF content and a template reference in a REST API call; the file content is Base64-encoded so that binary data can travel inside the JSON request body. With support for custom templates created in the PDF4me dashboard, it is suited to document processing and data extraction workflows.
Authenticating Your API Request
To access the PDF4me REST API, every request must include proper authentication credentials. Authentication ensures secure communication and validates your identity as an authorized user of the REST API.
Key Features
- Template-Based Parsing: Parse documents using custom templates for structured data extraction
- Template Configuration: Support for custom templates created in the PDF4me dashboard
- Structured Data Output: Extract organized, structured data from unstructured PDF documents
- Base64 Encoding: File content is sent as a Base64-encoded string inside the JSON request body
- Simple API Integration: RESTful API designed for automated document parsing workflows
REST API Endpoint
The PDF4me REST API uses standard HTTP methods to interact with resources. All document parsing operations are performed through a single endpoint:
- Method: POST
- Endpoint: /api/v2/ParseDocument
REST API Parameters
Complete list of parameters for the Parse Document REST API. Parameters are organized by category for better understanding and implementation.
Important: Parameters marked with an asterisk (*) are required and must be provided for the API to function correctly.
Required Parameters
| Parameter | Type | Description | Example |
|---|---|---|---|
| docName* | String | Name of the PDF file to be parsed | document.pdf |
| TemplateId* | String (Guid) | Unique identifier (GUID) for the parsing template. Get this from your PDF4me dashboard after creating a parse template | 12345678-1234-1234-1234-123456789abc |
| ParseId* | String (Guid) | Unique identifier (GUID) for the parsing operation. This can be generated client-side or provided by the API | 87654321-4321-4321-4321-cba987654321 |
Optional Parameters
| Parameter | Type | Description | Example |
|---|---|---|---|
| docContent | Base64 (String) | The content of the input PDF file in Base64 format. If not provided, the document will be fetched using docName | JVBERi... |
| TemplateName | String | Name of the template for parsing. Alternative to TemplateId. Use the template name from your PDF4me dashboard | invoice_template |
| async | Boolean | Enable asynchronous processing. When true, the API returns 202 Accepted with a Location header for polling the result | true |
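The parameter tables above can be turned into a small payload builder. This is a minimal sketch, not an official client: `buildParsePayload` is a hypothetical helper, and because TemplateName is described as an alternative to TemplateId, the sketch assumes that supplying either one is acceptable.

```javascript
// Hypothetical helper: assemble a Parse Document payload from the documented parameters.
function buildParsePayload({ docName, docContent, templateId, templateName, parseId, async: runAsync }) {
  // docName and ParseId are marked as required in the table
  if (!docName) throw new Error('docName is required');
  if (!parseId) throw new Error('ParseId is required');
  // Assumption: TemplateName may stand in for TemplateId
  if (!templateId && !templateName) {
    throw new Error('TemplateId or TemplateName is required');
  }

  const payload = { docName, ParseId: parseId };
  if (docContent) payload.docContent = docContent; // Base64 string of the PDF
  if (templateId) {
    payload.TemplateId = templateId;
  } else {
    payload.TemplateName = templateName;
  }
  if (runAsync) payload.async = true; // omitted means synchronous processing
  return payload;
}

const payload = buildParsePayload({
  docName: 'document.pdf',
  docContent: 'JVBERi...',
  templateId: '12345678-1234-1234-1234-123456789abc',
  parseId: '87654321-4321-4321-4321-cba987654321'
});
console.log(JSON.stringify(payload, null, 2));
```

Checking required fields before sending avoids a round trip that would end in a 400 Bad Request.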
Output
The PDF4me Parse Document REST API returns different responses based on the processing mode. The API returns parsed document data as a JSON response.
Synchronous Processing (Default)
When async is false or not provided, the API returns the parsed document data immediately.
Status Code: 200 OK
Response Format:
{
"parsedData": {
"field1": "value1",
"field2": "value2",
"field3": "value3"
},
"documentType": "invoice",
"pageCount": 1,
"confidence": 0.95
}
The response contains parsed data extracted based on the template configuration, including field values, document type, page count, and confidence score.
Asynchronous Processing
When async is true, the API processes the document asynchronously.
Initial Response:
Status Code: 202 Accepted
Response Headers:
Location: https://api.pdf4me.com/api/v2/ParseDocumentStatus/{operationId}
Response Body:
{
"traceId": "parsing-operation-trace-id"
}
Polling for Results:
Use the Location header URL to poll for completion:
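// Poll the Location URL until the parsed result is available.
// apiKey is the Base64-encoded API key; delayMs and maxAttempts are illustrative defaults.
async function pollForResult(locationUrl, apiKey, delayMs = 2000, maxAttempts = 30) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const response = await fetch(locationUrl, {
      headers: { 'Authorization': 'Basic ' + apiKey }
    });
    // Status code 200 means processing has finished
    if (response.status === 200) {
      return await response.json(); // Parsed document data
    }
    // Otherwise wait before polling again
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error('Parsing did not complete within the polling limit');
}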
Error Responses
| Status Code | Description | Example Response |
|---|---|---|
| 400 Bad Request | Invalid request parameters or missing required fields | {"error": "Missing required parameter: TemplateId"} |
| 401 Unauthorized | Invalid or missing API key | {"error": "Unauthorized"} |
| 408 Request Timeout | Request processing timeout | {"error": "Request timeout"} |
| 500 Internal Server Error | Server error during processing | {"error": "Internal server error"} |
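One way a client might act on the status codes in the table above is sketched below. `classifyParseError` is a hypothetical helper, and treating 408 and 500 as retryable is an assumption made for illustration, not documented PDF4me behavior.

```javascript
// Hypothetical helper: map a documented error status to a handling decision.
function classifyParseError(status, body) {
  // The example responses carry the message in an "error" property
  const message = (body && body.error) || 'Unknown error';
  switch (status) {
    case 400:
      return { retry: false, message: 'Fix the request: ' + message };
    case 401:
      return { retry: false, message: 'Check the API key: ' + message };
    case 408:
    case 500:
      // Assumption: timeouts and server errors may succeed on a later attempt
      return { retry: true, message: 'Temporary failure: ' + message };
    default:
      return { retry: false, message: 'Unexpected status ' + status + ': ' + message };
  }
}

console.log(classifyParseError(400, { error: 'Missing required parameter: TemplateId' }));
console.log(classifyParseError(408, { error: 'Request timeout' }));
```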
Understanding the JSON Response
The parse document response is a JSON object containing:
- parsedData: Object containing extracted field values based on the template configuration
- documentType: The identified type of document
- pageCount: Number of pages in the document
- confidence: Confidence score of the parsing operation (0.0 to 1.0)
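The fields listed above can be checked before the data is passed on. The sketch below is illustrative only: `reviewParseResult` is a hypothetical helper, and the 0.8 confidence threshold is an arbitrary value chosen for the example.

```javascript
// Hypothetical helper: verify the documented fields and flag low-confidence results.
function reviewParseResult(result, minConfidence = 0.8) {
  const expected = ['parsedData', 'documentType', 'pageCount', 'confidence'];
  const missing = expected.filter((key) => !(key in result));
  if (missing.length > 0) {
    throw new Error('Response is missing: ' + missing.join(', '));
  }
  return {
    fields: Object.keys(result.parsedData),           // names of the extracted fields
    needsReview: result.confidence < minConfidence    // true when a human should check it
  };
}

// Sample taken from the synchronous response format shown earlier
const sampleResult = {
  parsedData: { field1: 'value1', field2: 'value2', field3: 'value3' },
  documentType: 'invoice',
  pageCount: 1,
  confidence: 0.95
};
console.log(reviewParseResult(sampleResult));
```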
Template Configuration:
Before using this API, you must:
- Create a parse template in your PDF4me dashboard
- Configure capture areas and field mappings
- Test the template to ensure it works correctly
- Obtain the TemplateId (GUID) or TemplateName from your dashboard
Getting Template and Parse IDs:
- TemplateId: Found in your PDF4me dashboard after creating a parse template
- ParseId: A unique GUID that identifies the parsing operation (can be generated client-side using any GUID generator)
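As a rough illustration of handling these IDs on the client, the sketch below checks the 8-4-4-4-12 hexadecimal layout used in this page's examples and generates a new ParseId. Both helper names are hypothetical, and `newParseId` assumes a runtime that exposes the Web Crypto global (modern browsers and recent Node.js releases).

```javascript
// Matches the 8-4-4-4-12 hexadecimal layout used in the examples on this page
const GUID_LAYOUT = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Hypothetical helper: true when the value looks like a GUID
function isGuid(value) {
  return typeof value === 'string' && GUID_LAYOUT.test(value);
}

// Hypothetical helper: create a fresh ParseId client-side.
// Assumes globalThis.crypto.randomUUID is available in the runtime.
function newParseId() {
  return globalThis.crypto.randomUUID();
}

console.log(isGuid('87654321-4321-4321-4321-cba987654321')); // true
console.log(isGuid('invoice_template'));                     // false
```

A format check like this catches a TemplateName accidentally passed where a TemplateId is expected.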
Request Example
Header
Content-Type: application/json
Authorization: Basic YOUR_BASE64_ENCODED_API_KEY
Note:
- Get your API key from the PDF4me Dashboard
- The API key must be Base64 encoded and prefixed with "Basic " in the Authorization header
- Example: If your API key is abc123, encode it to Base64 and use Authorization: Basic YWJjMTIz
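The encoding step in the note above can be sketched as follows. `buildAuthHeader` is a hypothetical helper; it uses the Node.js `Buffer` global, so a browser would need `btoa` instead.

```javascript
// Hypothetical helper: Base64-encode the API key and add the "Basic " prefix.
function buildAuthHeader(apiKey) {
  const encoded = Buffer.from(apiKey, 'utf8').toString('base64');
  return 'Basic ' + encoded;
}

console.log(buildAuthHeader('abc123')); // Basic YWJjMTIz
```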
Payload
Basic Request:
{
"docContent": "JVBERi0xLjQKJeLjz9MKMSAwIG9iago8PAovVHlwZSAvQ2F0YWxvZwovUGFnZXMgMiAwIFIKPj4KZW5kb2JqCjIgMCBvYmoKPDwKL1R5cGUgL1BhZ2VzCi9LaWRzIFszIDAgUl0KL0NvdW50IDEKPD4KZW5kb2JqCjMgMCBvYmoKPDwKL1R5cGUgL1BhZ2UKL1BhcmVudCAyIDAgUgovTWVkaWFCb3ggWzAgMCA2MTIgNzkyXQovUmVzb3VyY2VzIDw8Ci9Gb250IDw8Ci9GMSA0IDAgUgo+Pgo+PgovQ29udGVudHMgNSAwIFIKPj4KZW5kb2JqCjQgMCBvYmoKPDwKL1R5cGUgL0ZvbnQKL1N1YnR5cGUgL1R5cGUxCi9CYXNlRm9udCAvSGVsdmV0aWNhCj4+CmVuZG9iago1IDAgb2JqCjw8Ci9MZW5ndGggNDQKPj4Kc3RyZWFtCkJUCi9GMSAxMiBUZgoxMDAgNzAwIFRkCihIZWxsbyBXb3JsZCkgVGoKRVQKZW5kc3RyZWFtCmVuZG9iagp4cmVmCjAgNgowMDAwMDAwMDAwIDY1NTM1IGYgCjAwMDAwMDAwMDkgMDAwMDAgbiAKMDAwMDAwMDA1NCAwMDAwMCBuIAowMDAwMDAwMTAxIDAwMDAwIG4gCjAwMDAwMDAxNzAgMDAwMDAgbiAKMDAwMDAwMDI0NCAwMDAwMCBuIAp0cmFpbGVyCjw8Ci9TaXplIDYKL1Jvb3QgMSAwIFIKPj4Kc3RhcnR4cmVmCjM0MQolJUVPRg==",
"docName": "document.pdf",
"TemplateId": "12345678-1234-1234-1234-123456789abc",
"ParseId": "87654321-4321-4321-4321-cba987654321"
}
With Template Name (Alternative to TemplateId):
{
"docContent": "JVBERi0xLjQKJeLjz9MK...",
"docName": "document.pdf",
"TemplateName": "invoice_template",
"ParseId": "87654321-4321-4321-4321-cba987654321"
}
With Asynchronous Processing:
{
"docContent": "JVBERi0xLjQKJeLjz9MK...",
"docName": "document.pdf",
"TemplateId": "12345678-1234-1234-1234-123456789abc",
"ParseId": "87654321-4321-4321-4321-cba987654321",
"async": true
}
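The headers and payloads above can be combined into a single request description for `fetch`. This is a sketch under stated assumptions: `buildParseRequest` is a hypothetical helper, and the base URL `https://api.pdf4me.com` is taken from the Location header example, on the assumption that the endpoint is served from the same host.

```javascript
// Assumption: the endpoint is served from the host shown in the Location header example
const BASE_URL = 'https://api.pdf4me.com';

// Hypothetical helper: describe the POST request without sending it.
// encodedApiKey is the API key after Base64 encoding.
function buildParseRequest(encodedApiKey, payload) {
  return {
    url: BASE_URL + '/api/v2/ParseDocument',
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': 'Basic ' + encodedApiKey
      },
      body: JSON.stringify(payload)
    }
  };
}

const request = buildParseRequest('YWJjMTIz', {
  docName: 'document.pdf',
  TemplateId: '12345678-1234-1234-1234-123456789abc',
  ParseId: '87654321-4321-4321-4321-cba987654321',
  async: true
});
console.log(request.url);

// Sending it performs a network call, so it is shown here only as a comment:
// const response = await fetch(request.url, request.options);
```

Keeping request construction separate from the network call makes the headers and body easy to inspect or log before anything is sent.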
Code Samples
Code samples for the PDF4me Parse Document REST API are available for several programming languages and platforms. Choose the one that best fits your development environment:
- C#
- Java
- JavaScript
- Python
- Salesforce
- n8n
- Google Script
- AWS Lambda
Document Parsing Features
Template Processing
- Custom Templates: Support for user-defined templates for specific document types
- Template Matching: Intelligent template recognition and matching for accurate parsing
- Flexible Configuration: Customizable parsing parameters and extraction rules
- Professional Processing: High-quality document analysis with accurate template matching
- Advanced Recognition: Support for complex document structures and layouts
Document Analysis
- Content Recognition: Advanced content analysis and data identification
- Structure Analysis: Intelligent document structure recognition and parsing
- Data Extraction: Precise extraction of structured data from unstructured documents
- Format Support: Support for various PDF document formats and layouts
- Professional Results: Reliable document parsing with accurate data extraction
Advanced Features
- Intelligent Parsing: AI-powered document analysis and content recognition
- Custom Extraction: Flexible data extraction based on template configurations
- Professional Analysis: High-quality document parsing with clear data structure
- Flexible Templates: Support for any document type and parsing requirements
Industry Use Cases & Applications
Finance & Banking Use Cases
- Invoice Processing: Parse invoices and extract structured data for accounting systems
- Financial Reports: Extract structured data from financial reports and documents
- Compliance Monitoring: Parse compliance documents and extract regulatory information
- Financial Analysis: Extract insights and data from financial documents
Legal & Professional Services Use Cases
- Contract Analysis: Extract key terms and data from legal contracts and agreements
- Legal Document Processing: Parse legal documents and extract structured data
- Compliance Monitoring: Parse compliance documents and extract regulatory information
- Legal Intelligence: Extract insights and data from legal documents
Business & Enterprise Use Cases
- Form Processing: Parse forms and extract field data for automated processing
- Report Analysis: Extract structured data from business reports and documents
- Document Intelligence: Extract insights and data from various document types
- Business Process Automation: Automate document processing workflows with intelligent parsing
Government & Compliance Use Cases
- Compliance Monitoring: Parse compliance documents and extract regulatory information
- Regulatory Documentation: Extract structured data from regulatory documents
- Government Reports: Parse government reports and extract data
- Public Records: Extract data from public records and documents
Technology & Development Use Cases
- Data Migration: Convert unstructured documents to structured data for migration
- Document Intelligence: Extract insights and data from various document types
- Data Processing: Extract structured data for system integration
- API Integration: Parse documents and extract data for API processing