Extract Form Data from PDF - Field Parser API
PDF4me Extract Form Data From PDF extracts form field names, values, and properties from PDF documents that contain fillable forms. The API receives the PDF content through a REST call as a Base64-encoded string and returns the extracted data as structured JSON. With support for various form field types, it is suited to form processing workflows and data extraction platforms.
Authenticating Your API Request
To access the PDF4me REST API, every request must include proper authentication credentials. Authentication ensures secure communication and validates your identity as an authorized user of the REST API.
Key Features
- Form Field Extraction: Extract all types of form fields including text fields, checkboxes, radio buttons, dropdowns, and signature fields
- Structured Data Output: Retrieve form data in structured JSON format for easy integration
- Field Properties: Extract field names, values, types, and validation properties
- Base64 Encoding: File content is transmitted as a Base64-encoded string inside the JSON request body
- Simple API Integration: RESTful API designed for automated form processing workflows
REST API Endpoint
The PDF4me REST API uses standard HTTP methods to interact with resources. All form data extraction operations are performed through a single endpoint:
- Method: POST
- Endpoint: /api/v2/ExtractPdfFormData
REST API Parameters
Complete list of parameters for the Extract Form Data From PDF REST API. Parameters are organized by category for better understanding and implementation.
Important: Parameters marked with an asterisk (*) are required and must be provided for the API to function correctly.
Required Parameters
| Parameter | Type | Description | Example |
|---|---|---|---|
| docName* | String | Name of the source PDF file, including the .pdf extension | output.pdf |
| docContent* | Base64 (String) | Content of the input PDF file, encoded in Base64 | JVBERi... |
Optional Parameters
| Parameter | Type | Description | Example |
|---|---|---|---|
| async | Boolean | Enable asynchronous processing. When true, the API returns 202 Accepted with a Location header for polling the result | true |
Output
The PDF4me Extract Form Data From PDF REST API returns different responses based on the processing mode. The API returns extracted form field data as a JSON response.
Synchronous Processing (Default)
When async is false or not provided, the API returns the extracted form data immediately.
Status Code: 200 OK
Response Format:
{
"formFields": [
{
"fieldName": "Name",
"fieldValue": "John Doe",
"fieldType": "text"
},
{
"fieldName": "Email",
"fieldValue": "[email protected]",
"fieldType": "text"
},
{
"fieldName": "Date",
"fieldValue": "2024-01-15",
"fieldType": "date"
}
]
}
The response contains an array of form fields with their names, values, and types.
Asynchronous Processing
When async is true, the API processes the document asynchronously.
Initial Response:
Status Code: 202 Accepted
Response Headers:
Location: https://api.pdf4me.com/api/v2/ExtractPdfFormDataStatus/{operationId}
Response Body:
{
"traceId": "operation-trace-id"
}
Polling for Results:
Use the Location header URL to poll for completion:
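// Poll until the status code is 200; assumes 202 is returned while processing is pending.
async function pollForResult(locationUrl, encodedApiKey, delayMs = 2000, maxAttempts = 30) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const response = await fetch(locationUrl, {
      headers: { 'Authorization': 'Basic ' + encodedApiKey }
    });
    if (response.status === 200) {
      return await response.json(); // extracted form data
    }
    if (response.status !== 202) {
      throw new Error('Polling failed with status ' + response.status);
    }
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error('Result not ready after ' + maxAttempts + ' attempts');
}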
Error Responses
| Status Code | Description | Example Response |
|---|---|---|
| 400 Bad Request | Invalid request parameters or missing required fields | {"error": "Missing required parameter: docContent"} |
| 401 Unauthorized | Invalid or missing API key | {"error": "Unauthorized"} |
| 408 Request Timeout | Request processing timeout | {"error": "Request timeout"} |
| 500 Internal Server Error | Server error during processing | {"error": "Internal server error"} |
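The status codes above can be turned into a small decision helper on the client side. The sketch below is only an illustration: `classifyStatus` is a hypothetical name, and the choice of which codes are worth retrying (408 and 500) is an assumption, not something the API documentation prescribes.

```javascript
// Hypothetical helper: map a response status code to an outcome and a retry hint.
// The retry policy is an assumption made for this sketch.
function classifyStatus(statusCode) {
  switch (statusCode) {
    case 200:
      return { outcome: 'done', retry: false };            // result is in the body
    case 202:
      return { outcome: 'pending', retry: true };          // poll the Location URL
    case 400:
    case 401:
      return { outcome: 'request-error', retry: false };   // fix parameters or API key first
    case 408:
    case 500:
      return { outcome: 'transient-error', retry: true };  // assumed safe to retry later
    default:
      return { outcome: 'unexpected', retry: false };
  }
}

console.log(classifyStatus(401).outcome); // request-error
```

Keeping the policy in one function makes it easy to adjust if your own retry rules differ.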
Understanding the JSON Response
The form data extraction response is a JSON object containing an array of form fields:
- formFields: Array of form field objects
- fieldName: The name of the form field
- fieldValue: The value entered in the form field
- fieldType: The type of form field (text, checkbox, radio, date, etc.)
Field Types:
Common form field types include:
- text: Text input fields
- checkbox: Checkbox fields (boolean values)
- radio: Radio button groups
- date: Date picker fields
- number: Numeric input fields
- dropdown: Dropdown/select fields
- signature: Digital signature fields
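Because the response is a flat array, a common first step is to index it by field name. The following is a minimal sketch, assuming the response has the `formFields` shape described above; `formFieldsToObject` is a hypothetical helper name.

```javascript
// Hypothetical helper: turn the formFields array into a { fieldName: fieldValue } lookup.
// If two fields share a name, the later one wins.
function formFieldsToObject(response) {
  const values = {};
  for (const field of response.formFields || []) {
    values[field.fieldName] = field.fieldValue;
  }
  return values;
}

const sampleResponse = {
  formFields: [
    { fieldName: 'Name', fieldValue: 'John Doe', fieldType: 'text' },
    { fieldName: 'Date', fieldValue: '2024-01-15', fieldType: 'date' },
  ],
};
console.log(formFieldsToObject(sampleResponse).Name); // John Doe
```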
Request Example
Header
Content-Type: application/json
Authorization: Basic YOUR_BASE64_ENCODED_API_KEY
Note:
- Get your API key from the PDF4me Dashboard
- The API key must be Base64 encoded and prefixed with "Basic " in the Authorization header
- Example: If your API key is abc123, encode it to Base64 and use Authorization: Basic YWJjMTIz
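As a rough illustration of the note above, the headers can be assembled like this in Node.js. `buildAuthHeaders` is a hypothetical helper, not part of any PDF4me SDK.

```javascript
// Hypothetical helper: build the two request headers shown above.
function buildAuthHeaders(apiKey) {
  // Buffer is Node.js-specific; in a browser, btoa(apiKey) is the usual equivalent.
  const encodedKey = Buffer.from(apiKey, 'utf8').toString('base64');
  return {
    'Content-Type': 'application/json',
    'Authorization': 'Basic ' + encodedKey,
  };
}

console.log(buildAuthHeaders('abc123').Authorization); // Basic YWJjMTIz
```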
Payload
Basic Request:
{
"docContent": "JVBERi0xLjQKJeLjz9MKMSAwIG9iago8PAovVHlwZSAvQ2F0YWxvZwovUGFnZXMgMiAwIFIKPj4KZW5kb2JqCjIgMCBvYmoKPDwKL1R5cGUgL1BhZ2VzCi9LaWRzIFszIDAgUl0KL0NvdW50IDEKPD4KZW5kb2JqCjMgMCBvYmoKPDwKL1R5cGUgL1BhZ2UKL1BhcmVudCAyIDAgUgovTWVkaWFCb3ggWzAgMCA2MTIgNzkyXQovUmVzb3VyY2VzIDw8Ci9Gb250IDw8Ci9GMSA0IDAgUgo+Pgo+PgovQ29udGVudHMgNSAwIFIKPj4KZW5kb2JqCjQgMCBvYmoKPDwKL1R5cGUgL0ZvbnQKL1N1YnR5cGUgL1R5cGUxCi9CYXNlRm9udCAvSGVsdmV0aWNhCj4+CmVuZG9iago1IDAgb2JqCjw8Ci9MZW5ndGggNDQKPj4Kc3RyZWFtCkJUCi9GMSAxMiBUZgoxMDAgNzAwIFRkCihIZWxsbyBXb3JsZCkgVGoKRVQKZW5kc3RyZWFtCmVuZG9iagp4cmVmCjAgNgowMDAwMDAwMDAwIDY1NTM1IGYgCjAwMDAwMDAwMDkgMDAwMDAgbiAKMDAwMDAwMDA1NCAwMDAwMCBuIAowMDAwMDAwMTAxIDAwMDAwIG4gCjAwMDAwMDAxNzAgMDAwMDAgbiAKMDAwMDAwMDI0NCAwMDAwMCBuIAp0cmFpbGVyCjw8Ci9TaXplIDYKL1Jvb3QgMSAwIFIKPj4Kc3RhcnR4cmVmCjM0MQolJUVPRg==",
"docName": "output.pdf"
}
With Asynchronous Processing:
{
"docContent": "JVBERi0xLjQKJeLjz9MK...",
"docName": "output.pdf",
"async": true
}
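The payloads above can be produced from raw file bytes with a few lines of Node.js. This is a sketch under stated assumptions: `buildExtractPayload` is a hypothetical helper, and the `.pdf` check simply mirrors the docName requirement in the parameter table.

```javascript
// Hypothetical helper: wrap raw PDF bytes in the JSON body expected by the endpoint.
function buildExtractPayload(docName, pdfBytes, useAsync = false) {
  if (!docName.toLowerCase().endsWith('.pdf')) {
    throw new Error('docName should end with .pdf');
  }
  const payload = {
    docContent: Buffer.from(pdfBytes).toString('base64'),
    docName: docName,
  };
  if (useAsync) {
    payload.async = true; // only sent when asynchronous processing is wanted
  }
  return payload;
}

// With a file on disk you would pass fs.readFileSync('output.pdf') as pdfBytes.
const demoPayload = buildExtractPayload('output.pdf', Buffer.from('%PDF-1.4\n'), true);
console.log(demoPayload.docContent); // JVBERi0xLjQK
```

The result is passed through `JSON.stringify` before being sent as the request body.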
Code Samples
The PDF4me Extract Form Data From PDF REST API provides code samples in multiple programming languages. Choose the language that best fits your development environment:
- C#
- Java
- JavaScript
- Python
- Salesforce
- n8n
- Google Script
- AWS Lambda
Google Script Sample
Google Apps Script implementation for Google Workspace integration:
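The sketch below shows one way a synchronous call might look in Google Apps Script. It is not the official PDF4me sample: the function name `extractPdfFormData` is hypothetical, and the base URL is an assumption taken from the host in the Location header example. `UrlFetchApp` and `Utilities` are built-in Apps Script services.

```javascript
// Assumption: same host as the Location header shown in the asynchronous example.
var PDF4ME_BASE_URL = 'https://api.pdf4me.com';

// Hypothetical function: send a PDF (as a byte array) and return the formFields array.
function extractPdfFormData(apiKey, docName, pdfBytes) {
  var response = UrlFetchApp.fetch(PDF4ME_BASE_URL + '/api/v2/ExtractPdfFormData', {
    method: 'post',
    contentType: 'application/json',
    headers: { Authorization: 'Basic ' + Utilities.base64Encode(apiKey) },
    payload: JSON.stringify({
      docContent: Utilities.base64Encode(pdfBytes),
      docName: docName
    }),
    muteHttpExceptions: true // read error status codes instead of throwing inside fetch
  });
  var status = response.getResponseCode();
  if (status !== 200) {
    throw new Error('Extraction failed (' + status + '): ' + response.getContentText());
  }
  return JSON.parse(response.getContentText()).formFields;
}

// Example call inside Apps Script (the file ID is a placeholder):
//   var bytes = DriveApp.getFileById('YOUR_FILE_ID').getBlob().getBytes();
//   Logger.log(extractPdfFormData('YOUR_API_KEY', 'output.pdf', bytes));
```

`UrlFetchApp.fetch` is synchronous in Apps Script, so no polling or promises are needed for the default processing mode.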
Form Data Extraction Features
Field Analysis Capabilities
- Complete Field Detection: Identify all form fields including hidden and calculated fields
- Value Extraction: Extract current values and default values from all field types
- Property Analysis: Retrieve field properties including names, types, positions, and validation rules
- Data Validation: Verify field data integrity and format compliance
Form Processing
- PDF Form Analysis: Deep analysis of PDF form structure and field relationships
- Field Type Recognition: Automatic identification of various form field types
- Batch Processing: Process multiple forms and extract data from complex form structures
- Structured Output: Generate organized, structured data for easy integration
Advanced Features
- Field Mapping: Map form fields to database columns or application fields
- Data Transformation: Convert form data to various output formats (JSON, XML, CSV)
- Validation Rules: Extract and apply form validation rules and constraints
- Professional Results: High-quality extraction suitable for enterprise applications
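Field mapping and CSV output can also be done on the client, starting from the JSON response. The sketch below assumes the `formFields` response shape documented earlier; `formFieldsToCsv` and the `columnMap` argument are hypothetical, and nothing here implies the endpoint itself returns CSV.

```javascript
// Hypothetical helper: rename fields via columnMap and emit one CSV row per field.
function formFieldsToCsv(formFields, columnMap = {}) {
  // Quote every cell and double any embedded quotes, per common CSV practice.
  const escapeCell = (value) => '"' + String(value ?? '').replace(/"/g, '""') + '"';
  const rows = formFields.map((field) =>
    [columnMap[field.fieldName] || field.fieldName, field.fieldValue, field.fieldType]
      .map(escapeCell)
      .join(',')
  );
  return ['"column","value","type"', ...rows].join('\n');
}

const csv = formFieldsToCsv(
  [{ fieldName: 'Name', fieldValue: 'John Doe', fieldType: 'text' }],
  { Name: 'full_name' } // map the PDF field "Name" to a database column "full_name"
);
console.log(csv);
```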
Industry Use Cases & Applications
- Document Management & Processing
- Business & Finance
- Legal & Compliance
- Healthcare & Medical
- Government & Public Sector
- Education & Research
Document Management & Processing Use Cases
- Form Data Collection: Extract data from filled PDF forms for database storage
- Document Digitization: Convert paper forms to digital data through PDF processing
- Data Migration: Extract form data during document migration and system updates
- Form Analysis: Analyze form completion rates and field usage patterns
Business & Finance Use Cases
- Invoice Processing: Extract data from invoice forms and purchase orders
- Application Processing: Process loan applications, insurance claims, and registration forms
- Survey Data: Extract responses from customer surveys and feedback forms
- Compliance Forms: Process regulatory compliance forms and tax documents
Legal & Compliance Use Cases
- Legal Forms: Extract data from legal documents and court forms
- Contract Processing: Process contract forms and agreement data
- Regulatory Compliance: Extract data from compliance forms and regulatory documents
- Evidence Collection: Extract form data for legal evidence and documentation
Healthcare & Medical Use Cases
- Patient Forms: Extract data from patient registration and medical history forms
- Insurance Claims: Process insurance claim forms and medical billing documents
- Clinical Trials: Extract data from research forms and clinical study documents
- Medical Records: Process medical forms and patient information documents
Government & Public Sector Use Cases
- Citizen Services: Extract data from government forms and citizen applications
- Tax Processing: Process tax forms and financial disclosure documents
- Permit Applications: Extract data from permit and license application forms
- Public Records: Process public records and official government documents
Education & Research Use Cases
- Student Forms: Extract data from enrollment forms and academic applications
- Research Data: Process research forms and survey responses
- Assessment Forms: Extract data from evaluation and assessment forms
- Administrative Forms: Process administrative forms and institutional documents