Split Document - Document Division API
PDF4me Split Document divides a Word document into multiple smaller documents. The API service splits Word files by pages, sections, headings, or custom page ranges, with the split criteria controlled through request parameters. It receives the Word document content through REST API calls, Base64-encoded so that the binary file can be carried inside the JSON request body. With support for page-based splitting, section-based division, heading-based separation, and custom page ranges, it is suited to document management, content distribution, and workflow automation.
Authenticating Your API Request
To access the PDF4me REST API, every request must include proper authentication credentials. Authentication ensures secure communication and validates your identity as an authorized user of the REST API.
Key Features
- Multiple Split Types: Split by pages, sections, headings, or custom ranges
- Page Range Support: Split based on specific page numbers or ranges
- Section-Based Splitting: Split the document after every Word section
- Heading-Based Division: Split the document after every heading of the specified style (e.g. Heading 1, Heading 2)
- Flexible Configuration: Custom split criteria with splitConfig parameter
- Unique Document Names: Each split document gets a unique GUID filename
- Format Preservation: Maintains original document formatting and structure
- Culture Support: Locale-specific document processing
REST API Endpoint
The PDF4me REST API uses standard HTTP methods to interact with resources. All document splitting operations are performed through a single endpoint:
- Method: POST
- Endpoint: office/ApiV2Word/SplitDocument
REST API Parameters
Complete list of parameters for the Split Document REST API. Parameters are organized by category for better understanding and implementation.
Important: Parameters marked with an asterisk (*) are required. The splitConfig parameter is required when splitType is "array", "numberofpages", or "headings".
Required Parameters
| Parameter | Type | Description | Example |
|---|---|---|---|
| document* | Object | Document reference. Must contain Name (string) — Word file name with .docx extension. Used for reference and processing | { "Name": "document.docx" } |
| docContent* | Base64 | Word document content encoded in Base64. Document is split based on specified criteria. Must be valid Word document (.docx, .doc formats) | base64EncodedDocumentContent |
Optional Parameters
| Parameter | Type | Description | Example |
|---|---|---|---|
| splitType | String | Type of document splitting. Options: allpages - Split into individual pages (default), array - Split by explicit page ranges (e.g. "1-3,4,5-7"), numberofpages - Split after every N pages (config: positive integer like "3" or "5"), section - Split the document after every section, headings - Split the document after every heading of the given style (e.g. "Heading 1"). Each split document gets a unique GUID filename | "allpages" |
| splitConfig | String | Configuration for splitting. For array: Page ranges like "1-3,4,5-7" (pages 1-3, page 4, pages 5-7) or single pages "1,3,5". For numberofpages: A positive integer like "3" or "5" — splits after every N pages. For headings: Heading style name like "Heading 1" or "Heading 2". Required when splitType is "array", "numberofpages", or "headings". Page numbers are 1-based. Heading matching is case-insensitive | "1-3,4,5-7" or "3" or "Heading 1" |
| cultureName | String | Culture code for document processing (e.g., "en-US", "de-DE", "fr-FR"). Default: "en-US". Affects document language and formatting. Used for metadata and localization handling | "en-US" |
Split Type Options
The API provides different split types for document division:
| Split Type | Description | Requires splitConfig | Use Case |
|---|---|---|---|
| allpages | Split into individual pages (default) | No | Extract each page as separate document |
| array | Split by explicit page ranges. Each comma-separated range becomes one document (e.g. "1-3,4,5-7" → 3 documents) | Yes | Create documents from specific page ranges or single pages |
| numberofpages | Split after every N pages. Splits sequentially: every N pages form one document; the remainder is the last document | Yes | Split into equal-sized chunks (e.g. every 3 or 5 pages) |
| section | Split the document after every Word section | No | One document per section |
| headings | Split the document after every heading of the specified style (e.g. "Heading 1") | Yes | Extract chapters or parts by heading level |
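The rule in the table above (which split types need a splitConfig) can be checked on the client before a request is sent. The following is a minimal sketch, not part of the PDF4me API: the function name `check_split_options` is hypothetical, and it assumes split type names are matched exactly as written in the table.

```python
from typing import Optional

# Split types taken from the table above; the grouping is the only logic here.
SPLIT_TYPES_REQUIRING_CONFIG = {"array", "numberofpages", "headings"}
SPLIT_TYPES = SPLIT_TYPES_REQUIRING_CONFIG | {"allpages", "section"}


def check_split_options(split_type: str = "allpages",
                        split_config: Optional[str] = None) -> None:
    """Raise ValueError if the splitType/splitConfig pair breaks the documented rule.

    Hypothetical client-side helper; assumes exact (case-sensitive) type names.
    """
    if split_type not in SPLIT_TYPES:
        raise ValueError(f"Unknown splitType: {split_type!r}")
    needs_config = split_type in SPLIT_TYPES_REQUIRING_CONFIG
    if needs_config and not (split_config and split_config.strip()):
        raise ValueError(f"splitConfig is required when splitType is {split_type!r}")


check_split_options("section")            # passes: no config needed
check_split_options("headings", "Heading 1")  # passes: config supplied
```

Checking locally avoids a round trip for requests that would presumably be rejected with an "Invalid split configuration" error.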
Split Configuration Examples
Understanding split configuration helps optimize your document splitting workflows:
array — explicit page ranges
- "1-3,4,5-7" → 3 documents: pages 1-3, page 4, pages 5-7
- "1,3,5" → 3 documents: page 1, page 3, page 5
- "1-5,6-10" → 2 documents: pages 1-5, pages 6-10
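To make the range syntax concrete, the sketch below parses an array-style config string into (first page, last page) pairs, one pair per resulting document. It only illustrates the format described above; `parse_page_ranges` is a hypothetical helper, and the server's own parsing and validation may differ.

```python
from typing import List, Tuple


def parse_page_ranges(config: str) -> List[Tuple[int, int]]:
    """Parse a config like "1-3,4,5-7" into inclusive 1-based page ranges.

    Illustrative only; not the API's actual parser.
    """
    ranges = []
    for part in config.split(","):
        start_text, dash, end_text = part.strip().partition("-")
        start = int(start_text)
        end = int(end_text) if dash else start  # a single page is a one-page range
        if start < 1 or end < start:
            raise ValueError(f"Invalid page range: {part!r}")
        ranges.append((start, end))
    return ranges


print(parse_page_ranges("1-3,4,5-7"))  # [(1, 3), (4, 4), (5, 7)]
```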
numberofpages — split after every N pages
Behavior: Splits sequentially after every N pages.
Config: Required — format is a single positive integer (e.g. "3", "5").
Examples:
- Config "3" on a 10-page document → 4 documents: Document 1: pages 1-3, Document 2: pages 4-6, Document 3: pages 7-9, Document 4: page 10
- Config "5" on a 12-page document → 3 documents: Document 1: pages 1-5, Document 2: pages 6-10, Document 3: pages 11-12
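The page arithmetic behind these examples can be reproduced with a few lines. This is a sketch of the documented behavior (every N pages form one document, the remainder forms the last one); `chunk_pages` is a hypothetical name and the code does not call the API.

```python
from typing import List, Tuple


def chunk_pages(total_pages: int, n: int) -> List[Tuple[int, int]]:
    """Return inclusive 1-based page ranges for a split after every n pages."""
    if total_pages < 1 or n < 1:
        raise ValueError("total_pages and n must be positive integers")
    return [(start, min(start + n - 1, total_pages))
            for start in range(1, total_pages + 1, n)]


print(chunk_pages(10, 3))  # [(1, 3), (4, 6), (7, 9), (10, 10)]
print(chunk_pages(12, 5))  # [(1, 5), (6, 10), (11, 12)]
```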
headings — split after every heading
- "Heading 1": one document per block that ends at the next Heading 1 (or end of document)
- "Heading 2": one document per block that ends at the next Heading 2 (or end of document)
- Heading style matching is case-insensitive
section
No config required. The document is split after every Word section break; each section becomes one document.
Output
The PDF4me Split Document REST API returns the split result as JSON. The API returns a single document in the response (the first split document), with standard response fields.
Synchronous Processing (Default)
The API processes the request and returns the first split document:
Status Code: 200 OK
Content-Type: application/json
Response Body:
{
"document": "UEsDBBQABgAIAAAAIQDfpNJsWgEAACAFAAATAAgCW0NvbnRlbnRfVHlwZXNdLnhtbCCiBAIooAAC...",
"fileName": "a1b2c3d4-e5f6-7890-abcd-ef1234567890.docx",
"success": true,
"errorMessage": null
}
Response Fields:
- document (string): The first split document content, encoded as Base64 string
- fileName (string): Generated filename for the split document (GUID-based)
- success (boolean): Indicates whether the request succeeded
- errorMessage (string or null): Error details when success is false
How to Use:
- Extract the document field from the JSON response (Base64)
- Decode the Base64 string to get the binary Word document data
- Use the fileName to save the split document
Example (JavaScript):
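const response = await fetch(url, options);
const data = await response.json();
// atob returns a binary string, so convert it to bytes before saving
const wordBytes = Uint8Array.from(atob(data.document), (c) => c.charCodeAt(0));
// Save wordBytes with data.fileName or a custom name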
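The same steps can be sketched in Python with a check of the success field before decoding. `save_split_document` is a hypothetical helper that takes the already-parsed JSON body as a dict; keeping only the final path component of fileName is a defensive assumption of this sketch, not something the API requires.

```python
import base64
from pathlib import Path


def save_split_document(data: dict, out_dir: str = ".") -> Path:
    """Decode the Base64 `document` field and write it under `out_dir`."""
    if not data.get("success"):
        # errorMessage carries the details when success is false
        raise RuntimeError(data.get("errorMessage") or "Split request failed")
    word_bytes = base64.b64decode(data["document"])
    # Use only the file name part, in case the value ever contains a path.
    target = Path(out_dir) / Path(data["fileName"]).name
    target.write_bytes(word_bytes)
    return target
```

A caller would pass the parsed response body, for example `save_split_document(response.json(), "output")`, where `response` comes from whichever HTTP client is in use.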
Asynchronous Processing
Asynchronous behavior (202 Accepted with polling) is controlled by server configuration, not by a request body parameter. When enabled, the API may return a 202 status with a polling URL in the Location header. Poll the URL with GET requests until you receive 200 OK with the same response shape (document, fileName, success, errorMessage).
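A polling loop for the 202 case might look like the sketch below. Everything beyond the documented behavior is an assumption: the name `poll_for_result`, the 2-second interval, the attempt limit, and reusing the first Location value when a later 202 omits the header. The `get` argument is any callable that performs an authenticated GET and returns an object with `status_code`, `headers`, and `json()`, which is the shape of a `requests` response.

```python
import time
from typing import Any, Callable


def poll_for_result(get: Callable[[str], Any],
                    initial_response: Any,
                    interval_seconds: float = 2.0,
                    max_attempts: int = 30,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Follow the Location header of a 202 response until a 200 arrives."""
    response = initial_response
    poll_url = None
    attempts = 0
    while response.status_code == 202:
        if attempts >= max_attempts:
            raise TimeoutError("Result not ready after polling limit")
        # Assumption: keep the last known polling URL if the header is absent.
        poll_url = response.headers.get("Location", poll_url)
        if not poll_url:
            raise RuntimeError("202 response without a Location header")
        sleep(interval_seconds)
        response = get(poll_url)
        attempts += 1
    if response.status_code != 200:
        raise RuntimeError(f"Unexpected status code: {response.status_code}")
    return response.json()  # document, fileName, success, errorMessage
```

With the `requests` library, `get` could be `lambda url: requests.get(url, headers=headers)`, where `headers` carries the Authorization header used for the original POST. If the first response is already 200, the function returns its JSON body without polling.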
Error Responses
The API returns standard HTTP error codes with error details:
- Invalid request parameters
- Missing required fields (document with Name, docContent)
- Invalid split configuration provided
- splitConfig is required when splitType is "array", "numberofpages", or "headings"
- Invalid Base64 encoding in docContent
- Invalid or corrupted Word document
- No headings found in document (for heading-based splitting)
- Invalid page range specified (page numbers exceed document page count)
- Invalid split type
- Invalid or missing API key
- API key not properly Base64 encoded in Authorization header
- Missing Authorization: Basic header
- Server-side processing error
- Word document processing failure
- Error splitting document
- Error loading document from bytes
- Error converting document to bytes
Error Response Format:
{
"error": "Error message describing what went wrong"
}
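Because failures can surface either as the error object above or as success: false with errorMessage in the standard response shape, a client may want one place that normalizes both. The sketch below is a hypothetical helper (`extract_error`) built only on those two documented shapes; falling back to the bare status code is an assumption for bodies that match neither.

```python
from typing import Optional


def extract_error(status_code: int, body: dict) -> Optional[str]:
    """Return an error message, or None when the response looks successful."""
    if "error" in body:
        # Error response format: {"error": "..."}
        return str(body["error"])
    if body.get("success") is False:
        # Standard response shape with success set to false
        return body.get("errorMessage") or f"HTTP {status_code}"
    if status_code >= 400:
        # Assumption: no recognizable body, report the status code only
        return f"HTTP {status_code}"
    return None
```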
Response Format Details
Important: The API always returns JSON containing a single Base64-encoded document (the first split document), never binary Word data directly.
Response Structure:
{
"document": "string", // Base64-encoded Word document content (first split)
"fileName": "string", // GUID-based filename (e.g., "a1b2c3d4-e5f6-7890-abcd-ef1234567890.docx")
"success": true,
"errorMessage": "string or null"
}
Content-Type Header:
- Success: application/json
- The split document is embedded as a Base64 string within the JSON response
Why Base64?
- JSON-safe encoding for binary data
- Easy to transmit over HTTP
- Compatible with all programming languages
- Can be directly embedded in JSON without escaping issues
Decoding Base64 to Word Document:
JavaScript/Node.js:
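const base64 = response.document;
// Browser: atob returns a binary string, so convert it to bytes
const bytes = Uint8Array.from(atob(base64), (c) => c.charCodeAt(0));
// OR
// Node.js: Buffer decodes Base64 directly
const buffer = Buffer.from(base64, 'base64');
// Save with response.fileName or custom name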
Python:
import base64
response_data = response.json()
word_bytes = base64.b64decode(response_data['document'])
filename = response_data['fileName']
with open(filename, 'wb') as f:
f.write(word_bytes)
C#:
byte[] wordBytes = Convert.FromBase64String(response.document);
File.WriteAllBytes(response.fileName, wordBytes);
Request Example
Header
Content-Type: application/json
Authorization: Basic YOUR_BASE64_ENCODED_API_KEY
Note:
- Get your API key from the PDF4me Dashboard
- The API key must be Base64 encoded and prefixed with "Basic " in the Authorization header
- Example: If your API key is abc123, encode it to Base64 and use Authorization: Basic YWJjMTIz
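The header construction described in the note can be sketched as follows. `build_auth_header` is a hypothetical helper name; it encodes the key exactly as the note describes (the key itself, Base64-encoded, after the "Basic " prefix).

```python
import base64


def build_auth_header(api_key: str) -> dict:
    """Return request headers with the API key Base64-encoded after "Basic "."""
    encoded = base64.b64encode(api_key.encode("utf-8")).decode("ascii")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Basic {encoded}",
    }


print(build_auth_header("abc123")["Authorization"])  # Basic YWJjMTIz
```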
Payload
Basic Example (Split All Pages - Default):
{
"document": { "Name": "document.docx" },
"docContent": "base64EncodedDocumentContent",
"splitType": "allpages"
}
Page Range Example (array):
{
"document": { "Name": "document.docx" },
"docContent": "base64EncodedDocumentContent",
"splitType": "array",
"splitConfig": "1-3,4,5-7"
}
Split After Every N Pages Example (numberofpages):
{
"document": { "Name": "document.docx" },
"docContent": "base64EncodedDocumentContent",
"splitType": "numberofpages",
"splitConfig": "3"
}
Heading-Based Example (split after every heading):
{
"document": { "Name": "document.docx" },
"docContent": "base64EncodedDocumentContent",
"splitType": "headings",
"splitConfig": "Heading 1",
"cultureName": "en-US"
}
Section-Based Example (split after every section):
{
"document": { "Name": "document.docx" },
"docContent": "base64EncodedDocumentContent",
"splitType": "section",
"cultureName": "en-US"
}
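The payloads above all share one shape, so they can be generated from a file's bytes with a small helper. This is a sketch under assumptions: `build_split_payload` is a hypothetical name, and optional fields are simply omitted when not supplied so that the server defaults apply.

```python
import base64
import json
from typing import Optional


def build_split_payload(file_name: str,
                        content: bytes,
                        split_type: str = "allpages",
                        split_config: Optional[str] = None,
                        culture_name: Optional[str] = None) -> dict:
    """Build the JSON body for office/ApiV2Word/SplitDocument."""
    payload = {
        "document": {"Name": file_name},
        # docContent is the Word file encoded as a Base64 string
        "docContent": base64.b64encode(content).decode("ascii"),
        "splitType": split_type,
    }
    if split_config is not None:
        payload["splitConfig"] = split_config
    if culture_name is not None:
        payload["cultureName"] = culture_name
    return payload


# Placeholder bytes stand in for a real .docx file read with open(path, "rb").
body = json.dumps(build_split_payload("document.docx", b"placeholder",
                                      "numberofpages", "3"))
```

The resulting `body` string is what would be sent in the POST request together with the Content-Type and Authorization headers shown under Header.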
Code Samples
The PDF4me Split Document REST API provides code samples in multiple programming languages. Choose the language that best fits your development environment:
- C#
- Java
- JavaScript
- Python
- Salesforce
- n8n
- Google Script
- AWS Lambda
Industry Use Cases & Applications
Legal & Professional Services Use Cases
- Contract Management: Split multi-party contracts into individual sections
- Case Documentation: Extract specific pages from case files
- Compliance Reports: Distribute relevant sections to different departments
- Legal Research: Split large legal documents by topic or section
Business & Enterprise Use Cases
- Report Distribution: Split reports by section for different stakeholders
- Proposal Management: Extract sections for client-specific deliverables
- Internal Documentation: Divide manuals by topic or department
- Archive Organization: Create logical document chunks for storage and retrieval
Education & Research Use Cases
- Curriculum Materials: Split textbooks into individual chapters
- Student Records: Extract specific pages for different academic departments
- Research Papers: Split large papers by section for peer review
- Medical and Research Documentation: Split research or clinical documents by methodology or section
Finance & Banking Use Cases
- Financial Reports: Distribute specific sections to different stakeholders
- Audit Documentation: Extract relevant pages for different audit areas
- Regulatory Compliance: Split compliance documents by regulation type
- Risk Assessment: Distribute risk sections to appropriate teams