Extract Metadata From Word Document using n8n action
The PDF4me Extract Metadata From Word Document action extracts metadata and properties from Word documents inside n8n automation workflows. It covers built-in document properties, custom properties, document statistics, author information, creation dates, and revision tracking, and it formats dates and times according to a configurable culture. Typical uses are document management, compliance tracking, and content analysis workflows.
Setup
Add the PDF4me "Extract Metadata From Word Document" node to your n8n workflow and configure the required parameters. For initial setup instructions, see our n8n Integration Guide.
Prerequisites:
- PDF4me API credentials
- n8n workflow access
Configuration:
- Add PDF4me node to workflow
- Select "Extract Metadata From Word Document" action
- Configure input parameters (see below)

Parameters
Complete list of parameters for the Extract Metadata From Word Document action. Configure these parameters to control metadata extraction behavior.
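Important: Parameters marked with three asterisks (***) are required. Of the three source parameters (Input Binary Field, Base64 Word Content, Word Document URL), only the one that matches the selected Input Data Type is required. The optional Culture Name parameter controls how dates and times in the metadata are formatted.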
| Parameter | Type | Description | Example |
|---|---|---|---|
| Input Data Type*** | String | Word Document Input Format Selection • Choose the format of your Word document data input • PDF4me supports multiple input types • Options: Binary Data, Base64 String, or URL | Binary Data |
| Input Binary Field*** | Binary | Binary Word File Input (Required if Binary Data) • Reference Word file (.docx, .doc) from previous n8n node or file upload • PDF4me processes binary Word files with automatic format detection • Required when Input Data Type is "Binary Data" | {{ $binary.data }} |
| Base64 Word Content*** | String | Base64 Encoded Word Input (Required if Base64 String) • Provide Word content (.docx, .doc) as base64 encoded string for secure transmission • PDF4me automatically decodes and processes the Word content • Required when Input Data Type is "Base64 String" | UEsDBBQABgAI... |
| Word Document URL*** | String | Public Word Document URL Input (Required if URL) • Provide a publicly accessible URL to the Word file (.docx, .doc) to be processed • PDF4me downloads and processes the Word file from the provided URL • Required when Input Data Type is "URL" | https://abc.com/document.docx |
| Word Document Name*** | String | Word Document Input Filename • Specify the name of the input Word file with proper extension (.docx, .doc) • PDF4me uses this for format detection and processing optimization | document.docx |
| Culture Name | String | Document Culture/Locale • Culture code for date/time formatting (e.g., "en-US", "de-DE", "fr-FR") • Default: InvariantCulture (consistent formatting) • Affects date/time display format in metadata | en-US |
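The conditional requirements in the table can be checked before a workflow runs. The following is a minimal sketch, not part of the PDF4me node: the function names `check_parameters` and `encode_word_bytes` and the dictionary layout are assumptions made for illustration, and only the parameter names and option labels come from the table above.

```python
import base64

# Which source parameter each "Input Data Type" option requires (from the table above).
SOURCE_FIELD_BY_TYPE = {
    "Binary Data": "Input Binary Field",
    "Base64 String": "Base64 Word Content",
    "URL": "Word Document URL",
}


def check_parameters(params: dict) -> list:
    """Return the names of parameters that are missing or invalid (hypothetical helper)."""
    problems = [
        name
        for name in ("Input Data Type", "Word Document Name")
        if not params.get(name)
    ]
    input_type = params.get("Input Data Type")
    if input_type:
        source_field = SOURCE_FIELD_BY_TYPE.get(input_type)
        if source_field is None:
            # The selected option is not one of the three documented choices.
            problems.append("Input Data Type")
        elif not params.get(source_field):
            problems.append(source_field)
    return problems


def encode_word_bytes(data: bytes) -> str:
    """Base64-encode raw Word file bytes for the "Base64 Word Content" parameter."""
    return base64.b64encode(data).decode("ascii")


if __name__ == "__main__":
    params = {"Input Data Type": "URL", "Word Document Name": "document.docx"}
    print(check_parameters(params))  # the URL itself has not been supplied
```

A .docx file is a ZIP archive, so its bytes begin with the ZIP signature `PK\x03\x04`, which is why base64-encoded .docx content starts with `UEsDB` as in the example column.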
Output
Output Parameters
| Parameter | Type | Description | Example |
|---|---|---|---|
| fileName | String | Original Word document filename - The name of the input Word document file | myWordFile.docx |
| success | Boolean | PDF4me operation status - true when the metadata extraction succeeded, false when an error occurred | true |
| cultureName | String | Culture/Locale code - Culture code used for date/time formatting (e.g., "en-US", "de-DE", "fr-FR") | en-US |
| metadata | Object | Metadata wrapper object - Contains the fields document, fileName, success, and errorMessage, plus a nested metadata object that holds the document properties: Author, Title, Subject, Keywords, Comments, Category, Company, Manager, Created, LastModified, LastPrinted, RevisionNumber, TotalEditingTime, Pages, Words, Characters, Paragraphs | {"metadata": {"Author": "Ynoox Testtwo", "Company": "HP Inc.", "Words": 921}} |
| message | String | Operation message - Descriptive message indicating the result of the metadata extraction operation | Word metadata extracted successfully |
n8n Action Response
n8n can display the response of the PDF4me Extract Metadata From Word Document action in three views. The same sample response is shown in each view below:
- JSON
- Table
- Schema
JSON Response Format
The raw JSON response from the API:
    [
      {
        "fileName": "myWordFile.docx",
        "success": true,
        "cultureName": "en-US",
        "metadata": {
          "document": null,
          "fileName": null,
          "success": true,
          "errorMessage": null,
          "metadata": {
            "Author": "Ynoox Testtwo",
            "Title": "",
            "Subject": "",
            "Keywords": "",
            "Comments": "",
            "Category": "",
            "Company": "HP Inc.",
            "Manager": "",
            "Created": "5/16/2019 6:04:00 AM",
            "LastModified": "8/7/2024 7:04:00 AM",
            "LastPrinted": "5/16/2019 6:04:00 AM",
            "RevisionNumber": 5,
            "TotalEditingTime": 2,
            "Pages": 3,
            "Words": 921,
            "Characters": 5256,
            "Paragraphs": 12
          }
        },
        "message": "Word metadata extracted successfully"
      }
    ]
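Because the document properties sit two levels deep (`metadata.metadata`) and a `success` flag appears at both the top level and inside the wrapper, downstream code may want to check both before reading values. The sketch below assumes the response has the shape of the sample above; `document_properties` is a hypothetical helper name, and the sample is trimmed to three properties.

```python
import json

# Trimmed copy of the sample response shown above.
SAMPLE_RESPONSE = json.loads("""
[
  {
    "fileName": "myWordFile.docx",
    "success": true,
    "cultureName": "en-US",
    "metadata": {
      "document": null,
      "fileName": null,
      "success": true,
      "errorMessage": null,
      "metadata": {"Author": "Ynoox Testtwo", "Company": "HP Inc.", "Words": 921}
    },
    "message": "Word metadata extracted successfully"
  }
]
""")


def document_properties(response_items: list) -> dict:
    """Return the inner metadata.metadata object of the first response item."""
    item = response_items[0]
    wrapper = item.get("metadata") or {}
    # Treat the call as failed if either success flag is false or missing.
    if not item.get("success") or not wrapper.get("success"):
        reason = wrapper.get("errorMessage") or item.get("message")
        raise ValueError(reason or "Metadata extraction failed")
    return wrapper.get("metadata") or {}


if __name__ == "__main__":
    props = document_properties(SAMPLE_RESPONSE)
    print(props["Author"], props["Words"])
```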
Table View
Response data in a structured table format:
| Parameter | Value |
|---|---|
| fileName | myWordFile.docx |
| success | true |
| cultureName | en-US |
| metadata | Nested object containing document metadata |
| message | Word metadata extracted successfully |
Metadata Object Structure:
| Parameter | Value |
|---|---|
| document | null |
| fileName | null |
| success | true |
| errorMessage | null |
| metadata | Object containing document properties |
Document Properties (metadata.metadata):
| Property | Value |
|---|---|
| Author | Ynoox Testtwo |
| Title | "" |
| Subject | "" |
| Keywords | "" |
| Comments | "" |
| Category | "" |
| Company | HP Inc. |
| Manager | "" |
| Created | 5/16/2019 6:04:00 AM |
| LastModified | 8/7/2024 7:04:00 AM |
| LastPrinted | 5/16/2019 6:04:00 AM |
| RevisionNumber | 5 |
| TotalEditingTime | 2 |
| Pages | 3 |
| Words | 921 |
| Characters | 5256 |
| Paragraphs | 12 |
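The Created, LastModified, and LastPrinted values arrive as strings formatted for the selected culture, so they need parsing before any date arithmetic. This sketch assumes the en-US pattern seen in the sample (month/day/year with a 12-hour clock); other Culture Name values would need a different format string, and `parse_en_us_timestamp` is a hypothetical helper name.

```python
from datetime import datetime

# Format inferred from the en-US sample values, e.g. "5/16/2019 6:04:00 AM".
EN_US_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def parse_en_us_timestamp(value: str) -> datetime:
    """Parse a timestamp string as it appears in the en-US sample response."""
    return datetime.strptime(value, EN_US_FORMAT)


if __name__ == "__main__":
    created = parse_en_us_timestamp("5/16/2019 6:04:00 AM")
    modified = parse_en_us_timestamp("8/7/2024 7:04:00 AM")
    # Age of the document between creation and last modification.
    print(modified - created)
```

Leaving Culture Name at its default keeps the date format independent of locale, which may be simpler when the values are parsed by code rather than read by people.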
Schema View
The data structure and types of the response:
1 item
- fileName: T myWordFile.docx
- success: ☑ true
- cultureName: T en-US
- metadata: {} Object
    - document: null
    - fileName: null
    - success: ☑ true
    - errorMessage: null
    - metadata: {} Object
        - Author: T Ynoox Testtwo
        - Title: T ""
        - Subject: T ""
        - Keywords: T ""
        - Comments: T ""
        - Category: T ""
        - Company: T HP Inc.
        - Manager: T ""
        - Created: T 5/16/2019 6:04:00 AM
        - LastModified: T 8/7/2024 7:04:00 AM
        - LastPrinted: T 5/16/2019 6:04:00 AM
        - RevisionNumber: # 5
        - TotalEditingTime: # 2
        - Pages: # 3
        - Words: # 921
        - Characters: # 5256
        - Paragraphs: # 12
- message: T Word metadata extracted successfully
Type Indicators:
- T = String (Text)
- # = Number
- ☑ = Boolean
- [] = Array
- {} = Object
- null = Null value
Use Cases
Enterprise Document Automation
- Document Management: Extract metadata for document organization and cataloging
- Compliance Tracking: Track document properties for regulatory compliance
- Content Analysis: Analyze document statistics and properties
- Document Search: Use metadata for document search and retrieval
AI-Powered Document Processing
- Multi-Format Support: Process Word documents in .docx and .doc formats
- Multi-Language Processing: Support for different languages and regions
- Intelligent Metadata Extraction: Automatic extraction of all document properties
Business Intelligence and Analytics
- Document Analytics: Analyze document properties and statistics
- Compliance Monitoring: Ensure documents meet metadata requirements
- Performance Metrics: Track processing accuracy and efficiency
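As an illustration of the compliance monitoring use case, the extracted properties can be compared against a list of fields that must not be empty. This is a sketch under assumptions: the required-property policy below is invented for the example, and `missing_properties` is a hypothetical helper, not a PDF4me feature.

```python
# Hypothetical policy: properties every document is expected to carry.
REQUIRED_PROPERTIES = ("Author", "Title", "Company")


def missing_properties(props: dict, required=REQUIRED_PROPERTIES) -> list:
    """Return the required property names that are absent or blank."""
    return [
        name
        for name in required
        if not str(props.get(name) or "").strip()
    ]


if __name__ == "__main__":
    # Values taken from the sample response's metadata.metadata object.
    sample_props = {"Author": "Ynoox Testtwo", "Title": "", "Company": "HP Inc."}
    print(missing_properties(sample_props))
```

In an n8n workflow, a non-empty result could route the document to a review branch instead of the normal processing path.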