Extract Metadata from Word using Zapier
PDF4me Extract Metadata pulls the title, author, subject, keywords, creation and modification dates, page count, word count, and custom properties from Word documents and returns them as structured JSON. Use it for document cataloging, compliance tracking, CMS population, or metadata-based routing, all without opening the files.
Authenticating Your API Request
To access the PDF4me Web API through Zapier, every request must include valid authentication credentials. Authentication secures the connection and confirms that you are an authorized user before your Zap can call the PDF4me Word metadata extraction service.

Configuration at a Glance
File and File Name are mapped from a previous step (Google Drive, Dropbox, or a trigger). Culture Name is optional. The action returns a Metadata object with Title, Author, Subject, Keywords, CreatedDate, ModifiedDate, PageCount, and WordCount.
- File: Map from step
- File Name: document.docx
:::tip Map File from previous steps
Use the + button next to File and File Name to map from earlier Zap steps (e.g., Google Drive, Dropbox, or a trigger). The Word file must provide full content, not "Exists but not shown" references.
:::
:::warning File: (Exists but not shown)
If you see "File: (Exists but not shown)" in the File field and get errors, select the option that provides the full file content instead. See Zapier & Power Automate Tips for details.
:::
Key Features
- Built-in Properties: Extract title, author, subject, keywords, company
- Date Information: Creation date, modification date, last print date
- Document Statistics: Page count, word count, character count
- Custom Properties: Extract user-defined custom document properties
- JSON Output: Structured JSON format for easy integration
Parameters
Complete list of parameters for the Extract Metadata action. Parameter names match the Zapier configuration UI. File and File Name are mapped from a previous step.
Important: Parameters marked with three asterisks (***) are required and must be provided for the action to function correctly.
| Parameter | Type | Description | Example |
|---|---|---|---|
| File*** | File | Word document, mapped from a previous step (Drive, Dropbox, trigger) | [4. File from Step 4] |
| File Name | String | Word file name, including the .docx or .doc extension | document.docx |
| Culture Name | String | Locale, optional (e.g., en-US) | en-US |
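When the file name comes from an upstream step, it can help to check the extension before the action runs. The following is a minimal Python sketch, assuming a code step or script sits in front of the action. `has_word_extension` is a hypothetical helper and not part of PDF4me or Zapier.

```python
from pathlib import PurePath

# Extensions named in the File Name parameter description.
WORD_EXTENSIONS = {".docx", ".doc"}


def has_word_extension(file_name: str) -> bool:
    """Return True if file_name ends in .docx or .doc (case-insensitive)."""
    return PurePath(file_name).suffix.lower() in WORD_EXTENSIONS


print(has_word_extension("document.docx"))
print(has_word_extension("report.pdf"))
```

A check like this only looks at the name; it does not confirm that the file content is a valid Word document.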
Output
The PDF4me Extract Metadata action returns a Metadata object (not a file) containing document properties. Map individual metadata fields to your next Zap steps for cataloging, compliance, or CMS integration.
Key output fields (map these to your next step)
The output is a JSON object with Title, Author, Subject, Keywords, CreatedDate, ModifiedDate, PageCount, WordCount, and custom properties. Map nested fields (e.g., Metadata.Title, Metadata.Author) to later steps.
| Field | Type | Description |
|---|---|---|
| Job Id | String | Unique identifier for the Zapier job execution |
| Metadata | Object | JSON object with document properties |
| Metadata.Title | String | Document title |
| Metadata.Author | String | Document author |
| Metadata.Subject | String | Document subject |
| Metadata.Keywords | String | Document keywords |
| Metadata.CreatedDate | String | Creation date (ISO 8601) |
| Metadata.ModifiedDate | String | Last modification date (ISO 8601) |
| Metadata.PageCount | Number | Total page count |
| Metadata.WordCount | Number | Total word count |
Example JSON output:
{
  "Job Id": "...",
  "Metadata": {
    "Title": "Q4 Financial Report",
    "Author": "John Doe",
    "Subject": "Financial Analysis",
    "Keywords": "finance, Q4, report",
    "CreatedDate": "2024-01-15T10:30:00Z",
    "ModifiedDate": "2024-01-20T15:45:00Z",
    "PageCount": 25,
    "WordCount": 5280
  }
}
Use the + button to map Metadata.Title, Metadata.Author, Metadata.ModifiedDate, or other nested fields to the next step, such as a database, CMS, Google Sheets, or filtering logic.
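If you handle the output in code instead of mapping fields in the Zap editor, the nested fields can be read as shown in this minimal Python sketch. The sample payload is copied from the example output (the Job Id is elided there as well). `flatten_metadata` and the row keys are hypothetical names chosen for illustration.

```python
import json

# Sample payload taken from the example JSON output in this section.
raw = """
{
  "Job Id": "...",
  "Metadata": {
    "Title": "Q4 Financial Report",
    "Author": "John Doe",
    "Subject": "Financial Analysis",
    "Keywords": "finance, Q4, report",
    "CreatedDate": "2024-01-15T10:30:00Z",
    "ModifiedDate": "2024-01-20T15:45:00Z",
    "PageCount": 25,
    "WordCount": 5280
  }
}
"""


def flatten_metadata(payload: dict) -> dict:
    """Build a flat row from the nested Metadata object (hypothetical helper)."""
    meta = payload.get("Metadata", {})
    return {
        "title": meta.get("Title", ""),
        "author": meta.get("Author", ""),
        "modified": meta.get("ModifiedDate", ""),
        "pages": meta.get("PageCount", 0),
        "words": meta.get("WordCount", 0),
    }


row = flatten_metadata(json.loads(raw))
print(row)
```

Using `.get` with defaults keeps the sketch from failing when a document has no value for a property; whether the action omits such fields or returns them empty is not stated here, so treat that as an assumption.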
Scenario Examples
Automated Document Catalog Indexing Workflow
Complete Scenario Steps:
- Trigger: New document uploaded to library
- Get Document: Retrieve Word file
- Extract Metadata: Get all document properties
- Parse JSON: Extract title, author, keywords
- Create Index Entry: Insert metadata into the catalog database
- Tag Document: Apply tags based on keywords
- Update Search Index: Add document to search system
- Email Cataloging Team: Send indexing confirmation
- Archive Original: Store in indexed archive
- Log Event: Record catalog entry creation
Business Benefits:
- Catalogs 500+ documents monthly automatically
- Metadata extraction enables searchability
- Automated indexing eliminates manual data entry
- Reduces cataloging time from 10 minutes to 30 seconds
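The "Create Index Entry" and "Tag Document" steps of the indexing workflow can be sketched in Python as follows. This assumes Keywords arrives as a comma-separated string, as in the example output. `keywords_to_tags`, `build_index_entry`, and the entry layout are hypothetical and would need to match your catalog database.

```python
def keywords_to_tags(keywords: str) -> list:
    """Split a comma-separated Keywords string into lowercase, de-duplicated tags."""
    tags = []
    for part in keywords.split(","):
        tag = part.strip().lower()
        if tag and tag not in tags:
            tags.append(tag)
    return tags


def build_index_entry(metadata: dict) -> dict:
    """Assemble a catalog record from the Metadata object (assumed layout)."""
    return {
        "title": metadata.get("Title", ""),
        "author": metadata.get("Author", ""),
        "tags": keywords_to_tags(metadata.get("Keywords", "")),
    }


entry = build_index_entry(
    {"Title": "Q4 Financial Report", "Author": "John Doe", "Keywords": "finance, Q4, report"}
)
print(entry)
```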
Document Compliance Audit Trail Workflow
Complete Scenario Steps:
- Trigger: Quarterly compliance audit
- List Controlled Documents: Get all from compliance folder
- Extract Metadata for Each Document: Loop through the documents
- Check Modification Dates: Verify last modified within policy
- Check Authors: Validate authorized authors
- Generate Audit Report: Create compliance report
- Flag Violations: Identify out-of-compliance documents
- Email Compliance: Send audit results
- Archive Report: Store audit evidence
- Update Compliance Log: Record audit event
Business Benefits:
- Audits 200+ documents quarterly
- Metadata tracking provides compliance evidence
- Automated audit reduces manual review time by 90%
- Complete audit trail for regulators
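The "Check Modification Dates" and "Check Authors" steps of the audit workflow might look like the Python sketch below. It assumes that "within policy" means the document was modified no more than a set number of days ago, and that ModifiedDate is an ISO 8601 string as in the example output. `audit_document` and the violation labels are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_iso(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2024-01-20T15:45:00Z."""
    # Python versions before 3.11 reject a trailing "Z", so normalize it first.
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def audit_document(metadata: dict, authorized_authors: set, max_age_days: int, now: datetime) -> list:
    """Return a list of violation labels; an empty list means the document passes."""
    violations = []
    modified = parse_iso(metadata["ModifiedDate"])
    if now - modified > timedelta(days=max_age_days):
        violations.append("stale")
    if metadata.get("Author") not in authorized_authors:
        violations.append("unauthorized_author")
    return violations


doc = {"Author": "John Doe", "ModifiedDate": "2024-01-20T15:45:00Z"}
audit_time = datetime(2024, 2, 1, tzinfo=timezone.utc)
print(audit_document(doc, {"John Doe"}, max_age_days=365, now=audit_time))
```

Passing `now` in explicitly keeps the check reproducible, which matters if the audit report is stored as evidence.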
Automated CMS Population Workflow
Complete Scenario Steps:
- Trigger: Document uploaded to staging folder
- Get Document: Retrieve Word file
- Extract Metadata: Get properties and stats
- POST to CMS API: Send metadata to content management
- Create CMS Record: Title, author, keywords indexed
- Upload to CMS Storage: Save document to CMS
- Link Metadata: Associate metadata with file
- Email Content Team: Notify of new content
- Move to Archive: Move the processed file to the archive
- Log Integration: Record CMS integration event
Business Benefits:
- Integrates 300+ documents monthly into the CMS
- Automated metadata population improves search
- Eliminates manual CMS form filling
- Reduces content onboarding from 15 to 2 minutes
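The "POST to CMS API" step of the CMS workflow can be sketched in Python as below. The endpoint URL and the payload field names are placeholders, since every CMS defines its own API. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

# Hypothetical endpoint; replace with the real API URL of your CMS.
CMS_URL = "https://cms.example.com/api/documents"


def build_cms_request(metadata: dict, file_name: str) -> urllib.request.Request:
    """Build a JSON POST request carrying the extracted metadata (assumed payload shape)."""
    payload = {
        "title": metadata.get("Title", ""),
        "author": metadata.get("Author", ""),
        "keywords": metadata.get("Keywords", ""),
        "fileName": file_name,
    }
    return urllib.request.Request(
        CMS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_cms_request(
    {"Title": "Q4 Financial Report", "Author": "John Doe", "Keywords": "finance, Q4, report"},
    "document.docx",
)
print(request.get_method(), request.full_url)
# Sending is left out on purpose; urllib.request.urlopen(request) would perform the call.
```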
Industry Use Cases & Applications
Compliance & Audit
- Document Audit Trails: Extract modification dates and authors for compliance audit evidence
- Version Control Tracking: Monitor document versions using creation and modification timestamps
- Author Verification: Validate document authors match authorized personnel lists
- Retention Policy Enforcement: Check creation dates against retention schedules for archival decisions
- Controlled Document Management: Track document properties for ISO/quality management compliance
Content Management
- CMS Integration: Populate content management systems with document titles, authors, keywords
- Search Indexing: Extract keywords and subject for full-text search engine indexing
- Document Classification: Categorize documents based on subject and keyword metadata
- Library Cataloging: Index corporate document libraries using extracted properties
- Tag Generation: Auto-generate document tags from keywords and subject fields
Legal & Professional
- Matter Document Tracking: Extract metadata for legal matter document management systems
- Discovery Document Indexing: Catalog discovery documents by author, date, subject for review
- Contract Management: Extract contract parties, dates, subjects for contract database population
- Privilege Log Generation: Use document metadata to auto-generate privilege log entries
- Document Production: Track production document metadata for litigation database systems
Publishing & Documentation
- Manuscript Tracking: Extract author, title, word count for manuscript management systems
- Publishing Workflow: Route documents based on author and modification date metadata
- Style Guide Compliance: Check document properties match publishing style guide requirements
- Content Analytics: Analyze document statistics (word count, page count) for content planning
- Multi-Author Coordination: Track document authors and modification dates in collaborative projects
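As one illustration of the document classification use case listed above, the Python sketch below matches Subject and Keywords against a small rule table. The categories, terms, and the `classify` helper are all hypothetical; a real deployment would use its own taxonomy.

```python
# Hypothetical rule table: category -> terms that indicate it.
CATEGORY_RULES = {
    "finance": {"finance", "budget", "invoice"},
    "legal": {"contract", "nda", "litigation"},
}


def classify(metadata: dict, default: str = "uncategorized") -> str:
    """Return the first category whose terms appear in Subject or Keywords."""
    text = f"{metadata.get('Subject', '')} {metadata.get('Keywords', '')}".lower()
    # Split on whitespace and strip common separators from each word.
    words = {word.strip(",.;") for word in text.split()}
    for category, terms in CATEGORY_RULES.items():
        if words & terms:
            return category
    return default


print(classify({"Subject": "Financial Analysis", "Keywords": "finance, Q4, report"}))
```

Exact word matching is deliberately simple: "Financial" does not match "finance" here, so the example is classified through its Keywords field.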