Extract Attachment From PDF using n8n action

The PDF4me Extract Attachment From PDF action extracts files embedded in PDF documents from within n8n automation workflows. It accepts PDFs as binary data, base64 strings, or public URLs, then detects and extracts the attached documents, images, spreadsheets, data files, and other embedded content (PDF, DOC, XLS, images, etc.) with file integrity and metadata preserved. Typical uses include document analysis, file recovery, evidence extraction, attachment management, and other document processing workflows that depend on reliable embedded file extraction.

Setup

Add the PDF4me "Extract Attachment From PDF" node to your n8n workflow and configure the required parameters. For initial setup instructions, see our n8n Integration Guide.

Prerequisites:

  • PDF4me API credentials
  • n8n workflow access

Configuration:

  1. Add PDF4me node to workflow
  2. Select "Extract Attachment From PDF" action
  3. Configure input parameters (see below)

Parameters

Complete list of parameters for the Extract Attachment From PDF action. Configure these parameters to control attachment extraction.

Important: Parameters marked with three asterisks (***) are required and must be provided for the action to function correctly.

Parameter | Type | Description | Example

Input Data Type*** | String | PDF Input Format Selection | Binary Data
  • Choose the format of your PDF data input
  • PDF4me supports multiple input types
  • Options: Binary Data, Base64 String, or URL

Input Binary Field | Binary | Binary PDF File Input (Required if Binary Data) | {{ $binary.data }}
  • Reference the PDF file from a previous n8n node or file upload
  • PDF4me processes binary PDF files with automatic format detection
  • Required when Input Data Type is "Binary Data"

Base64 Document Content | String | Base64 Encoded PDF Input (Required if Base64 String) | JVBERi0xLjQK...
  • Provide the PDF data as a base64 encoded string
  • PDF4me automatically decodes and processes the PDF content
  • Required when Input Data Type is "Base64 String"

File URL | String | Public PDF URL Input (Required if URL) | https://abc.com/sample.pdf
  • Provide a publicly accessible URL to the PDF file
  • PDF4me downloads and processes the file from the URL
  • Required when Input Data Type is "URL"

Document Name*** | String | Source PDF Reference | document.pdf
  • Specify the name of the source PDF file
  • Used for reference and tracking during extraction
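
The conditional requirements in the table above can be summarized as a small lookup. The following is a minimal Python sketch, not PDF4me's or n8n's own code; the names `required_field` and `to_base64_pdf` are hypothetical helpers introduced only for illustration. It maps each Input Data Type option to the field that becomes required, and prepares a value for Base64 Document Content from raw PDF bytes.

```python
import base64

# Mapping taken from the parameter table: each Input Data Type option
# makes exactly one of the input fields required.
REQUIRED_FIELD = {
    "Binary Data": "Input Binary Field",
    "Base64 String": "Base64 Document Content",
    "URL": "File URL",
}


def required_field(input_data_type: str) -> str:
    """Return the name of the field that must be filled for an input type."""
    try:
        return REQUIRED_FIELD[input_data_type]
    except KeyError:
        raise ValueError(f"Unsupported Input Data Type: {input_data_type!r}") from None


def to_base64_pdf(pdf_bytes: bytes) -> str:
    """Base64-encode PDF bytes after a basic check of the file header."""
    # Every PDF file starts with the marker "%PDF-"; this check is a
    # hypothetical safeguard, not something the action is known to require.
    if not pdf_bytes.startswith(b"%PDF-"):
        raise ValueError("Data does not start with the PDF header '%PDF-'")
    return base64.b64encode(pdf_bytes).decode("ascii")
```

Because every PDF begins with `%PDF-`, a correctly encoded value always begins with `JVBERi0`, which gives a quick way to spot a wrong file type before sending it.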

Advanced Options

The following parameters are available in the Advanced Options section and are optional:

Parameter | Type | Description | Example

Custom Profiles | Object | Extraction Configuration | {"outputDataFormat": "json", "extractMetadata": true}
  • Advanced configuration object for customizing extraction behavior
  • Supports output formats and processing options
  • JSON format for flexible parameter specification
  • Optional for specialized requirements

Supported Attachment Types

Category | File Types | Description
Documents | PDF, DOC, DOCX, TXT, RTF, ODT | Text documents and office files
Images | JPG, JPEG, PNG, GIF, BMP, TIFF, SVG | Image files and graphics
Spreadsheets | XLS, XLSX, CSV, ODS | Data files and spreadsheets
Data | JSON, XML, HTML, CSS | Web and data files
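
When extracted files need to be sorted downstream, the table above can be turned into a lookup by file extension. This is a minimal sketch under that assumption; the function `categorize` and the fallback label "Other" are hypothetical and are not part of the PDF4me response.

```python
from pathlib import PurePath

# Extension lists copied from the Supported Attachment Types table.
CATEGORIES = {
    "Documents": {"pdf", "doc", "docx", "txt", "rtf", "odt"},
    "Images": {"jpg", "jpeg", "png", "gif", "bmp", "tiff", "svg"},
    "Spreadsheets": {"xls", "xlsx", "csv", "ods"},
    "Data": {"json", "xml", "html", "css"},
}


def categorize(file_name: str) -> str:
    """Return the table category for a file name, or "Other" if unlisted."""
    # Compare extensions case-insensitively and without the leading dot.
    ext = PurePath(file_name).suffix.lstrip(".").lower()
    for category, extensions in CATEGORIES.items():
        if ext in extensions:
            return category
    return "Other"
```

For example, `categorize("sample.txt")` returns `"Documents"`, and a file with an unlisted or missing extension falls through to `"Other"`.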

Output

Output Parameters

Parameter | Type | Description | Example
documents | Array | PDF4me input document reference: reference to the original input documents processed for extraction. Null for this operation type. | null
outputDocuments | Array | PDF4me extracted attachments collection: array of successfully extracted attachment objects containing file data, metadata, and properties from the source PDF document. | [{"fileName": "sample.txt", "barcodeText": null, "docText": null, "image": null}]
traceId | String | PDF4me processing trace identifier: unique trace ID for tracking the extraction request through PDF4me's processing pipeline for debugging and audit purposes. | null
jobId | String | PDF4me job identifier: unique job ID for the extraction operation. Null for synchronous operations, populated for asynchronous processing. | null
statusUrl | String | PDF4me status tracking URL: URL for checking the status of asynchronous extraction operations. Null for synchronous operations. | null
subscriptionUsage | Object | PDF4me subscription usage data: information about API usage and subscription consumption for the extraction operation. | null
_metadata | Object | PDF4me operation metadata: metadata object containing processing information, timestamps, and operation details. | {"success": true, "message": "Attachments extracted successfully"}
_metadata.success | Boolean | PDF4me extraction status indicator: true for successful extractions and false for any errors. | true
_metadata.message | String | PDF4me extraction status message: human-readable message describing the result, either a success confirmation or error details for troubleshooting. | Attachments extracted successfully
_metadata.processingTimestamp | String | PDF4me processing timestamp: ISO 8601 timestamp indicating when the extraction operation was completed. | 2025-09-23T20:04:09.290Z
_metadata.sourceFileName | String | PDF4me source document filename: original filename of the PDF document that was processed for attachment extraction. | document.pdf
_metadata.operation | String | PDF4me operation type: always "extractAttachmentFromPdf" for this action. | extractAttachmentFromPdf

Extracted Attachment Object Structure

Field | Type | Description | Example
fileName | String | PDF4me extracted file name: original filename of the extracted attachment as it appears in the source PDF document. | sample.txt
barcodeText | String | PDF4me barcode content: text extracted from any barcodes found within the attachment. Null if no barcodes are present. | null
docText | String | PDF4me document text content: text extracted from the attachment file. Null if the attachment is not a text-based document. | null
image | String | PDF4me image data: base64 encoded image data if the attachment is an image file. Null for non-image attachments. | null
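
The four fields above can be modeled as a small typed structure for use in custom code. This is a sketch only: the class name `Attachment` and its methods are hypothetical, while the field names and the base64 encoding of `image` follow the table.

```python
import base64
from dataclasses import dataclass
from typing import Optional


@dataclass
class Attachment:
    """One entry of outputDocuments, with snake_case attribute names."""

    file_name: str
    barcode_text: Optional[str] = None
    doc_text: Optional[str] = None
    image: Optional[str] = None  # base64 string, per the table above

    @classmethod
    def from_dict(cls, item: dict) -> "Attachment":
        """Build an Attachment from one outputDocuments object."""
        return cls(
            file_name=item["fileName"],
            barcode_text=item.get("barcodeText"),
            doc_text=item.get("docText"),
            image=item.get("image"),
        )

    def image_bytes(self) -> Optional[bytes]:
        """Decode the image field, or return None for non-image attachments."""
        if self.image is None:
            return None
        return base64.b64decode(self.image)
```

Treating every field except `fileName` as optional mirrors the documented behavior that `barcodeText`, `docText`, and `image` are null when they do not apply.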

n8n Action Response

The PDF4me Extract Attachment From PDF action returns a JSON response in n8n.

JSON Response Format

The raw JSON response from the API:

[
  {
    "documents": null,
    "outputDocuments": [
      {
        "fileName": "sample.txt",
        "barcodeText": null,
        "docText": null,
        "image": null
      }
    ],
    "traceId": null,
    "jobId": null,
    "statusUrl": null,
    "subscriptionUsage": null,
    "_metadata": {
      "success": true,
      "message": "Attachments extracted successfully",
      "processingTimestamp": "2025-09-23T20:04:09.290Z",
      "sourceFileName": "document.pdf",
      "operation": "extractAttachmentFromPdf"
    }
  }
]
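
A response of this shape can be checked and unpacked in a few lines. The sketch below is one possible approach, not an official client: the function `extracted_file_names` is hypothetical, and the sample string simply repeats the response shown above.

```python
import json
from typing import List

# The sample response from this page, embedded for a self-contained example.
SAMPLE_RESPONSE = """
[
  {
    "documents": null,
    "outputDocuments": [
      {"fileName": "sample.txt", "barcodeText": null, "docText": null, "image": null}
    ],
    "traceId": null,
    "jobId": null,
    "statusUrl": null,
    "subscriptionUsage": null,
    "_metadata": {
      "success": true,
      "message": "Attachments extracted successfully",
      "processingTimestamp": "2025-09-23T20:04:09.290Z",
      "sourceFileName": "document.pdf",
      "operation": "extractAttachmentFromPdf"
    }
  }
]
"""


def extracted_file_names(raw_json: str) -> List[str]:
    """Return the file names of all extracted attachments.

    Raises RuntimeError with the _metadata.message text if any item
    reports _metadata.success as false or missing.
    """
    names: List[str] = []
    for item in json.loads(raw_json):
        meta = item.get("_metadata") or {}
        if not meta.get("success"):
            raise RuntimeError(meta.get("message") or "Extraction failed")
        # outputDocuments may be null, so fall back to an empty list.
        for doc in item.get("outputDocuments") or []:
            names.append(doc["fileName"])
    return names
```

Calling `extracted_file_names(SAMPLE_RESPONSE)` returns `["sample.txt"]`. Checking `_metadata.success` before reading `outputDocuments` keeps error responses from being mistaken for PDFs that contain no attachments.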

Use Cases

Document Content Recovery

  • Embedded File Restoration: Use the outputDocuments array to recover files that were embedded within PDF documents, restoring access to original source materials
  • File Name Recovery: Extract original filenames from the fileName field to maintain proper file identification and organization
  • Lost Document Recovery: Restore documents that were previously embedded in PDFs but are no longer available in their original formats

Content Processing and Analysis

  • Attachment Analysis: Process individual files from the outputDocuments collection to analyze content, extract data, or perform specific operations on each file type
  • Barcode Processing: Utilize the barcodeText field to extract and process barcode information from attachments for inventory, tracking, or identification purposes
  • Text Content Extraction: Leverage the docText field to extract textual content from document attachments for search, analysis, or data processing workflows
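
As a rough illustration of the barcode and text use cases, the sketch below gathers whatever `barcodeText` and `docText` values are present. The function `collect_text` is a hypothetical helper; the field names come from the attachment object structure described earlier.

```python
from typing import Dict, List


def collect_text(output_documents: List[dict]) -> Dict[str, Dict[str, str]]:
    """Map each file name to its non-null barcodeText and docText values.

    Attachments where both fields are null are left out of the result.
    """
    result: Dict[str, Dict[str, str]] = {}
    for doc in output_documents:
        found = {
            key: doc[key]
            for key in ("barcodeText", "docText")
            if doc.get(key) is not None
        }
        if found:
            result[doc["fileName"]] = found
    return result
```

The result can then feed a search index or an inventory lookup without each consumer having to repeat the null checks.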

Workflow Automation and Integration

  • File Distribution: Automatically distribute extracted attachments to appropriate systems, folders, or users based on file type and content analysis
  • Content Archiving: Use the traceId and extraction results to maintain audit trails while organizing extracted files into structured archive systems
  • Multi-System Integration: Integrate extracted files with external document management systems, cloud storage platforms, or business applications
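
For the archiving use case, an audit entry can be assembled from fields already present in the response. This is a sketch under assumptions: the function `audit_record` and the key `processedAt` are hypothetical, and `traceId` is read from the top level of the response item, where the sample response places it.

```python
from datetime import datetime
from typing import Any, Dict


def audit_record(item: Dict[str, Any]) -> Dict[str, Any]:
    """Summarize one response item for an audit log."""
    meta = item.get("_metadata") or {}
    stamp = meta.get("processingTimestamp")
    # Replace the trailing "Z" so that datetime.fromisoformat also accepts
    # the timestamp on Python versions older than 3.11.
    processed_at = (
        datetime.fromisoformat(stamp.replace("Z", "+00:00")) if stamp else None
    )
    return {
        "traceId": item.get("traceId"),
        "sourceFileName": meta.get("sourceFileName"),
        "processedAt": processed_at,
        "fileNames": [d["fileName"] for d in item.get("outputDocuments") or []],
    }
```

Parsing the timestamp into a timezone-aware `datetime` makes it straightforward to sort or filter archive entries by processing time.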
