Extract Text and Images in Power Automate

The PDF4me Extract Text and Images action retrieves the text content and embedded images from PDF documents in Power Automate. Text extraction and image extraction are controlled independently: the action returns the full text as a string and the images as an array of objects. This supports content reuse, data analysis, image processing, and multi-format content distribution across Microsoft 365 workflows.

Authenticating Your API Request

To access the PDF4me Web API from Power Automate, every request must include valid authentication credentials. Authentication secures the communication and identifies you as an authorized user, so that your Power Automate flows can call PDF4me's content extraction services.

Key Features

  • Text Extraction: Retrieve all text content from PDF documents
  • Image Extraction: Extract all embedded images as separate files
  • Independent Control: Choose to extract text only, images only, or both
  • Array Output: Receive images as an array for iteration and processing
  • Batch Processing: Extract content from multiple PDFs in workflows

Parameters

Complete list of parameters for the Extract Text and Images action. Configure these parameters to control content extraction.

Important: Parameters marked with three asterisks (***) are required and must be provided for the action to function correctly.

| Parameter | Type | Description | Example |
|---|---|---|---|
| File Content*** | Binary | Source PDF file content. Map the PDF file from a previous action's output. PDFs from SharePoint, OneDrive, and email are supported. The content can be retrieved dynamically from flow variables. It must be a valid PDF document. | [File Content from Get File] |
| Name*** | String | PDF document name. The source PDF file name, including the .pdf extension. It is used to identify the document during processing and supports dynamic naming from flows. | Document.pdf |
| Extract Images*** | Boolean | Image extraction control. True: extract all embedded images. False: skip image extraction. Set to false if images are not required, which reduces processing when only text is needed. | true |
| Extract Text*** | Boolean | Text extraction control. True: extract all text content. False: skip text extraction. Set to false if text is not required, which reduces processing when only images are needed. | true |
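
To make the four required inputs concrete, the following is a minimal Python sketch that validates them before they are handed to the action. It is an illustration only: `build_extract_inputs` is a hypothetical helper, the dictionary keys simply mirror the parameter labels listed above (they are not confirmed connector field names), and the Base64 encoding of the file content is an assumption made so the result can be serialized as JSON.

```python
import base64


def build_extract_inputs(file_content: bytes, name: str,
                         extract_images: bool = True,
                         extract_text: bool = True) -> dict:
    """Validate the four required inputs and return them as a dict.

    Hypothetical helper: the keys mirror the parameter labels in the
    documentation, not confirmed connector field names.
    """
    # The Name parameter must include the .pdf extension.
    if not name.lower().endswith(".pdf"):
        raise ValueError("Name must include the .pdf extension")
    # A well-formed PDF file begins with the "%PDF-" header marker.
    if not file_content.startswith(b"%PDF-"):
        raise ValueError("File Content does not look like a PDF document")
    return {
        # ASSUMPTION: binary content is Base64-encoded for JSON transport.
        "File Content": base64.b64encode(file_content).decode("ascii"),
        "Name": name,
        "Extract Images": extract_images,
        "Extract Text": extract_text,
    }
```

Setting `extract_images=False` in this sketch corresponds to the text-only case described for the Extract Images parameter.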

Output

The PDF4me Extract Text and Images action returns the following output for use in later steps of a Power Automate flow:

Response data in a structured table format:

| Parameter | Type | Description |
|---|---|---|
| Texts | String | Complete text content extracted from the PDF |
| Images | Array | List of extracted images, returned as an array of objects |
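
As a rough illustration of this output shape, the Python sketch below summarizes a response that has a `Texts` string and an `Images` array. The `summarize_output` function is hypothetical, and treating a missing or empty field as "nothing extracted" is an assumption; the structure of the individual image objects is not documented here, so the sketch only counts them.

```python
def summarize_output(output: dict) -> dict:
    """Summarize the action output without inspecting image internals."""
    # ASSUMPTION: a skipped extraction yields a missing or empty field.
    texts = output.get("Texts") or ""
    images = output.get("Images") or []
    return {
        "has_text": bool(texts.strip()),
        "word_count": len(texts.split()),
        "image_count": len(images),
    }
```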

Workflow Examples

The following example shows how the PDF4me Extract Text and Images action fits into a Power Automate flow for a real-world business scenario:

Automated Content Separation and Archival

This flow extracts the text and images from each uploaded PDF and stores them separately:

Complete Workflow Steps:

  1. Trigger: Document uploaded to processing folder
  2. Get Document: Retrieve PDF from SharePoint
  3. Extract Content: Extract both text and images
  4. Save Text: Store the extracted text as a text file in the archive
  5. Apply to Each Image: Iterate through extracted images
  6. Save Images: Store each image individually in images folder
  7. Create Index: Generate metadata with text and image references
  8. Update Database: Log extraction details and locations
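
Steps 4 to 7 above can be sketched in Python as follows. This is a sketch under stated assumptions rather than the connector's actual behavior: `archive_extraction` is a hypothetical function, and the `name` and `content` keys on each image object (with `content` holding Base64 data) are assumed for illustration, since the fields of the image objects are not specified in this document.

```python
import base64
import json
from pathlib import Path


def archive_extraction(output: dict, pdf_name: str, archive_dir: Path) -> dict:
    """Save extracted text and images, then write an index of both."""
    stem = Path(pdf_name).stem
    images_dir = archive_dir / "images"
    images_dir.mkdir(parents=True, exist_ok=True)

    # Step 4: store the extracted text as a text file in the archive.
    text_path = archive_dir / f"{stem}.txt"
    text_path.write_text(output.get("Texts") or "", encoding="utf-8")

    # Steps 5-6: iterate through the images and save each one individually.
    image_paths = []
    for i, image in enumerate(output.get("Images") or [], start=1):
        # ASSUMPTION: "name" and "content" (Base64) are hypothetical keys.
        raw_name = image.get("name") or f"{stem}_{i}.png"
        # Keep only the final path component so files stay in images_dir.
        image_path = images_dir / Path(raw_name).name
        image_path.write_bytes(base64.b64decode(image["content"]))
        image_paths.append(str(image_path))

    # Step 7: generate metadata referencing the text file and the images.
    index = {
        "source": pdf_name,
        "text_file": str(text_path),
        "images": image_paths,
    }
    index_path = archive_dir / f"{stem}.index.json"
    index_path.write_text(json.dumps(index, indent=2), encoding="utf-8")
    return index
```

The returned index is the kind of record that step 8 could log to a database; in a real flow, the file writes would be replaced by SharePoint or OneDrive actions.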

Business Benefits:

  • Separates content from 200+ PDFs monthly
  • Enables independent text and image management
  • Reduces storage redundancy by 60%
  • Facilitates content reuse and repurposing

Industry Use Cases & Applications

Publishing & Media Use Cases

  • Content Extraction: Extract text and images for content reuse
  • Digital Asset Management: Separate and catalog document assets
  • Content Migration: Extract content for platform migration
  • Archive Management: Organize content separately for better access
