Extract Resources from PDF in Make
What this module does
The PDF4me Extract Resources module pulls embedded images and text content out of a PDF and returns them as structured data in your Make scenario. Toggle Extract Images to retrieve every image embedded in the document as a named, Base64-encoded file. Toggle Extract Text to retrieve the full text content as a plain string. Both can run in a single module call. Use the extracted images to build asset libraries and the extracted text for analysis, search indexing, or translation, all without opening the PDF manually.
Authenticating Your API Request
Every PDF4me module in Make requires a valid Connection. Create or select one that holds your PDF4me API key so the scenario can authenticate extraction requests securely.
Important Facts You Should Not Miss
The Images output is an array. Each item has a Name field (filename) and a Data field (Base64 content). Without an Iterator, you cannot access individual images downstream.
The Data field is Base64-encoded. Cloud storage modules (Dropbox, Google Drive, SharePoint) expect binary data, not Base64 text. Use Make's built-in toBinary(Data, 'base64') function to convert the Data field to binary before piping it into an upload module. The Name field gives you the original filename with extension (e.g. image_001.png).
Extract Images and Extract Text are independent toggles: enable either one or both, depending on what you need. Images are returned as an array; Text is returned as a plain string.
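To make the Base64 point concrete, here is a minimal Python sketch of what the conversion amounts to outside Make. It assumes an image item is a dictionary with `Name` and `Data` keys, mirroring the fields described above. The helper name `decode_image` and the sample payload are illustrative and are not part of PDF4me or Make.

```python
import base64
import binascii


def decode_image(item: dict) -> tuple[str, bytes]:
    """Return (filename, raw bytes) for one item of the Images array.

    Assumes the item looks like {"Name": str, "Data": str}, where Data
    is standard Base64 text (an assumption based on the field description).
    """
    try:
        raw = base64.b64decode(item["Data"], validate=True)
    except binascii.Error as exc:
        raise ValueError(f"{item.get('Name', '?')}: Data is not valid Base64") from exc
    return item["Name"], raw


# Illustrative item: Data encodes only the 8-byte PNG signature.
sample = {
    "Name": "image_001.png",
    "Data": base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii"),
}
name, raw = decode_image(sample)
```

An upload step would then receive `raw` as the file content and `name` as the filename, which is the role toBinary plays inside a Make scenario.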
Parameters
Required: Connection, File Name, Document, Extract Images, and Extract Text must all be provided. Set at least one of Extract Images or Extract Text to Yes — enabling neither returns no content.
| Parameter | Required | What it does | Example |
|---|---|---|---|
| Connection | Yes | PDF4me API connection. Click Add and paste your API key if connecting for the first time. | My PDF4me connection |
| File Name | Yes | Filename of the source PDF including .pdf extension. Map from the prior module's file name output. | catalog.pdf |
| Document | Yes | Binary content of the source PDF. Map from the prior module's data output — Dropbox, Google Drive, HTTP, or email attachment. | 1. Data |
| Extract Images | Yes | Toggle to extract all embedded images. Set to Yes to retrieve images as a named Base64 array. Set to No to skip image extraction. Can be combined with Extract Text. | Yes |
| Extract Text | Yes | Toggle to extract all text content. Set to Yes to retrieve the full document text as a plain string. Set to No to skip text extraction. Works only on native (text-based) PDFs, not on scanned documents. | Yes |
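The parameter rules above can be sketched as a small pre-flight check, for instance in a script that prepares inputs before they reach the scenario. Everything here is hypothetical scaffolding: the `ExtractResourcesParams` class and `validate` function are illustrative names, not part of the PDF4me module, and the checks only restate the rules in the table.

```python
from dataclasses import dataclass


@dataclass
class ExtractResourcesParams:
    """Hypothetical container mirroring the module's input fields."""
    file_name: str
    document: bytes
    extract_images: bool
    extract_text: bool


def validate(params: ExtractResourcesParams) -> list[str]:
    """Return a list of problems; an empty list means the inputs look usable."""
    problems = []
    if not params.file_name.lower().endswith(".pdf"):
        problems.append("File Name should include the .pdf extension")
    if not params.document:
        problems.append("Document is empty")
    if not (params.extract_images or params.extract_text):
        problems.append("Enable Extract Images, Extract Text, or both")
    return problems


ok = validate(ExtractResourcesParams("catalog.pdf", b"%PDF-1.7", True, False))
bad = validate(ExtractResourcesParams("catalog", b"", False, False))
```

Here `ok` is an empty list, while `bad` collects one message per broken rule.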
Output Fields
| Field | Type | Description |
|---|---|---|
| Texts | String | Full text content extracted from the PDF as a plain string. Empty if Extract Text is set to No or the PDF is scanned. |
| Images | Array | Array of all embedded images. Each item has a Name (filename with extension) and Data (Base64-encoded image content). Empty array if Extract Images is No. |
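Because Texts comes back empty both when Extract Text is off and when the PDF is scanned, downstream logic may want to tell the two cases apart. The following is a rough sketch under the assumption that the module output can be represented as a dictionary with `Texts` and `Images` keys, as in the table above. The function names are illustrative only.

```python
def text_looks_missing(output: dict, extract_text_enabled: bool) -> bool:
    """True when text was requested but Texts is empty or whitespace.

    Per the table above, this can indicate a scanned PDF; it is a
    heuristic, not a definitive test.
    """
    texts = output.get("Texts") or ""
    return extract_text_enabled and texts.strip() == ""


def image_names(output: dict) -> list[str]:
    """List the Name of every item in Images (empty list if none)."""
    return [img["Name"] for img in output.get("Images") or []]


# Illustrative outputs, not real module responses.
native = {"Texts": "Contract terms ...", "Images": [{"Name": "image_001.png", "Data": ""}]}
scanned = {"Texts": "", "Images": []}
```

In a scenario, the same idea could be expressed as a filter on the Texts field that routes empty results to a separate branch.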
Quick Setup
- Add PDF4me → Extract Resources to your Make scenario after a file download step.
- Select Connection (or click Add to create one with your API key).
- Map File Name and Document from the prior module.
- Set Extract Images and/or Extract Text to Yes depending on what you need.
- If extracting images, add an Iterator module pointing at the Images array. Inside the iterator, use toBinary(Data, 'base64') to decode each image and pipe it into an upload module.
- If extracting text, map the Texts field directly into a Google Sheets row, database insert, or HTTP POST body.
Workflow Examples
Common Make scenario patterns using Extract Resources.
- Dropbox Watch Folder triggers when a new catalog PDF is uploaded.
- Dropbox Download a File retrieves the PDF binary.
- Extract Resources runs with Extract Images set to Yes, Extract Text set to No.
- Iterator loops through the Images array — one image per iteration.
- Each iteration decodes the Data field with toBinary(Data, 'base64') and uploads the image to a Google Drive "Product Images" folder using the Name field as the filename.
- Google Drive Watch Files triggers when a new contract PDF is added to the "Contracts" folder.
- Google Drive Get a File downloads the binary.
- Extract Resources runs with Extract Text set to Yes, Extract Images set to No.
- The Texts output is written to an Airtable record alongside the filename and upload date.
- The contract is now full-text searchable in Airtable — no manual copy-paste needed.
- A Google Sheets row with a PDF URL triggers the scenario for each legacy document in the migration list.
- An HTTP module downloads the PDF from the URL.
- Extract Resources runs with both Extract Text and Extract Images set to Yes.
- The Texts field is inserted into a MySQL database record for full-text search.
- An Iterator uploads each extracted image to a SharePoint media library, completing the content migration without ever opening the PDFs manually.
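For readers who want to see the combined text-and-images pattern as code, here is a hedged Python sketch of the same flow outside Make. It makes several substitutions that should be read as assumptions: `sqlite3` stands in for the MySQL database, a local temporary directory stands in for the SharePoint media library, and the `output` dictionary is a hand-written sample shaped like the Texts and Images fields. The `migrate_document` function is an illustrative name.

```python
import base64
import sqlite3
import tempfile
from pathlib import Path


def migrate_document(db: sqlite3.Connection, media_dir: Path,
                     file_name: str, output: dict) -> list[Path]:
    """Store the extracted text in the database and write each image to disk."""
    db.execute(
        "INSERT INTO documents (file_name, body) VALUES (?, ?)",
        (file_name, output.get("Texts") or ""),
    )
    saved = []
    # Equivalent of the Iterator step: one pass per image item.
    for item in output.get("Images") or []:
        target = media_dir / Path(item["Name"]).name  # keep only the filename part
        target.write_bytes(base64.b64decode(item["Data"]))
        saved.append(target)
    db.commit()
    return saved


# Stand-ins for the real targets (assumptions for this sketch).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE documents (file_name TEXT, body TEXT)")
media_dir = Path(tempfile.mkdtemp())

output = {
    "Texts": "Legacy manual, chapter 1",
    "Images": [{"Name": "image_001.png",
                "Data": base64.b64encode(b"\x89PNG").decode("ascii")}],
}
saved = migrate_document(db, media_dir, "manual.pdf", output)
```

The text insert and the image loop are independent of each other, which matches the way the two toggles produce separate outputs.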