Extract Text from Word

The Extract Text from Word API extracts text from a Word document (.doc, .docx). You send the document as Base64 (docContent) together with docName, StartPageNumber, EndPageNumber, RemoveComments, RemoveHeaderFooter, AcceptChanges, and optionally async. The API returns JSON or a text file with the extracted text. Use the tester below to try it; more details are in the sections that follow.

Try the Extract Text from Word API

:::note Quick reference
Endpoint: POST /api/v2/ExtractTextFromWord · Required: api-key, docContent, docName, StartPageNumber, EndPageNumber, RemoveComments, RemoveHeaderFooter, AcceptChanges
:::
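
As a minimal sketch of what a request with the required fields might look like, the Python below assembles (but does not send) a POST to the endpoint. Several details are assumptions that this page does not state: the base URL (`api.example.com` is a placeholder), passing `api-key` as an HTTP header of that name, sending the fields as a JSON body, and the value types (integers for page numbers, booleans for the options). The helper name `build_request` is hypothetical.

```python
import base64
import json
import urllib.request

# Assumption: placeholder host; this page only documents the endpoint path.
BASE_URL = "https://api.example.com"
ENDPOINT = "/api/v2/ExtractTextFromWord"


def build_request(api_key: str, doc_bytes: bytes, doc_name: str) -> urllib.request.Request:
    """Assemble a POST request carrying every required field (not sent here)."""
    payload = {
        "docContent": base64.b64encode(doc_bytes).decode("ascii"),
        "docName": doc_name,
        # Example values; the value types are assumptions.
        "StartPageNumber": 1,
        "EndPageNumber": 3,
        "RemoveComments": True,
        "RemoveHeaderFooter": True,
        "AcceptChanges": True,
    }
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the api-key is sent as a header with this name.
            "api-key": api_key,
        },
        method="POST",
    )
```

To send it, you could pass the returned object to `urllib.request.urlopen` once the placeholder host and the authentication details have been checked against your account settings.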

:::info Try it live
Use the form below to send your API key, Word document (Base64), page range, and options (remove comments, remove headers/footers, accept tracked changes). The response is JSON or text with the extracted content. No code is required: fill in the fields and click Send request.
:::

Loading API Tester...

Overview, parameters, and use cases

What is Extract Text from Word?

This endpoint extracts text from a Word document (.doc, .docx). You specify a page range (StartPageNumber, EndPageNumber) and options: RemoveComments (strip comments), RemoveHeaderFooter (strip headers/footers), AcceptChanges (apply tracked changes). The API returns JSON or a text file with the extracted text. Use it when you need plain text from Word for search, analysis, or migration.

Key features

  • Page range – StartPageNumber and EndPageNumber limit extraction to specific pages.
  • Content filtering – RemoveComments, RemoveHeaderFooter, AcceptChanges control what is included.
  • Formats – Supports .doc and .docx files, sent as Base64.
  • Async – Use async for large documents.
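
The page range and filtering options listed above can be grouped into one small structure before they are added to a request. The sketch below is illustrative only: the class name `ExtractOptions`, the `run_async` attribute (used because `async` is a reserved word in Python), the boolean and integer value types, and the check that pages are 1-based with an inclusive range are all assumptions, not documented behavior.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ExtractOptions:
    """Hypothetical container for the page range and filtering fields."""

    StartPageNumber: int
    EndPageNumber: int
    RemoveComments: bool
    RemoveHeaderFooter: bool
    AcceptChanges: bool
    run_async: Optional[bool] = None  # sent as "async" only when set

    def __post_init__(self) -> None:
        # Assumption: pages are numbered from 1 and the range is inclusive.
        if self.StartPageNumber < 1:
            raise ValueError("StartPageNumber must be 1 or greater")
        if self.EndPageNumber < self.StartPageNumber:
            raise ValueError("EndPageNumber must not precede StartPageNumber")

    def to_fields(self) -> dict:
        """Return the options keyed by the parameter names used on this page."""
        fields = asdict(self)
        run_async = fields.pop("run_async")
        if run_async is not None:
            fields["async"] = run_async
        return fields
```

For example, `ExtractOptions(1, 5, True, True, False).to_fields()` yields a dictionary that can be merged with docContent and docName when building the request body.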

:::tip Best for
Use when you need plain text from Word (e.g. for search, migration, or analysis). For PDF extraction use Extract Resources or Extract Text by Expression.
:::

Prerequisites

Before using this endpoint, make sure you have:

  • A valid PDF4me API key (Get your API Key)
  • A Word document (.docx or .doc) in Base64 format
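
If your document is a file on disk, the Base64 prerequisite can be met with the Python standard library. This is a minimal sketch; the helper name `encode_word_file` is hypothetical, and the extension check simply mirrors the two formats named above.

```python
import base64
from pathlib import Path

WORD_EXTENSIONS = {".doc", ".docx"}


def encode_word_file(path: str) -> tuple[str, str]:
    """Return (docName, docContent) for a .doc or .docx file on disk."""
    file_path = Path(path)
    if file_path.suffix.lower() not in WORD_EXTENSIONS:
        raise ValueError(f"expected a .doc or .docx file, got {file_path.name!r}")
    # b64encode returns bytes; decode to get a string usable in a JSON body.
    encoded = base64.b64encode(file_path.read_bytes()).decode("ascii")
    return file_path.name, encoded
```

The two returned values correspond to the docName and docContent fields of the request.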

Response Format

The API returns a JSON response or text file with extracted text content from the specified page range.
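
Because the response may be either JSON or a text file, client code has to handle both. The sketch below shows one way to do that, assuming the two cases can be told apart by the Content-Type header and that the body is UTF-8 encoded; neither detail is stated on this page. The JSON field names are not documented here, so the parsed structure is returned as-is. The function name `read_extraction_response` is hypothetical.

```python
import json


def read_extraction_response(content_type: str, body: bytes):
    """Return parsed JSON for JSON responses, otherwise the decoded text."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    # Assumption: the body is UTF-8 in both cases.
    text = body.decode("utf-8")
    if media_type == "application/json":
        return json.loads(text)
    return text
```

Inspect a real response from the tester to confirm the actual content type and JSON layout before relying on this.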
