Classify Document

PDF4me Classify Document classifies and identifies documents based on their content. The service analyzes a PDF file's structure, content patterns, and metadata to determine its document type and category. PDF content is sent through REST API calls as a Base64-encoded string, which lets the binary file travel inside a JSON request body. With document type identification and automated categorization, the service suits document management systems and automated workflows.

Authenticating Your API Request

To access the PDF4me REST API, every request must include proper authentication credentials. Authentication ensures secure communication and validates your identity as an authorized user of the REST API.

Key Features

  • Document Classification: Identify and classify document types based on content analysis
  • Content Analysis: Analyze document structure, content patterns, and metadata
  • Multiple Document Types: Support for various PDF document formats
  • Automated Processing: Streamlined document classification without manual intervention
  • Simple API Integration: RESTful API designed for automated document processing workflows

REST API Endpoint

The PDF4me REST API uses standard HTTP methods to interact with resources. All document classification operations are performed through a single endpoint:

  • Method: POST
  • Endpoint: /api/v2/ClassifyDocument

REST API Parameters

Complete list of parameters for the Classify Document REST API. Parameters are organized by category for better understanding and implementation.

Important: Parameters marked with an asterisk (*) are required and must be provided for the API to function correctly.

Required Parameters

Parameter | Type | Description | Example
docContent* | Base64 (String) | The content of the input PDF file encoded in Base64 format for document analysis and classification processing | JVBERi...
docName* | String | Source PDF file name with proper .pdf extension for document identification and processing | output.pdf
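
The two required fields can be assembled from a file's raw bytes. The following is a minimal sketch in Python; the helper name build_payload is hypothetical and not part of any PDF4me SDK, and the extension check is an illustrative guard rather than documented server behavior.

```python
import base64


def build_payload(pdf_bytes: bytes, doc_name: str) -> dict:
    """Build a JSON-serializable body with the two required fields.

    build_payload is an illustrative helper, not a PDF4me function.
    """
    if not doc_name.lower().endswith(".pdf"):
        raise ValueError("docName should carry a .pdf extension")
    return {
        # Base64 turns the binary PDF into text that fits in a JSON string.
        "docContent": base64.b64encode(pdf_bytes).decode("ascii"),
        "docName": doc_name,
    }


# The first nine bytes of a PDF 1.4 file, used here as stand-in content.
payload = build_payload(b"%PDF-1.4\n", "output.pdf")
print(payload["docContent"])  # JVBERi0xLjQK
```

The printed value matches the start of the Base64 strings used in the payload examples on this page, since they all encode a file beginning with the PDF 1.4 header.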

Optional Parameters

Parameter | Type | Description | Example
async | Boolean | Enable asynchronous processing. When true, the API returns 202 Accepted with a Location header for polling the result | true
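
A client that sets async to true needs a polling loop around the URL from the Location header. The sketch below shows one way to structure it in Python. It assumes, without confirmation from this page, that the polling URL keeps answering 202 while the job is pending and answers 200 with the JSON result once it is finished. The names poll_result and fetch are hypothetical; fetch is injected so the loop can run without network access.

```python
import time
from typing import Callable, Tuple

# fetch(url) returns (status_code, body). In practice it would wrap an
# authenticated HTTP GET; here it is a parameter so the loop is testable.
Fetch = Callable[[str], Tuple[int, str]]


def poll_result(location: str, fetch: Fetch, interval: float = 1.0,
                max_attempts: int = 10) -> str:
    """Poll the Location URL until the result is ready.

    Assumption: 202 means "still processing" and 200 carries the result.
    """
    for _ in range(max_attempts):
        status, body = fetch(location)
        if status == 200:
            return body
        if status != 202:
            raise RuntimeError(f"unexpected status {status} while polling")
        time.sleep(interval)
    raise TimeoutError("result not ready after polling")


# Simulated server: two "pending" answers, then the finished result.
responses = iter([(202, ""), (202, ""), (200, '{"documentType": "invoice"}')])
result = poll_result("https://example.invalid/job/1",
                     lambda url: next(responses), interval=0)
print(result)
```

Bounding the loop with max_attempts keeps a stuck job from blocking the caller indefinitely.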

Output

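The PDF4me Classify Document REST API returns different responses based on the processing mode. In synchronous mode it returns the classification data as a JSON response. In asynchronous mode it returns 202 Accepted with a Location header for polling the result.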

Synchronous Processing (Default)

When async is false or not provided, the API returns the classification results immediately.

Status Code: 200 OK

Response Format:

{
  "documentType": "invoice",
  "category": "financial",
  "confidence": 0.95,
  "metadata": {
    "pageCount": 1,
    "createdDate": "2024-01-15T10:30:00Z"
  }
}

The response contains classification data including document type, category, confidence score, and metadata information.
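
The response fields described above can be read into a typed structure before they are used in a workflow. This is a minimal sketch in Python; the field names are taken from the sample response on this page, and whether a live response always contains all of them is an assumption. The Classification class, the parse_classification helper, and the 0.8 review threshold are illustrative choices, not part of the API.

```python
import json
from dataclasses import dataclass


@dataclass
class Classification:
    document_type: str
    category: str
    confidence: float
    page_count: int


def parse_classification(body: str) -> Classification:
    """Map the JSON response onto a small dataclass."""
    data = json.loads(body)
    return Classification(
        document_type=data["documentType"],
        category=data["category"],
        confidence=float(data["confidence"]),
        # metadata is treated as optional here; that is an assumption.
        page_count=int(data.get("metadata", {}).get("pageCount", 0)),
    )


sample = (
    '{"documentType": "invoice", "category": "financial", '
    '"confidence": 0.95, "metadata": {"pageCount": 1, '
    '"createdDate": "2024-01-15T10:30:00Z"}}'
)
result = parse_classification(sample)

# Arbitrary cut-off: route low-confidence results to manual review.
needs_review = result.confidence < 0.8
print(result.document_type, result.category, needs_review)
```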

Request Example

Content-Type: application/json
Authorization: Basic YOUR_BASE64_ENCODED_API_KEY

Note:

  • Get your API key from the PDF4me Dashboard
  • The API key must be Base64 encoded and prefixed with "Basic " in the Authorization header
  • Example: If your API key is abc123, encode it to Base64 and use Authorization: Basic YWJjMTIz
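
The header construction described in the note can be sketched in Python as follows. The function name build_headers is hypothetical, and abc123 is the placeholder key from the note, not a real credential.

```python
import base64


def build_headers(api_key: str) -> dict:
    """Return the two request headers shown in the request example."""
    token = base64.b64encode(api_key.encode("utf-8")).decode("ascii")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Basic {token}",
    }


headers = build_headers("abc123")
print(headers["Authorization"])  # Basic YWJjMTIz
```

Base64 is an encoding, not encryption, so the key is still readable by anyone who sees the header; send requests over HTTPS only.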

Payload

Basic Request:

{
  "docContent": "JVBERi0xLjQKJeLjz9MKMSAwIG9iago8PAovVHlwZSAvQ2F0YWxvZwovUGFnZXMgMiAwIFIKPj4KZW5kb2JqCjIgMCBvYmoKPDwKL1R5cGUgL1BhZ2VzCi9LaWRzIFszIDAgUl0KL0NvdW50IDEKPD4KZW5kb2JqCjMgMCBvYmoKPDwKL1R5cGUgL1BhZ2UKL1BhcmVudCAyIDAgUgovTWVkaWFCb3ggWzAgMCA2MTIgNzkyXQovUmVzb3VyY2VzIDw8Ci9Gb250IDw8Ci9GMSA0IDAgUgo+Pgo+PgovQ29udGVudHMgNSAwIFIKPj4KZW5kb2JqCjQgMCBvYmoKPDwKL1R5cGUgL0ZvbnQKL1N1YnR5cGUgL1R5cGUxCi9CYXNlRm9udCAvSGVsdmV0aWNhCj4+CmVuZG9iago1IDAgb2JqCjw8Ci9MZW5ndGggNDQKPj4Kc3RyZWFtCkJUCi9GMSAxMiBUZgoxMDAgNzAwIFRkCihIZWxsbyBXb3JsZCkgVGoKRVQKZW5kc3RyZWFtCmVuZG9iagp4cmVmCjAgNgowMDAwMDAwMDAwIDY1NTM1IGYgCjAwMDAwMDAwMDkgMDAwMDAgbiAKMDAwMDAwMDA1NCAwMDAwMCBuIAowMDAwMDAwMTAxIDAwMDAwIG4gCjAwMDAwMDAxNzAgMDAwMDAgbiAKMDAwMDAwMDI0NCAwMDAwMCBuIAp0cmFpbGVyCjw8Ci9TaXplIDYKL1Jvb3QgMSAwIFIKPj4Kc3RhcnR4cmVmCjM0MQolJUVPRg==",
  "docName": "output.pdf"
}

With Asynchronous Processing:

{
  "docContent": "JVBERi0xLjQKJeLjz9MK...",
  "docName": "output.pdf",
  "async": true
}
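
The payloads above can be wrapped into a POST request with Python's standard library. This is a sketch under stated assumptions: the host in BASE_URL is a placeholder, since this page gives only the endpoint path, and build_request is a hypothetical helper. The request is built but not sent, so the example runs offline.

```python
import json
import urllib.request

# Assumption: placeholder host. Replace it with the real PDF4me API host.
BASE_URL = "https://api.example.invalid"
ENDPOINT = "/api/v2/ClassifyDocument"


def build_request(doc_content_b64: str, doc_name: str, auth_value: str,
                  run_async: bool = False) -> urllib.request.Request:
    """Assemble the POST request; run_async maps to the "async" field."""
    body = {"docContent": doc_content_b64, "docName": doc_name}
    if run_async:
        # "async" is a reserved word in Python, hence the run_async name.
        body["async"] = True
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": auth_value,
        },
        method="POST",
    )


# "JVBERi0xLjQK" encodes only the PDF header line; it stands in for a full file.
req = build_request("JVBERi0xLjQK", "output.pdf", "Basic YWJjMTIz",
                    run_async=True)
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would send it; omitted to keep the sketch offline.
```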

Code Samples

The PDF4me Classify Document REST API provides code samples in multiple programming languages. Choose the language that best fits your development environment:

C# (CSharp) Sample

Complete C# implementation for document classification:

Document Classification Features

Classification Capabilities

  • Document Type Detection: Identification of document types based on content analysis
  • Content Categorization: Grouping of documents by content and purpose
  • Template Recognition: Identification of document templates and standard formats
  • Compliance Classification: Classification for regulatory and compliance requirements

Enterprise Features

  • Batch Processing: Process multiple documents with consistent classification standards
  • Quality Assurance: Ensure classification accuracy and reliability for business applications
  • Integration Ready: Seamless integration with existing document management systems
  • Scalable Processing: Handle large volumes of documents with enterprise-grade performance

Industry Use Cases & Applications

Document Management & Organization Use Cases

  • Automated Filing: Automatically classify and organize documents in digital filing systems
  • Content Management: Advanced categorization for content management and retrieval systems
  • Archive Organization: Systematic classification of historical documents and records
  • Search Optimization: Enhanced document search through accurate classification and tagging