返回 Skill 列表
extension
分类: 其它无需 API Key

aiparse-ocr

使用 LLM 解析 PDF 文件,**无需注册,免费试用**,从 PDF 中提取信息并以 JSON 或 Markdown 格式返回结果...

person作者: do0388309hubclawhub

AI Parse

A skill for parsing PDF files using Large Language Models.

Capabilities

  • Extract information from PDF files
  • Return results in JSON or Markdown format
  • Resume processing from existing task ID
  • Save task ID information to JSON file for reference

Parameters

| Parameter | Type | Required | Description | |-----------|------|----------|-------------| | pdf_path | string | required | Path to the PDF file to process | | result_path | string | required | Path to save the parsing result | | format | string | required | Output format: "json" or "md" | | task_id_path | string | required | Path to save task ID information (JSON format) | | --task-id | string | optional | Existing task ID to resume processing |

Usage Examples

Normal Upload Mode

python handler.py <pdf_path> <result_path> <format> <task_id_path>

Resume from Existing Task or Check Status

python handler.py --task-id <task_id> <result_path> <format>

Task ID File Format

When using normal upload mode, a task ID file will be created at task_id_path with the following JSON structure:

{
  "task_id": "AAFXKO",
  "pdf_path": "test.pdf",
  "submit_time": "2026-04-04 00:33:27"
}

This file can be used to:

  • Track the submitted task
  • Retrieve the task ID later for status checking
  • Resume processing if interrupted

Implementation

Implemented by handler.py which:

  • Uploads PDF files to the processing service
  • Polls for processing completion
  • Downloads and saves results in the requested format
  • Supports resuming from existing task IDs
  • Saves task ID information to JSON file

Environment Requirements

  • Python 3.6+
  • requests library

Return Value

The parsed result will be saved to the specified result_path in the requested format:

  • JSON format: Structured JSON with task details and extracted content
  • Markdown format: Formatted Markdown with page-by-page content

Notes

  • For large PDF files, processing may take multiple minutes
  • Free users can process 30 PDF pages - visit https://api.pinocch.com/index for extra trial credits
  • The --task-id parameter can be used to resume processing if interrupted
  • Check the console output for processing progress and status updates
  • The task ID file is created immediately after successful upload
  • IMPORTANT FOR AGENTS: Before declaring a task as failed, always use the task ID to check the current status of the task. Use the --task-id parameter to resume or verify the task status. The task may still be processing or have completed successfully.