PDF Translator Skill

PDFs are still the dominant carrier for papers and technical specs, but they’re hostile to downstream workflows: the layout is locked, encodings are mixed, and translated output is hard to post-process. This skill closes that gap by letting Claude extract the text from a PDF, translate it into a target language, and write the result to a clean Markdown file you can keep iterating on.

Structure

SKILL.md: The main definition file for the skill.
requirements.txt: Python dependencies.
references/: Reference documentation.
- api_guide.md: API usage guide and examples.
scripts/: Helper scripts.
- extract_text.py: Extracts text from a PDF file using PyPDF2.
- generate_md.py: (Optional) Helper to save translated content with a metadata header.
- create_test_pdf.py: Utility to generate a sample PDF for testing.
test_sample.pdf: Sample PDF for testing purposes.
test_output.md: Example output of a translated PDF.
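
To give a feel for the extraction step, here is a minimal sketch of what a script like extract_text.py could look like. It is not the actual script from this repository: the function names `extract_pages` and `extract_text_from_pdf` are hypothetical, and it assumes the `PdfReader` API available in PyPDF2 2.0 or later.

```python
from typing import Iterable, List


def extract_pages(pages: Iterable) -> List[str]:
    """Return the stripped text of each page.

    Pages with no extractable text (for example scanned images)
    contribute an empty string instead of raising an error.
    """
    return [(page.extract_text() or "").strip() for page in pages]


def extract_text_from_pdf(pdf_path: str) -> str:
    """Read pdf_path with PyPDF2 and join the page texts with blank lines.

    Hypothetical helper; the real extract_text.py may differ.
    """
    # Imported lazily so extract_pages stays usable without PyPDF2 installed.
    from PyPDF2 import PdfReader  # assumes PyPDF2 >= 2.0

    reader = PdfReader(pdf_path)
    return "\n\n".join(extract_pages(reader.pages))
```

Calling `extract_text_from_pdf("test_sample.pdf")` would return the document's text as one string. Scanned PDFs that contain only images yield empty pages here, since PyPDF2 does not perform OCR.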

Setup

Ensure you have Python 3 installed.
Install dependencies:
```
pip install -r requirements.txt
```

Usage

You can ask Claude in plain language to translate a PDF file.

Example: “Translate the file documents/paper.pdf to Spanish.”

Claude will:

Read the PDF using extract_text.py.
Translate the content.
Save it as documents/paper_translated.md.
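
The final step can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the header layout (a front-matter block with `source`, `language`, and `generated` fields) is a guess at what a "metadata header" might contain, not the format generate_md.py actually writes. Only the `_translated.md` naming rule comes from the example above.

```python
from datetime import date
from pathlib import Path


def translated_path(pdf_path: str) -> Path:
    """Map documents/paper.pdf to documents/paper_translated.md."""
    source = Path(pdf_path)
    return source.with_name(f"{source.stem}_translated.md")


def build_markdown(translated_text: str, source: str, language: str) -> str:
    """Prepend a metadata header (assumed format) to the translated text."""
    header = "\n".join([
        "---",
        f"source: {source}",
        f"language: {language}",
        f"generated: {date.today().isoformat()}",
        "---",
    ])
    return f"{header}\n\n{translated_text.strip()}\n"


def save_translation(translated_text: str, pdf_path: str, language: str) -> Path:
    """Write the Markdown file next to the source PDF and return its path."""
    out_path = translated_path(pdf_path)
    out_path.write_text(
        build_markdown(translated_text, pdf_path, language),
        encoding="utf-8",
    )
    return out_path
```

Writing the file as UTF-8 keeps accented and non-Latin characters in the translation intact regardless of the platform's default encoding.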