Jason's Blog

Installing Anthropic Document Processing Skills in OpenClaw

The Anthropic skills repository provides document processing skills for PDF, PPTX, XLSX, and DOCX. These are knowledge-based skills (SKILL.md guides plus helper scripts) that teach AI agents best practices for document handling. They do not require an Anthropic API key; they rely on standard open-source libraries.

What You Get

| Skill | Capabilities |
| --- | --- |
| PDF | Extract text/tables (pdfplumber), merge/split/rotate (pypdf), create PDFs (reportlab), OCR scanned docs (tesseract), fill forms |
| PPTX | Read/extract text (markitdown), create slides (pptxgenjs), edit XML directly, convert to PDF/images (LibreOffice) |
| XLSX | Create/edit spreadsheets (openpyxl), data analysis (pandas), formula recalculation (LibreOffice) |
| DOCX | Create documents (docx npm package), read with pandoc, edit XML directly, handle tracked changes |

Step 1: Install System Dependencies

# System tools
sudo apt install -y poppler-utils qpdf tesseract-ocr libreoffice pandoc imagemagick

# Python libraries
pip install pypdf pdfplumber reportlab pytesseract pdf2image openpyxl pandas "markitdown[pptx]" Pillow

# Node.js packages
npm install -g pptxgenjs docx
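
After installing, it can help to confirm that the tools are actually reachable before moving on. The following is a minimal sketch and not part of the Anthropic skills or OpenClaw. The executable names (for example `soffice` for LibreOffice and `convert` for ImageMagick) and the import names (for example `PIL` for Pillow) are assumptions based on typical Debian/Ubuntu packaging and may differ on your system.

```python
import importlib.util
import shutil

# Assumed executable names for the apt packages above; adjust for your distro.
SYSTEM_TOOLS = ["pdftotext", "qpdf", "tesseract", "soffice", "pandoc", "convert"]

# Assumed import names for the pip packages above (Pillow is imported as PIL).
PYTHON_MODULES = [
    "pypdf", "pdfplumber", "reportlab", "pytesseract",
    "pdf2image", "openpyxl", "pandas", "markitdown", "PIL",
]


def missing_tools(names):
    """Return the executables from `names` that are not on PATH."""
    return [name for name in names if shutil.which(name) is None]


def missing_modules(names):
    """Return the top-level modules from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Missing system tools:", missing_tools(SYSTEM_TOOLS) or "none")
    print("Missing Python modules:", missing_modules(PYTHON_MODULES) or "none")
```

The npm packages are not covered by this check; `npm ls -g pptxgenjs docx` lists them if they are installed globally.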

Dependency Matrix

| Dependency | Used By | Purpose |
| --- | --- | --- |
| poppler-utils | PDF, PPTX | pdftotext, pdfimages, pdftoppm |
| qpdf | PDF | PDF linearization, repair |
| tesseract-ocr | PDF | OCR for scanned documents |
| libreoffice | PPTX, XLSX, DOCX | Format conversion, formula recalc, accept tracked changes |
| pandoc | DOCX | Read content, extract tracked changes |
| imagemagick | PDF | Image processing |
| pypdf | PDF | Merge, split, rotate, encrypt |
| pdfplumber | PDF | Text and table extraction |
| reportlab | PDF | Create new PDFs |
| openpyxl | XLSX | Read/write Excel with formatting |
| pandas | XLSX | Data analysis |
| pptxgenjs (npm) | PPTX | Create presentations from scratch |
| docx (npm) | DOCX | Create Word documents from scratch |

Step 2: Clone and Copy Skills

cd ~/.openclaw/workspace

# Clone the repo
git clone https://github.com/anthropics/skills.git anthropic-skills

# Create skill directories and copy
mkdir -p skills/pdf skills/pptx skills/xlsx skills/docx
cp -r anthropic-skills/skills/pdf/* skills/pdf/
cp -r anthropic-skills/skills/pptx/* skills/pptx/
cp -r anthropic-skills/skills/xlsx/* skills/xlsx/
cp -r anthropic-skills/skills/docx/* skills/docx/

# Clean up: remove the cloned repo
rm -rf anthropic-skills

Each skill directory should contain at least a SKILL.md file and a scripts/ folder.
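
The layout described above can be checked with a few lines of Python. This is a rough sketch under the assumption that the skills live in `~/.openclaw/workspace/skills` as in the commands above; the function name `check_skill_layout` is made up for illustration and is not an OpenClaw API.

```python
from pathlib import Path

SKILLS = ["pdf", "pptx", "xlsx", "docx"]


def check_skill_layout(skills_root, names=SKILLS):
    """Map each skill name to a list of missing entries (empty list means OK)."""
    root = Path(skills_root).expanduser()
    problems = {}
    for name in names:
        skill_dir = root / name
        missing = []
        if not (skill_dir / "SKILL.md").is_file():
            missing.append("SKILL.md")
        if not (skill_dir / "scripts").is_dir():
            missing.append("scripts/")
        problems[name] = missing
    return problems


if __name__ == "__main__":
    report = check_skill_layout("~/.openclaw/workspace/skills")
    for name, missing in report.items():
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{name}: {status}")
```

An empty list for every skill means the copy step put the expected files in place.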

Step 3: Verify

openclaw skills list

You should see all four skills with status ✓ ready:

│ ✓ ready │ 📦 pdf  │ ...  │ openclaw-workspace │
│ ✓ ready │ 📦 pptx │ ...  │ openclaw-workspace │
│ ✓ ready │ 📦 xlsx │ ...  │ openclaw-workspace │
│ ✓ ready │ 📦 docx │ ...  │ openclaw-workspace │

Step 4: Restart OpenClaw

openclaw gateway restart

How It Works

These skills are not standalone tools; they are agent knowledge files. When your OpenClaw agent encounters a document task, it:

  1. Reads the relevant SKILL.md for best practices and tool selection
  2. Executes Python/Node.js commands via exec using the installed libraries
  3. Follows the quality assurance steps defined in the skill

For example, when asked to “extract tables from a PDF”:

Agent reads: skills/pdf/SKILL.md
Agent runs:  python3 -c "import pdfplumber; ..."
Agent returns: extracted table data
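
To make the elided `python3 -c` step more concrete, here is a sketch of what such a table extraction might look like. It is an illustration only, not the code the PDF skill prescribes: `table_to_records` is a hypothetical helper, and treating the first row of each table as the header is an assumption that will not hold for every PDF.

```python
def table_to_records(table):
    """Turn one extracted table (a list of rows) into a list of dicts.

    Assumes the first row is the header; empty cells stay as None.
    """
    if not table:
        return []
    header, *rows = table
    return [dict(zip(header, row)) for row in rows]


def extract_tables(pdf_path):
    """Return every table found in the PDF, each as a list of records."""
    import pdfplumber  # imported lazily so the helper above works without it

    results = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            # extract_tables() yields one list of rows per detected table
            for table in page.extract_tables():
                results.append(table_to_records(table))
    return results


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        for records in extract_tables(sys.argv[1]):
            print(records)
```

Saved under a name of your choice, the script takes the PDF path as its only argument and prints one list of records per detected table.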

Updating Skills

To pull the latest from Anthropic:

cd ~/.openclaw/workspace
git clone https://github.com/anthropics/skills.git anthropic-skills
cp -r anthropic-skills/skills/pdf/* skills/pdf/
cp -r anthropic-skills/skills/pptx/* skills/pptx/
cp -r anthropic-skills/skills/xlsx/* skills/xlsx/
cp -r anthropic-skills/skills/docx/* skills/docx/
rm -rf anthropic-skills
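
One caveat with the `cp -r` approach: it overwrites existing files but leaves behind any file that was deleted upstream. If that matters, a possible alternative is to replace each skill directory wholesale. The sketch below assumes it runs after the `git clone` step and before the `rm -rf` cleanup; `replace_skill` is a hypothetical helper, and it discards any local edits inside the skill directories.

```python
import shutil
from pathlib import Path


def replace_skill(src, dst):
    """Replace the directory `dst` with a fresh copy of `src`."""
    src, dst = Path(src), Path(dst)
    if not src.is_dir():
        raise FileNotFoundError(f"source skill directory not found: {src}")
    if dst.exists():
        shutil.rmtree(dst)  # drop stale files that no longer exist upstream
    shutil.copytree(src, dst)


if __name__ == "__main__":
    workspace = Path("~/.openclaw/workspace").expanduser()
    clone = workspace / "anthropic-skills"  # assumes the clone step has run
    if clone.is_dir():
        for name in ["pdf", "pptx", "xlsx", "docx"]:
            replace_skill(clone / "skills" / name, workspace / "skills" / name)
```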
