PDF Reader

Adds PDF reading to the container agent. The skill extracts text from attachments, URLs, and local files using pdftotext.

What it does

  • Extracts text from PDFs using pdftotext (poppler-utils)
  • Auto-downloads PDFs sent as WhatsApp attachments
  • Fetches and reads PDFs from URLs
  • Reads local PDF files from mounted directories
  • Includes page count and metadata via pdfinfo

What you'll need

  • NanoClaw installed and running
  • WhatsApp channel configured (for attachment support)

Install

/add-pdf-reader

How it works

The /add-pdf-reader skill gives the container agent the ability to read PDF documents. It installs poppler-utils (which provides pdftotext and pdfinfo) inside the container and adds a pdf-reader CLI tool that the agent can call.

The skill handles three scenarios:

WhatsApp attachments. When someone sends a PDF in a registered chat, NanoClaw detects the document attachment, downloads it to the group’s attachments/ directory, and makes it available to the agent. The agent can then extract and discuss the contents.

URLs. The agent can fetch a PDF from any URL using pdf-reader fetch <url>. The file is downloaded to the container’s working directory and extracted in one step.

Local files. If a PDF exists in a mounted directory, the agent reads it directly with pdf-reader. This is useful for processing documents that are already on your machine.
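For the URL scenario above, the download step can be pictured roughly as follows. This is a minimal sketch in Python, not the actual pdf-reader implementation, which this page does not show. The function names filename_from_url and fetch_pdf, the download.pdf fallback name, and the 30-second timeout are all assumptions made for illustration.

```python
from __future__ import annotations

import urllib.request
from pathlib import Path
from urllib.parse import unquote, urlparse


def filename_from_url(url: str) -> str:
    """Pick a local file name for a PDF URL (hypothetical helper)."""
    # Use the last path component; fall back to "download" when there is none.
    name = Path(unquote(urlparse(url).path)).name or "download"
    return name if name.lower().endswith(".pdf") else f"{name}.pdf"


def fetch_pdf(url: str, workdir: str = ".") -> Path:
    """Download url into workdir and check that the payload looks like a PDF."""
    target = Path(workdir) / filename_from_url(url)
    with urllib.request.urlopen(url, timeout=30) as response:
        data = response.read()
    # A well-formed PDF begins with "%PDF-"; this rejects HTML error pages.
    if not data.startswith(b"%PDF-"):
        raise ValueError(f"{url} did not return a PDF")
    target.write_bytes(data)
    return target
```

Checking the header before writing the file means a login page or a 404 body saved under a .pdf name never reaches the extraction step.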

What the agent extracts

The pdftotext tool extracts text content from PDFs while preserving basic layout. It handles multi-column layouts, headers, footers, and embedded fonts. The pdfinfo companion tool provides metadata: page count, title, author, creation date, and file size.
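To make the two tools concrete, here is a small Python sketch that calls pdftotext and pdfinfo directly and parses the metadata. It is an illustration under stated assumptions rather than the code behind pdf-reader: the function names read_pdf and parse_pdfinfo are hypothetical, and the use of the -layout flag is a guess at how layout might be preserved.

```python
from __future__ import annotations

import shutil
import subprocess


def parse_pdfinfo(output: str) -> dict[str, str]:
    """Turn pdfinfo's "Key: value" lines into a dictionary."""
    info: dict[str, str] = {}
    for line in output.splitlines():
        # Split on the first colon only, so dates such as 10:00:00 stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def read_pdf(path: str) -> tuple[str, dict[str, str]]:
    """Return (text, metadata) for a text-based PDF (hypothetical helper)."""
    if shutil.which("pdftotext") is None or shutil.which("pdfinfo") is None:
        raise RuntimeError("poppler-utils is not installed")
    # "-layout" keeps the physical page layout; "-" sends the text to stdout.
    text = subprocess.run(
        ["pdftotext", "-layout", path, "-"],
        check=True, capture_output=True, text=True,
    ).stdout
    meta = subprocess.run(
        ["pdfinfo", path], check=True, capture_output=True, text=True,
    ).stdout
    return text, parse_pdfinfo(meta)
```

With this sketch, the page count would be available as the "Pages" entry of the returned metadata dictionary.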

The agent typically reads the full text and then responds to what you asked for: a summary, answers to questions about the document, or specific pieces of information.

Limitations

The tool only works with text-based PDFs. Scanned documents (where pages are images) produce empty or garbled output from pdftotext. For scanned PDFs, the agent can use the browser tool to open the PDF visually instead, though this is less reliable for long documents.
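One way to spot a scanned PDF before the agent tries to reason over it is to check how much readable text came back. The heuristic below is only a sketch: the function name looks_scanned and both thresholds are assumptions chosen for illustration, and nothing on this page says the skill performs such a check.

```python
def looks_scanned(
    text: str,
    page_count: int,
    min_chars_per_page: int = 20,
    min_readable_ratio: float = 0.6,
) -> bool:
    """Guess whether extracted text came from an image-only PDF."""
    # Drop whitespace, including the form feeds pdftotext puts between pages.
    visible = [ch for ch in text if not ch.isspace()]
    if len(visible) < min_chars_per_page * max(page_count, 1):
        return True  # empty or nearly empty output
    # Mostly punctuation and symbols suggests garbled extraction.
    readable = sum(ch.isalnum() for ch in visible)
    return readable / len(visible) < min_readable_ratio
```

A result of True would be the cue to fall back to the browser tool rather than trust the extracted text.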

Password-protected PDFs that require a password to open are not supported. PDFs with copy-protection (no-extract flag) are handled by pdftotext but may have degraded output.
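pdfinfo reports an Encrypted field that can hint at the copy-protection case. The sketch below interprets that field; the sample format ("yes (print:yes copy:no ...)") reflects how recent poppler versions print it as far as I know, so treat it as an assumption and check it against your own pdfinfo output. The function name extraction_allowed is hypothetical.

```python
def extraction_allowed(encrypted_field: str) -> bool:
    """Interpret the value of pdfinfo's Encrypted field.

    Assumed examples: "no", or
    "yes (print:yes copy:no change:no addNotes:no algorithm:AES)".
    """
    value = encrypted_field.strip().lower()
    if value.startswith("no"):
        return True  # not encrypted, nothing restricts extraction
    # Encrypted: extraction is unrestricted only if the copy permission is set.
    return "copy:yes" in value
```

A False result does not mean extraction will fail, only that the output may be degraded as described above.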

Troubleshooting

Agent says pdf-reader command not found. The container needs rebuilding. Run ./container/build.sh and restart the service.

PDF text extraction is empty. The PDF is likely scanned (image-based). pdftotext only handles text-based PDFs. Try asking the agent to open it with the browser tool instead.

WhatsApp PDF not detected. The message must have documentMessage with mimetype: application/pdf. Some apps send PDFs as generic file attachments without the correct MIME type, which the channel adapter won’t recognize as a PDF.
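The detection rule can be sketched as a small predicate. The message is modeled here as a plain dictionary with a documentMessage entry, which is an assumption about the payload shape; the real channel adapter is not shown on this page and may differ. The function name is_pdf_attachment is hypothetical.

```python
from __future__ import annotations

from typing import Any


def is_pdf_attachment(message: dict[str, Any]) -> bool:
    """Return True when the message carries a document typed as a PDF."""
    document = message.get("documentMessage") or {}
    mimetype = str(document.get("mimetype", ""))
    # Ignore letter case and any parameters after ";".
    return mimetype.split(";")[0].strip().lower() == "application/pdf"
```

Under this rule, a file named report.pdf that arrives with mimetype application/octet-stream is skipped, which matches the symptom described above.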

Tips

  • The agent knows to use pdf-reader automatically when it encounters a PDF. You don’t need to tell it which tool to use.
  • For very long PDFs, the agent extracts all text but may summarize rather than quoting the entire document. Ask specific questions to get precise answers from the content.
  • The pdf-reader tool runs inside the container, so it can only access files in mounted directories or files the agent downloads itself.