{"id":"894ede81-61cc-4861-9aef-5024ba848cae","shortId":"kQHa9P","kind":"skill","title":"pdf","tagline":"Comprehensive PDF manipulation toolkit for extracting text and tables, creating new PDFs, merging/splitting documents, and handling forms. When Claude needs to fill in a PDF form or programmatically process, generate, or analyze PDF documents at scale.","description":"# PDF Processing Guide\n\n## Overview\n\nThis guide covers essential PDF processing operations using Python libraries and command-line tools. For advanced features, JavaScript libraries, and detailed examples, see reference.md. If you need to fill out a PDF form, read forms.md and follow its instructions.\n\n## Quick Start\n\n```python\nfrom pypdf import PdfReader, PdfWriter\n\n# Read a PDF\nreader = PdfReader(\"document.pdf\")\nprint(f\"Pages: {len(reader.pages)}\")\n\n# Extract text\ntext = \"\"\nfor page in reader.pages:\n    text += page.extract_text()\n```\n\n## Python Libraries\n\n### pypdf - Basic Operations\n\n#### Merge PDFs\n```python\nfrom pypdf import PdfWriter, PdfReader\n\nwriter = PdfWriter()\nfor pdf_file in [\"doc1.pdf\", \"doc2.pdf\", \"doc3.pdf\"]:\n    reader = PdfReader(pdf_file)\n    for page in reader.pages:\n        writer.add_page(page)\n\nwith open(\"merged.pdf\", \"wb\") as output:\n    writer.write(output)\n```\n\n#### Split PDF\n```python\nreader = PdfReader(\"input.pdf\")\nfor i, page in enumerate(reader.pages):\n    writer = PdfWriter()\n    writer.add_page(page)\n    with open(f\"page_{i+1}.pdf\", \"wb\") as output:\n        writer.write(output)\n```\n\n#### Extract Metadata\n```python\nreader = PdfReader(\"document.pdf\")\nmeta = reader.metadata\nprint(f\"Title: {meta.title}\")\nprint(f\"Author: {meta.author}\")\nprint(f\"Subject: {meta.subject}\")\nprint(f\"Creator: {meta.creator}\")\n```\n\n#### Rotate Pages\n```python\nreader = PdfReader(\"input.pdf\")\nwriter = PdfWriter()\n\npage = reader.pages[0]\npage.rotate(90)  # Rotate 90 degrees clockwise\nwriter.add_page(page)\n\nwith open(\"rotated.pdf\", \"wb\") as output:\n    writer.write(output)\n```\n\n### pdfplumber - Text and Table Extraction\n\n#### Extract Text with Layout\n```python\nimport pdfplumber\n\nwith pdfplumber.open(\"document.pdf\") as pdf:\n    for page in pdf.pages:\n        text = page.extract_text()\n        print(text)\n```\n\n#### Extract Tables\n```python\nwith pdfplumber.open(\"document.pdf\") as pdf:\n    for i, page in enumerate(pdf.pages):\n        tables = page.extract_tables()\n        for j, table in enumerate(tables):\n            print(f\"Table {j+1} on page {i+1}:\")\n            for row in table:\n                print(row)\n```\n\n#### Advanced Table Extraction\n```python\nimport pandas as pd\n\nwith pdfplumber.open(\"document.pdf\") as pdf:\n    all_tables = []\n    for page in pdf.pages:\n        tables = page.extract_tables()\n        for table in tables:\n            if table:  # Check if table is not empty\n                df = pd.DataFrame(table[1:], columns=table[0])\n                all_tables.append(df)\n\n# Combine all tables\nif all_tables:\n    combined_df = pd.concat(all_tables, ignore_index=True)\n    combined_df.to_excel(\"extracted_tables.xlsx\", index=False)\n```\n\n### reportlab - Create PDFs\n\n#### Basic PDF Creation\n```python\nfrom reportlab.lib.pagesizes import letter\nfrom reportlab.pdfgen import canvas\n\nc = canvas.Canvas(\"hello.pdf\", pagesize=letter)\nwidth, height = letter\n\n# Add text\nc.drawString(100, height - 100, \"Hello World!\")\nc.drawString(100, height - 120, \"This is a PDF created with reportlab\")\n\n# Add a line\nc.line(100, height - 140, 400, height - 140)\n\n# Save\nc.save()\n```\n\n#### Create PDF with Multiple Pages\n```python\nfrom reportlab.lib.pagesizes import letter\nfrom reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, PageBreak\nfrom reportlab.lib.styles import getSampleStyleSheet\n\ndoc = SimpleDocTemplate(\"report.pdf\", pagesize=letter)\nstyles = getSampleStyleSheet()\nstory = []\n\n# Add content\ntitle = Paragraph(\"Report Title\", styles['Title'])\nstory.append(title)\nstory.append(Spacer(1, 12))\n\nbody = Paragraph(\"This is the body of the report. \" * 20, styles['Normal'])\nstory.append(body)\nstory.append(PageBreak())\n\n# Page 2\nstory.append(Paragraph(\"Page 2\", styles['Heading1']))\nstory.append(Paragraph(\"Content for page 2\", styles['Normal']))\n\n# Build PDF\ndoc.build(story)\n```\n\n## Command-Line Tools\n\n### pdftotext (poppler-utils)\n```bash\n# Extract text\npdftotext input.pdf output.txt\n\n# Extract text preserving layout\npdftotext -layout input.pdf output.txt\n\n# Extract specific pages\npdftotext -f 1 -l 5 input.pdf output.txt  # Pages 1-5\n```\n\n### qpdf\n```bash\n# Merge PDFs\nqpdf --empty --pages file1.pdf file2.pdf -- merged.pdf\n\n# Split pages\nqpdf input.pdf --pages . 1-5 -- pages1-5.pdf\nqpdf input.pdf --pages . 6-10 -- pages6-10.pdf\n\n# Rotate pages\nqpdf input.pdf output.pdf --rotate=+90:1  # Rotate page 1 by 90 degrees\n\n# Remove password\nqpdf --password=mypassword --decrypt encrypted.pdf decrypted.pdf\n```\n\n### pdftk (if available)\n```bash\n# Merge\npdftk file1.pdf file2.pdf cat output merged.pdf\n\n# Split\npdftk input.pdf burst\n\n# Rotate\npdftk input.pdf rotate 1east output rotated.pdf\n```\n\n## Common Tasks\n\n### Extract Text from Scanned PDFs\n```python\n# Requires: pip install pytesseract pdf2image\nimport pytesseract\nfrom pdf2image import convert_from_path\n\n# Convert PDF to images\nimages = convert_from_path('scanned.pdf')\n\n# OCR each page\ntext = \"\"\nfor i, image in enumerate(images):\n    text += f\"Page {i+1}:\\n\"\n    text += pytesseract.image_to_string(image)\n    text += \"\\n\\n\"\n\nprint(text)\n```\n\n### Add Watermark\n```python\nfrom pypdf import PdfReader, PdfWriter\n\n# Create watermark (or load existing)\nwatermark = PdfReader(\"watermark.pdf\").pages[0]\n\n# Apply to all pages\nreader = PdfReader(\"document.pdf\")\nwriter = PdfWriter()\n\nfor page in reader.pages:\n    page.merge_page(watermark)\n    writer.add_page(page)\n\nwith open(\"watermarked.pdf\", \"wb\") as output:\n    writer.write(output)\n```\n\n### Extract Images\n```bash\n# Using pdfimages (poppler-utils)\npdfimages -j input.pdf output_prefix\n\n# This extracts all images as output_prefix-000.jpg, output_prefix-001.jpg, etc.\n```\n\n### Password Protection\n```python\nfrom pypdf import PdfReader, PdfWriter\n\nreader = PdfReader(\"input.pdf\")\nwriter = PdfWriter()\n\nfor page in reader.pages:\n    writer.add_page(page)\n\n# Add password\nwriter.encrypt(\"userpassword\", \"ownerpassword\")\n\nwith open(\"encrypted.pdf\", \"wb\") as output:\n    writer.write(output)\n```\n\n## Quick Reference\n\n| Task | Best Tool | Command/Code |\n|------|-----------|--------------|\n| Merge PDFs | pypdf | `writer.add_page(page)` |\n| Split PDFs | pypdf | One page per file |\n| Extract text | pdfplumber | `page.extract_text()` |\n| Extract tables | pdfplumber | `page.extract_tables()` |\n| Create PDFs | reportlab | Canvas or Platypus |\n| Command line merge | qpdf | `qpdf --empty --pages ...` |\n| OCR scanned PDFs | pytesseract | Convert to image first |\n| Fill PDF forms | pdf-lib or pypdf (see forms.md) | See forms.md |\n\n## Next Steps\n\n- For advanced pypdfium2 usage, see reference.md\n- For JavaScript libraries (pdf-lib), see reference.md\n- If you need to fill out a PDF form, follow the instructions in forms.md\n- For troubleshooting guides, see reference.md","tags":["pdf","coco","rkz91","agent-skills","agents-md","ai-agents","claude-code","codex","cursor","developer-tools","llm-tools","mcp"],"capabilities":["skill","source-rkz91","skill-pdf","topic-agent-skills","topic-agents-md","topic-ai-agents","topic-claude-code","topic-codex","topic-cursor","topic-developer-tools","topic-llm-tools","topic-mcp","topic-pm-tools","topic-product-management","topic-productivity"],"categories":["coco"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/rkz91/coco/pdf","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add rkz91/coco","source_repo":"https://github.com/rkz91/coco","install_from":"skills.sh"}},"qualityScore":"0.453","qualityRationale":"deterministic score 0.45 from registry signals: · indexed on github topic:agent-skills · 7 github stars · SKILL.md body (6,729 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-18T19:14:08.223Z","embedding":null,"createdAt":"2026-05-18T13:21:41.049Z","updatedAt":"2026-05-18T19:14:08.223Z","lastSeenAt":"2026-05-18T19:14:08.223Z","tsv":"'+1':174,286,290,639 '+90':557 '-10':549 '-5':526,543 '0':215,337,668 '1':334,454,519,525,542,558,561 '100':385,387,391,405 '12':455 '120':393 '140':407,410 '1east':592 '2':473,477,485 '20':465 '400':408 '5':521 '6':548 '90':217,219,563 'add':382,401,442,651,737 'advanc':58,297,815 'all_tables.append':338 'analyz':33 'appli':669 'author':195 'avail':575 'bash':500,528,576,698 'basic':114,362 'best':753 'bodi':456,461,469 'build':488 'burst':587 'c':374 'c.drawstring':384,390 'c.line':404 'c.save':412 'canva':373,782 'canvas.canvas':375 'cat':581 'check':325 'claud':20 'clockwis':221 'column':335 'combin':340,346 'combined_df.to':354 'command':54,493,785 'command-lin':53,492 'command/code':755 'common':595 'comprehens':2 'content':443,482 'convert':613,616,621,796 'cover':44 'creat':11,360,398,413,659,779 'creation':364 'creator':203 'decrypt':570 'decrypted.pdf':572 'degre':220,564 'detail':63 'df':331,339,347 'doc':434 'doc.build':490 'doc1.pdf':130 'doc2.pdf':131 'doc3.pdf':132 'document':15,35 'document.pdf':95,186,247,264,307,675 'empti':330,532,790 'encrypted.pdf':571,744 'enumer':162,271,280,633 'essenti':45 'etc':716 'exampl':64 'excel':355 'exist':663 'extract':7,101,181,237,238,259,299,501,506,514,597,696,710,769,774 'extracted_tables.xlsx':356 'f':97,171,190,194,198,202,283,518,636 'fals':358 'featur':59 'file':128,136,768 'file1.pdf':534,579 'file2.pdf':535,580 'fill':23,71,800,832 'first':799 'follow':79,837 'form':18,27,75,802,836 'forms.md':77,809,811,841 'generat':31 'getsamplestylesheet':433,440 'guid':40,43,844 'handl':17 'heading1':479 'height':380,386,392,406,409 'hello':388 'hello.pdf':376 'ignor':351 'imag':619,620,631,634,645,697,712,798 'import':87,121,243,301,368,372,421,425,432,608,612,656,722 'index':352,357 'input.pdf':157,210,504,512,522,540,546,554,586,590,706,727 'instal':605 'instruct':81,839 'j':277,285,705 'javascript':60,821 'l':520 'layout':241,509,511 'len':99 'letter':369,378,381,422,438 'lib':805,825 'librari':51,61,112,822 'line':55,403,494,786 'load':662 'manipul':4 'merg':116,529,577,756,787 'merged.pdf':146,536,583 'merging/splitting':14 'meta':187 'meta.author':196 'meta.creator':204 'meta.subject':200 'meta.title':192 'metadata':182 'multipl':416 'mypassword':569 'n':640,647,648 'need':21,69,830 'new':12 'next':812 'normal':467,487 'ocr':625,792 'one':765 'open':145,170,226,689,743 'oper':48,115 'output':149,151,178,180,230,232,582,593,693,695,707,747,749 'output.pdf':555 'output.txt':505,513,523 'output_prefix-000.jpg':714 'output_prefix-001.jpg':715 'overview':41 'ownerpassword':741 'page':98,105,138,142,143,160,167,168,172,206,213,223,224,251,269,288,313,417,472,476,484,516,524,533,538,541,547,552,560,627,637,667,672,679,683,686,687,731,735,736,760,761,766,791 'page.extract':109,255,274,317,772,777 'page.merge':682 'page.rotate':216 'pagebreak':429,471 'pages':377,437 'pages1-5.pdf':544 'pages6-10.pdf':550 'panda':302 'paragraph':427,445,457,475,481 'password':566,568,717,738 'path':615,623 'pd':304 'pd.concat':348 'pd.dataframe':332 'pdf':1,3,26,34,38,46,74,92,127,135,153,175,249,266,309,363,397,414,489,617,801,804,824,835 'pdf-lib':803,823 'pdf.pages':253,272,315 'pdf2image':607,611 'pdfimag':700,704 'pdfplumber':233,244,771,776 'pdfplumber.open':246,263,306 'pdfreader':88,94,123,134,156,185,209,657,665,674,723,726 'pdfs':13,117,361,530,601,757,763,780,794 'pdftk':573,578,585,589 'pdftotext':496,503,510,517 'pdfwriter':89,122,125,165,212,658,677,724,729 'per':767 'pip':604 'platypus':784 'poppler':498,702 'poppler-util':497,701 'prefix':708 'preserv':508 'print':96,189,193,197,201,257,282,295,649 'process':30,39,47 'programmat':29 'protect':718 'pypdf':86,113,120,655,721,758,764,807 'pypdfium2':816 'pytesseract':606,609,795 'pytesseract.image':642 'python':50,84,111,118,154,183,207,242,261,300,365,418,602,653,719 'qpdf':527,531,539,545,553,567,788,789 'quick':82,750 'read':76,90 'reader':93,133,155,184,208,673,725 'reader.metadata':188 'reader.pages':100,107,140,163,214,681,733 'refer':751 'reference.md':66,819,827,846 'remov':565 'report':446,464 'report.pdf':436 'reportlab':359,400,781 'reportlab.lib.pagesizes':367,420 'reportlab.lib.styles':431 'reportlab.pdfgen':371 'reportlab.platypus':424 'requir':603 'rotat':205,218,551,556,559,588,591 'rotated.pdf':227,594 'row':292,296 'save':411 'scale':37 'scan':600,793 'scanned.pdf':624 'see':65,808,810,818,826,845 'simpledoctempl':426,435 'skill' 'skill-pdf' 'source-rkz91' 'spacer':428,453 'specif':515 'split':152,537,584,762 'start':83 'step':813 'stori':441,491 'story.append':450,452,468,470,474,480 'string':644 'style':439,448,466,478,486 'subject':199 'tabl':10,236,260,273,275,278,281,284,294,298,311,316,318,320,322,324,327,333,336,342,345,350,775,778 'task':596,752 'text':8,102,103,108,110,234,239,254,256,258,383,502,507,598,628,635,641,646,650,770,773 'titl':191,444,447,449,451 'tool':56,495,754 'toolkit':5 'topic-agent-skills' 'topic-agents-md' 'topic-ai-agents' 'topic-claude-code' 'topic-codex' 'topic-cursor' 'topic-developer-tools' 'topic-llm-tools' 'topic-mcp' 'topic-pm-tools' 'topic-product-management' 'topic-productivity' 'troubleshoot':843 'true':353 'usag':817 'use':49,699 'userpassword':740 'util':499,703 'watermark':652,660,664,684 'watermark.pdf':666 'watermarked.pdf':690 'wb':147,176,228,691,745 'width':379 'world':389 'writer':124,164,211,676,728 'writer.add':141,166,222,685,734,759 'writer.encrypt':739 'writer.write':150,179,231,694,748","prices":[{"id":"98a787be-e54a-44b9-ad4c-cb4d557dff25","listingId":"894ede81-61cc-4861-9aef-5024ba848cae","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"rkz91","category":"coco","install_from":"skills.sh"},"createdAt":"2026-05-18T13:21:41.049Z"}],"sources":[{"listingId":"894ede81-61cc-4861-9aef-5024ba848cae","source":"github","sourceId":"rkz91/coco/pdf","sourceUrl":"https://github.com/rkz91/coco/tree/main/skills/pdf","isPrimary":false,"firstSeenAt":"2026-05-18T13:21:41.049Z","lastSeenAt":"2026-05-18T19:14:08.223Z"}],"details":{"listingId":"894ede81-61cc-4861-9aef-5024ba848cae","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"rkz91","slug":"pdf","github":{"repo":"rkz91/coco","stars":7,"topics":["agent-skills","agents-md","ai","ai-agents","claude-code","codex","cursor","developer-tools","llm-tools","mcp","pm-tools","product-management","productivity","prompt-engineering","workflow-automation"],"license":"mit","html_url":"https://github.com/rkz91/coco","pushed_at":"2026-04-26T01:51:27Z","description":"Open-source library of AI superpowers — 59 skills, 34 commands, 10 agents + 24 GSD subagents, 3 system bundles. An entire team, wherever your AI lives. Vendor-neutral across Claude Code, Cursor, Codex, and any AGENTS.md tool.","skill_md_sha":"f6a22ddf88fdc7e7b7603f4c9064cc51bd930ad9","skill_md_path":"skills/pdf/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/rkz91/coco/tree/main/skills/pdf"},"layout":"multi","source":"github","category":"coco","frontmatter":{"name":"pdf","license":"Proprietary. LICENSE.txt has complete terms","description":"Comprehensive PDF manipulation toolkit for extracting text and tables, creating new PDFs, merging/splitting documents, and handling forms. When Claude needs to fill in a PDF form or programmatically process, generate, or analyze PDF documents at scale."},"skills_sh_url":"https://skills.sh/rkz91/coco/pdf"},"updatedAt":"2026-05-18T19:14:08.223Z"}}