{"id":"7e71c001-fc27-4289-952e-046bf1415afc","shortId":"fAYcCY","kind":"skill","title":"docx","tagline":"Comprehensive document creation, editing, and analysis with support for tracked changes, comments, formatting preservation, and text extraction. When Claude needs to work with professional documents (.docx files) for: (1) Creating new documents, (2) Modifying or editing content, ","description":"# DOCX creation, editing, and analysis\n\n## Overview\n\nA user may ask you to create, edit, or analyze the contents of a .docx file. A .docx file is essentially a ZIP archive containing XML files and other resources that you can read or edit. You have different tools and workflows available for different tasks.\n\n## Workflow Decision Tree\n\n### Reading/Analyzing Content\nUse \"Text extraction\" or \"Raw XML access\" sections below\n\n### Creating New Document\nUse \"Creating a new Word document\" workflow\n\n### Editing Existing Document\n- **Your own document + simple changes**\n  Use \"Basic OOXML editing\" workflow\n\n- **Someone else's document**\n  Use **\"Redlining workflow\"** (recommended default)\n\n- **Legal, academic, business, or government docs**\n  Use **\"Redlining workflow\"** (required)\n\n## Reading and analyzing content\n\n### Text extraction\nIf you just need to read the text contents of a document, you should convert the document to markdown using pandoc. Pandoc provides excellent support for preserving document structure and can show tracked changes:\n\n```bash\n# Convert document to markdown with tracked changes\npandoc --track-changes=all path-to-file.docx -o output.md\n# Options: --track-changes=accept/reject/all\n```\n\n### Raw XML access\nYou need raw XML access for: comments, complex formatting, document structure, embedded media, and metadata. For any of these features, you'll need to unpack a document and read its raw XML contents.\n\n#### Unpacking a file\n`python ooxml/scripts/unpack.py <office_file> <output_directory>`\n\n#### Key file structures\n* `word/document.xml` - Main document contents\n* `word/comments.xml` - Comments referenced in document.xml\n* `word/media/` - Embedded images and media files\n* Tracked changes use `<w:ins>` (insertions) and `<w:del>` (deletions) tags\n\n## Creating a new Word document\n\nWhen creating a new Word document from scratch, use **docx-js**, which allows you to create Word documents using JavaScript/TypeScript.\n\n### Workflow\n1. **MANDATORY - READ ENTIRE FILE**: Read [`docx-js.md`](docx-js.md) (~500 lines) completely from start to finish. **NEVER set any range limits when reading this file.** Read the full file content for detailed syntax, critical formatting rules, and best practices before proceeding with document creation.\n2. Create a JavaScript/TypeScript file using Document, Paragraph, TextRun components (You can assume all dependencies are installed, but if not, refer to the dependencies section below)\n3. Export as .docx using Packer.toBuffer()\n\n## Editing an existing Word document\n\nWhen editing an existing Word document, use the **Document library** (a Python library for OOXML manipulation). The library automatically handles infrastructure setup and provides methods for document manipulation. For complex scenarios, you can access the underlying DOM directly through the library.\n\n### Workflow\n1. **MANDATORY - READ ENTIRE FILE**: Read [`ooxml.md`](ooxml.md) (~600 lines) completely from start to finish. **NEVER set any range limits when reading this file.** Read the full file content for the Document library API and XML patterns for directly editing document files.\n2. Unpack the document: `python ooxml/scripts/unpack.py <office_file> <output_directory>`\n3. Create and run a Python script using the Document library (see \"Document Library\" section in ooxml.md)\n4. Pack the final document: `python ooxml/scripts/pack.py <input_directory> <office_file>`\n\nThe Document library provides both high-level methods for common operations and direct DOM access for complex scenarios.\n\n## Redlining workflow for document review\n\nThis workflow allows you to plan comprehensive tracked changes using markdown before implementing them in OOXML. **CRITICAL**: For complete tracked changes, you must implement ALL changes systematically.\n\n**Batching Strategy**: Group related changes into batches of 3-10 changes. This makes debugging manageable while maintaining efficiency. Test each batch before moving to the next.\n\n**Principle: Minimal, Precise Edits**\nWhen implementing tracked changes, only mark text that actually changes. Repeating unchanged text makes edits harder to review and appears unprofessional. Break replacements into: [unchanged text] + [deletion] + [insertion] + [unchanged text]. Preserve the original run's RSID for unchanged text by extracting the `<w:r>` element from the original and reusing it.\n\nExample - Changing \"30 days\" to \"60 days\" in a sentence:\n```python\n# BAD - Replaces entire sentence\n'<w:del><w:r><w:delText>The term is 30 days.</w:delText></w:r></w:del><w:ins><w:r><w:t>The term is 60 days.</w:t></w:r></w:ins>'\n\n# GOOD - Only marks what changed, preserves original <w:r> for unchanged text\n'<w:r w:rsidR=\"00AB12CD\"><w:t>The term is </w:t></w:r><w:del><w:r><w:delText>30</w:delText></w:r></w:del><w:ins><w:r><w:t>60</w:t></w:r></w:ins><w:r w:rsidR=\"00AB12CD\"><w:t> days.</w:t></w:r>'\n```\n\n### Tracked changes workflow\n\n1. **Get markdown representation**: Convert document to markdown with tracked changes preserved:\n   ```bash\n   pandoc --track-changes=all path-to-file.docx -o current.md\n   ```\n\n2. **Identify and group changes**: Review the document and identify ALL changes needed, organizing them into logical batches:\n\n   **Location methods** (for finding changes in XML):\n   - Section/heading numbers (e.g., \"Section 3.2\", \"Article IV\")\n   - Paragraph identifiers if numbered\n   - Grep patterns with unique surrounding text\n   - Document structure (e.g., \"first paragraph\", \"signature block\")\n   - **DO NOT use markdown line numbers** - they don't map to XML structure\n\n   **Batch organization** (group 3-10 related changes per batch):\n   - By section: \"Batch 1: Section 2 amendments\", \"Batch 2: Section 5 updates\"\n   - By type: \"Batch 1: Date corrections\", \"Batch 2: Party name changes\"\n   - By complexity: Start with simple text replacements, then tackle complex structural changes\n   - Sequential: \"Batch 1: Pages 1-3\", \"Batch 2: Pages 4-6\"\n\n3. **Read documentation and unpack**:\n   - **MANDATORY - READ ENTIRE FILE**: Read [`ooxml.md`](ooxml.md) (~600 lines) completely from start to finish. **NEVER set any range limits when reading this file.** Pay special attention to the \"Document Library\" and \"Tracked Change Patterns\" sections.\n   - **Unpack the document**: `python ooxml/scripts/unpack.py <file.docx> <dir>`\n   - **Note the suggested RSID**: The unpack script will suggest an RSID to use for your tracked changes. Copy this RSID for use in step 4b.\n\n4. **Implement changes in batches**: Group changes logically (by section, by type, or by proximity) and implement them together in a single script. This approach:\n   - Makes debugging easier (smaller batch = easier to isolate errors)\n   - Allows incremental progress\n   - Maintains efficiency (batch size of 3-10 changes works well)\n\n   **Suggested batch groupings:**\n   - By document section (e.g., \"Section 3 changes\", \"Definitions\", \"Termination clause\")\n   - By change type (e.g., \"Date changes\", \"Party name updates\", \"Legal term replacements\")\n   - By proximity (e.g., \"Changes on pages 1-3\", \"Changes in first half of document\")\n\n   For each batch of related changes:\n\n   **a. Map text to XML**: Grep for text in `word/document.xml` to verify how text is split across `<w:r>` elements.\n\n   **b. Create and run script**: Use `get_node` to find nodes, implement changes, then `doc.save()`. See **\"Document Library\"** section in ooxml.md for patterns.\n\n   **Note**: Always grep `word/document.xml` immediately before writing a script to get current line numbers and verify text content. Line numbers change after each script run.\n\n5. **Pack the document**: After all batches are complete, convert the unpacked directory back to .docx:\n   ```bash\n   python ooxml/scripts/pack.py unpacked reviewed-document.docx\n   ```\n\n6. **Final verification**: Do a comprehensive check of the complete document:\n   - Convert final document to markdown:\n     ```bash\n     pandoc --track-changes=all reviewed-document.docx -o verification.md\n     ```\n   - Verify ALL changes were applied correctly:\n     ```bash\n     grep \"original phrase\" verification.md  # Should NOT find it\n     grep \"replacement phrase\" verification.md  # Should find it\n     ```\n   - Check that no unintended changes were introduced\n\n\n## Converting Documents to Images\n\nTo visually analyze Word documents, convert them to images using a two-step process:\n\n1. **Convert DOCX to PDF**:\n   ```bash\n   soffice --headless --convert-to pdf document.docx\n   ```\n\n2. **Convert PDF pages to JPEG images**:\n   ```bash\n   pdftoppm -jpeg -r 150 document.pdf page\n   ```\n   This creates files like `page-1.jpg`, `page-2.jpg`, etc.\n\nOptions:\n- `-r 150`: Sets resolution to 150 DPI (adjust for quality/size balance)\n- `-jpeg`: Output JPEG format (use `-png` for PNG if preferred)\n- `-f N`: First page to convert (e.g., `-f 2` starts from page 2)\n- `-l N`: Last page to convert (e.g., `-l 5` stops at page 5)\n- `page`: Prefix for output files\n\nExample for specific range:\n```bash\npdftoppm -jpeg -r 150 -f 2 -l 5 document.pdf page  # Converts only pages 2-5\n```\n\n## Code Style Guidelines\n**IMPORTANT**: When generating code for DOCX operations:\n- Write concise code\n- Avoid verbose variable names and redundant operations\n- Avoid unnecessary print statements\n\n## Dependencies\n\nRequired dependencies (install if not available):\n\n- **pandoc**: `sudo apt-get install pandoc` (for text extraction)\n- **docx**: `npm install -g docx` (for creating new documents)\n- **LibreOffice**: `sudo apt-get install libreoffice` (for PDF conversion)\n- **Poppler**: `sudo apt-get install poppler-utils` (for pdftoppm to convert PDF to images)\n- **defusedxml**: `pip install defusedxml` (for secure XML parsing)","tags":["docx","coco","rkz91","agent-skills","agents-md","ai-agents","claude-code","codex","cursor","developer-tools","llm-tools","mcp"],"capabilities":["skill","source-rkz91","skill-docx","topic-agent-skills","topic-agents-md","topic-ai-agents","topic-claude-code","topic-codex","topic-cursor","topic-developer-tools","topic-llm-tools","topic-mcp","topic-pm-tools","topic-product-management","topic-productivity"],"categories":["coco"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/rkz91/coco/docx","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add rkz91/coco","source_repo":"https://github.com/rkz91/coco","install_from":"skills.sh"}},"qualityScore":"0.453","qualityRationale":"deterministic score 0.45 from registry signals: · indexed on github topic:agent-skills · 7 github stars · SKILL.md body (9,699 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-18T19:14:06.704Z","embedding":null,"createdAt":"2026-05-18T13:21:38.817Z","updatedAt":"2026-05-18T19:14:06.704Z","lastSeenAt":"2026-05-18T19:14:06.704Z","tsv":"'-10':555,756,920 '-3':801,956 '-5':1235 '-6':806 '1':30,301,423,669,764,776,798,800,955,1129 '150':1153,1165,1169,1224 '2':34,344,465,690,766,769,780,803,1142,1193,1197,1226,1234 '3':370,471,554,755,807,919,932 '3.2':719 '30':627,643,663 '4':488,805,877 '4b':876 '5':771,1035,1206,1210,1228 '500':309 '6':1056 '60':630,648,664 '600':431,819 'academ':138 'accept/reject/all':207 'access':102,210,215,414,510 'across':985 'actual':584 'adjust':1171 'allow':292,521,911 'alway':1011 'amend':767 'analysi':7,43 'analyz':54,149,1116 'api':456 'appear':595 'appli':1085 'approach':901 'apt':1270,1289,1299 'apt-get':1269,1288,1298 'archiv':68 'articl':720 'ask':48 'assum':356 'attent':837 'automat':399 'avail':87,1266 'avoid':1249,1256 'b':987 'back':1048 'bad':636 'balanc':1174 'bash':187,681,1051,1072,1087,1134,1149,1220 'basic':124 'batch':546,552,566,707,752,760,763,768,775,779,797,802,881,906,916,925,965,1041 'best':337 'block':738 'break':597 'busi':139 'chang':12,122,186,194,198,206,268,527,539,544,550,556,579,585,626,654,667,679,685,694,701,712,758,783,795,844,868,879,883,921,933,938,942,952,957,968,999,1030,1076,1083,1107 'check':1062,1103 'claud':20 'claus':936 'code':1236,1242,1248 'comment':13,217,257 'common':505 'complet':311,433,537,821,1043,1065 'complex':218,410,512,785,793 'compon':353 'comprehens':2,525,1061 'concis':1247 'contain':69 'content':38,56,95,150,161,243,255,329,451,1027 'convers':1295 'convert':167,188,673,1044,1067,1110,1119,1130,1138,1143,1190,1203,1231,1308 'convert-to':1137 'copi':869 'correct':778,1086 'creat':31,51,105,109,274,280,295,345,472,988,1157,1283 'creation':4,40,343 'critic':333,535 'current':1021 'current.md':689 'date':777,941 'day':628,631,644,649,665 'debug':559,903 'decis':92 'default':136 'definit':934 'defusedxml':1312,1315 'delet':272,602 'depend':358,367,1260,1262 'detail':331 'differ':83,89 'direct':418,461,508 'directori':1047 'doc':142 'doc.save':1001 'document':3,26,33,107,113,117,120,131,164,169,180,189,220,237,254,278,284,297,342,350,380,386,389,407,454,463,468,480,483,492,496,517,674,697,732,809,840,849,928,962,1003,1038,1066,1069,1111,1118,1285 'document.docx':1141 'document.pdf':1154,1229 'document.xml':260 'docx':1,27,39,59,62,289,373,1050,1131,1244,1277,1281 'docx-j':288 'docx-js.md':307,308 'dom':417,509 'dpi':1170 'e.g':717,734,930,940,951,1191,1204 'easier':904,907 'edit':5,37,41,52,80,115,126,376,382,462,575,590 'effici':563,915 'element':618,986 'els':129 'embed':222,262 'entir':304,426,638,814 'error':910 'essenti':65 'etc':1162 'exampl':625,1216 'excel':176 'exist':116,378,384 'export':371 'extract':18,98,152,616,1276 'f':1185,1192,1225 'featur':230 'file':28,60,63,71,246,250,266,305,324,328,348,427,446,450,464,815,834,1158,1215 'final':491,1057,1068 'find':711,996,1094,1101 'finish':315,437,825 'first':735,959,1187 'format':14,219,334,1178 'full':327,449 'g':1280 'generat':1241 'get':670,993,1020,1271,1290,1300 'good':650 'govern':141 'grep':726,974,1012,1088,1096 'group':548,693,754,882,926 'guidelin':1238 'half':960 'handl':400 'harder':591 'headless':1136 'high':501 'high-level':500 'identifi':691,699,723 'imag':263,1113,1122,1148,1311 'immedi':1014 'implement':531,542,577,878,893,998 'import':1239 'increment':912 'infrastructur':401 'insert':270,603 'instal':360,1263,1272,1279,1291,1301,1314 'introduc':1109 'isol':909 'iv':721 'javascript/typescript':299,347 'jpeg':1147,1151,1175,1177,1222 'js':290 'key':249 'l':1198,1205,1227 'last':1200 'legal':137,946 'level':502 'librari':390,393,398,421,455,481,484,497,841,1004 'libreoffic':1286,1292 'like':1159 'limit':320,442,830 'line':310,432,743,820,1022,1028 'll':232 'locat':708 'logic':706,884 'main':253 'maintain':562,914 'make':558,589,902 'manag':560 'mandatori':302,424,812 'manipul':396,408 'map':748,970 'mark':581,652 'markdown':171,191,529,671,676,742,1071 'may':47 'media':223,265 'metadata':225 'method':405,503,709 'minim':573 'modifi':35 'move':568 'must':541 'n':1186,1199 'name':782,944,1252 'need':21,156,212,233,702 'never':316,438,826 'new':32,106,111,276,282,1284 'next':571 'node':994,997 'note':852,1010 'npm':1278 'number':716,725,744,1023,1029 'o':201,688,1079 'ooxml':125,395,534 'ooxml.md':429,430,487,817,818,1007 'ooxml/scripts/pack.py':494,1053 'ooxml/scripts/unpack.py':248,470,851 'oper':506,1245,1255 'option':203,1163 'organ':703,753 'origin':608,621,656,1089 'output':1176,1214 'output.md':202 'overview':44 'pack':489,1036 'packer.tobuffer':375 'page':799,804,954,1145,1155,1188,1196,1201,1209,1211,1230,1233 'page-1.jpg':1160 'page-2.jpg':1161 'pandoc':173,174,195,682,1073,1267,1273 'paragraph':351,722,736 'pars':1319 'parti':781,943 'path-to-file.docx':200,687 'pattern':459,727,845,1009 'pay':835 'pdf':1133,1140,1144,1294,1309 'pdftoppm':1150,1221,1306 'per':759 'phrase':1090,1098 'pip':1313 'plan':524 'png':1180,1182 'poppler':1296,1303 'poppler-util':1302 'practic':338 'precis':574 'prefer':1184 'prefix':1212 'preserv':15,179,606,655,680 'principl':572 'print':1258 'proceed':340 'process':1128 'profession':25 'progress':913 'provid':175,404,498 'proxim':891,950 'python':247,392,469,476,493,635,850,1052 'quality/size':1173 'r':1152,1164,1223 'rang':319,441,829,1219 'raw':100,208,213,241 'read':78,147,158,239,303,306,322,325,425,428,444,447,808,813,816,832 'reading/analyzing':94 'recommend':135 'redlin':133,144,514 'redund':1254 'refer':364 'referenc':258 'relat':549,757,967 'repeat':586 'replac':598,637,790,948,1097 'represent':672 'requir':146,1261 'resolut':1167 'resourc':74 'reus':623 'review':518,593,695 'reviewed-document.docx':1055,1078 'rsid':611,855,862,871 'rule':335 'run':474,609,990,1034 'scenario':411,513 'scratch':286 'script':477,858,899,991,1018,1033 'section':103,368,485,718,762,765,770,846,886,929,931,1005 'section/heading':715 'secur':1317 'see':482,1002 'sentenc':634,639 'sequenti':796 'set':317,439,827,1166 'setup':402 'show':184 'signatur':737 'simpl':121,788 'singl':898 'size':917 'skill' 'skill-docx' 'smaller':905 'soffic':1135 'someon':128 'source-rkz91' 'special':836 'specif':1218 'split':984 'start':313,435,786,823,1194 'statement':1259 'step':875,1127 'stop':1207 'strategi':547 'structur':181,221,251,733,751,794 'style':1237 'sudo':1268,1287,1297 'suggest':854,860,924 'support':9,177 'surround':730 'syntax':332 'systemat':545 'tackl':792 'tag':273 'task':90 'term':641,646,661,947 'termin':935 'test':564 'text':17,97,151,160,582,588,601,605,614,659,731,789,971,976,982,1026,1275 'textrun':352 'togeth':895 'tool':84 'topic-agent-skills' 'topic-agents-md' 'topic-ai-agents' 'topic-claude-code' 'topic-codex' 'topic-cursor' 'topic-developer-tools' 'topic-llm-tools' 'topic-mcp' 'topic-pm-tools' 'topic-product-management' 'topic-productivity' 'track':11,185,193,197,205,267,526,538,578,666,678,684,843,867,1075 'track-chang':196,204,683,1074 'tree':93 'two':1126 'two-step':1125 'type':774,888,939 'unchang':587,600,604,613,658 'under':416 'unintend':1106 'uniqu':729 'unnecessari':1257 'unpack':235,244,466,811,847,857,1046,1054 'unprofession':596 'updat':772,945 'use':96,108,123,132,143,172,269,287,298,349,374,387,478,528,741,864,873,992,1123,1179 'user':46 'util':1304 'variabl':1251 'verbos':1250 'verif':1058 'verifi':980,1025,1081 'verification.md':1080,1091,1099 'visual':1115 'well':923 'word':112,277,283,296,379,385,1117 'word/comments.xml':256 'word/document.xml':252,978,1013 'word/media':261 'work':23,922 'workflow':86,91,114,127,134,145,300,422,515,520,668 'write':1016,1246 'xml':70,101,209,214,242,458,714,750,973,1318 'zip':67","prices":[{"id":"f871c4e5-fef7-4b27-a725-c65869775019","listingId":"7e71c001-fc27-4289-952e-046bf1415afc","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"rkz91","category":"coco","install_from":"skills.sh"},"createdAt":"2026-05-18T13:21:38.817Z"}],"sources":[{"listingId":"7e71c001-fc27-4289-952e-046bf1415afc","source":"github","sourceId":"rkz91/coco/docx","sourceUrl":"https://github.com/rkz91/coco/tree/main/skills/docx","isPrimary":false,"firstSeenAt":"2026-05-18T13:21:38.817Z","lastSeenAt":"2026-05-18T19:14:06.704Z"}],"details":{"listingId":"7e71c001-fc27-4289-952e-046bf1415afc","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"rkz91","slug":"docx","github":{"repo":"rkz91/coco","stars":7,"topics":["agent-skills","agents-md","ai","ai-agents","claude-code","codex","cursor","developer-tools","llm-tools","mcp","pm-tools","product-management","productivity","prompt-engineering","workflow-automation"],"license":"mit","html_url":"https://github.com/rkz91/coco","pushed_at":"2026-04-26T01:51:27Z","description":"Open-source library of AI superpowers — 59 skills, 34 commands, 10 agents + 24 GSD subagents, 3 system bundles. An entire team, wherever your AI lives. Vendor-neutral across Claude Code, Cursor, Codex, and any AGENTS.md tool.","skill_md_sha":"664663895bcd11b88a632301d830b313cbabb845","skill_md_path":"skills/docx/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/rkz91/coco/tree/main/skills/docx"},"layout":"multi","source":"github","category":"coco","frontmatter":{"name":"docx","license":"Proprietary. LICENSE.txt has complete terms","description":"Comprehensive document creation, editing, and analysis with support for tracked changes, comments, formatting preservation, and text extraction. When Claude needs to work with professional documents (.docx files) for: (1) Creating new documents, (2) Modifying or editing content, (3) Working with tracked changes, (4) Adding comments, or any other document tasks"},"skills_sh_url":"https://skills.sh/rkz91/coco/docx"},"updatedAt":"2026-05-18T19:14:06.704Z"}}