{"id":"71a682a6-a288-4da8-9951-6327e072da8f","shortId":"gqkGvQ","kind":"skill","title":"Arize Annotation","tagline":"Awesome Copilot skill by GitHub","description":"# Arize Annotation Skill\n\nThis skill focuses on **annotation configs** — the schema for human feedback — and on **programmatically annotating project spans** via the Python SDK. Human review in the Arize UI (including annotation queues, datasets, and experiments) still depends on these configs; there is no `ax` CLI for queues yet.\n\n**Scope:** Human labeling in Arize attaches values defined by configs to **spans**, **dataset examples**, **experiment-related records**, and **queue items** in the product UI. This skill documents `ax annotation-configs` and bulk span updates via `ArizeClient.spans.update_annotations`.\n\n---\n\n## Prerequisites\n\nProceed directly with the task — run the `ax` command you need. Do NOT check versions, env vars, or profiles upfront.\n\nIf an `ax` command fails, troubleshoot based on the error:\n- `command not found` or version error → see references/ax-setup.md\n- `401 Unauthorized` / missing API key → run `ax profiles show` to inspect the current profile. If the profile is missing or the API key is wrong: check `.env` for `ARIZE_API_KEY` and use it to create/update the profile via references/ax-profiles.md. If `.env` has no key either, ask the user for their Arize API key (https://app.arize.com/admin > API Keys)\n- Space ID unknown → check `.env` for `ARIZE_SPACE_ID`, or run `ax spaces list -o json`, or ask the user\n\n---\n\n## Concepts\n\n### What is an Annotation Config?\n\nAn **annotation config** defines the schema for a single type of human feedback label. Before anyone can annotate a span, dataset record, experiment output, or queue item, a config must exist for that label in the space.\n\n| Field | Description |\n|-------|-------------|\n| **Name** | Descriptive identifier (e.g. `Correctness`, `Helpfulness`). Must be unique within the space. 
|\n| **Type** | `categorical` (pick from a list), `continuous` (numeric range), or `freeform` (free text). |\n| **Values** | For categorical: array of `{\"label\": str, \"score\": number}` pairs. |\n| **Min/Max Score** | For continuous: numeric bounds. |\n| **Optimization Direction** | Whether higher scores are better (`maximize`) or worse (`minimize`). Used to render trends in the UI. |\n\n### Where labels get applied (surfaces)\n\n| Surface | Typical path |\n|---------|----------------|\n| **Project spans** | Python SDK `spans.update_annotations` (below) and/or the Arize UI |\n| **Dataset examples** | Arize UI (human labeling flows); configs must exist in the space |\n| **Experiment outputs** | Often reviewed alongside datasets or traces in the UI — see arize-experiment, arize-dataset |\n| **Annotation queue items** | Arize UI; configs must exist — no `ax` queue commands documented here yet |\n\nAlways ensure the relevant **annotation config** exists in the space before expecting labels to persist.\n\n---\n\n## Basic CRUD: Annotation Configs\n\n### List\n\n```bash\nax annotation-configs list --space-id SPACE_ID\nax annotation-configs list --space-id SPACE_ID -o json\nax annotation-configs list --space-id SPACE_ID --limit 20\n```\n\n### Create — Categorical\n\nCategorical configs present a fixed set of labels for reviewers to choose from.\n\n```bash\nax annotation-configs create \\\n  --name \"Correctness\" \\\n  --space-id SPACE_ID \\\n  --type categorical \\\n  --values '[{\"label\": \"correct\", \"score\": 1}, {\"label\": \"incorrect\", \"score\": 0}]' \\\n  --optimization-direction maximize\n```\n\nCommon binary label pairs:\n- `correct` / `incorrect`\n- `helpful` / `unhelpful`\n- `safe` / `unsafe`\n- `relevant` / `irrelevant`\n- `pass` / `fail`\n\n### Create — Continuous\n\nContinuous configs let reviewers enter a numeric score within a defined range.\n\n```bash\nax annotation-configs create \\\n  --name \"Quality Score\" \\\n  --space-id SPACE_ID \\\n  
--type continuous \\\n  --minimum-score 0 \\\n  --maximum-score 10 \\\n  --optimization-direction maximize\n```\n\n### Create — Freeform\n\nFreeform configs collect open-ended text feedback. No additional flags needed beyond name, space, and type.\n\n```bash\nax annotation-configs create \\\n  --name \"Reviewer Notes\" \\\n  --space-id SPACE_ID \\\n  --type freeform\n```\n\n### Get\n\n```bash\nax annotation-configs get ANNOTATION_CONFIG_ID\nax annotation-configs get ANNOTATION_CONFIG_ID -o json\n```\n\n### Delete\n\n```bash\nax annotation-configs delete ANNOTATION_CONFIG_ID\nax annotation-configs delete ANNOTATION_CONFIG_ID --force   # skip confirmation\n```\n\n**Note:** Deletion is irreversible. Any annotation queue associations to this config are also removed in the product (queues may remain; fix associations in the Arize UI if needed).\n\n---\n\n## Applying Annotations to Spans (Python SDK)\n\nUse the Python SDK to bulk-apply annotations to **project spans** when you already have labels (e.g., from a review export or an external labeling tool).\n\n```python\nimport os\n\nimport pandas as pd\nfrom arize import ArizeClient\n\nclient = ArizeClient(api_key=os.environ[\"ARIZE_API_KEY\"])\n\n# Build a DataFrame with annotation columns\n# Required: context.span_id + at least one annotation.<name>.label or annotation.<name>.score\nannotations_df = pd.DataFrame([\n    {\n        \"context.span_id\": \"span_001\",\n        \"annotation.Correctness.label\": \"correct\",\n        \"annotation.Correctness.updated_by\": \"reviewer@example.com\",\n    },\n    {\n        \"context.span_id\": \"span_002\",\n        \"annotation.Correctness.label\": \"incorrect\",\n        \"annotation.Correctness.updated_by\": \"reviewer@example.com\",\n    },\n])\n\nresponse = client.spans.update_annotations(\n    space_id=os.environ[\"ARIZE_SPACE_ID\"],\n    project_name=\"your-project\",\n    dataframe=annotations_df,\n    validate=True,\n)\n```\n\n**DataFrame column schema:**\n\n| Column | 
Required | Description |\n|--------|----------|-------------|\n| `context.span_id` | yes | The span to annotate |\n| `annotation.<name>.label` | at least one | Categorical or freeform label |\n| `annotation.<name>.score` | at least one | Numeric score |\n| `annotation.<name>.updated_by` | no | Annotator identifier (email or name) |\n| `annotation.<name>.updated_at` | no | Timestamp in milliseconds since epoch |\n| `annotation.notes` | no | Freeform notes on the span |\n\n**Limitation:** Annotations apply only to spans within 31 days prior to submission.\n\n---\n\n## Troubleshooting\n\n| Problem | Solution |\n|---------|----------|\n| `ax: command not found` | See references/ax-setup.md |\n| `401 Unauthorized` | API key may not have access to this space. Verify at https://app.arize.com/admin > API Keys |\n| `Annotation config not found` | Run `ax annotation-configs list --space-id SPACE_ID` to find the correct config ID |\n| `409 Conflict on create` | Name already exists in the space. Use a different name or get the existing config ID. 
|\n| Human review / queues in UI | Use the Arize app; ensure configs exist — no `ax` annotation-queue CLI yet |\n| Span SDK errors or missing spans | Confirm `project_name`, `space_id`, and span IDs; use arize-trace to export spans |\n\n---\n\n## Related Skills\n\n- **arize-trace**: Export spans to find span IDs and time ranges\n- **arize-dataset**: Find dataset IDs and example IDs\n- **arize-evaluator**: Automated LLM-as-judge alongside human annotation\n- **arize-experiment**: Experiments tied to datasets and evaluation workflows\n- **arize-link**: Deep links to annotation configs and queues in the Arize UI\n\n---\n\n## Save Credentials for Future Use\n\nSee references/ax-profiles.md § Save Credentials for Future Use.","tags":["arize","annotation","awesome","copilot","github"],"capabilities":["skill","source-github","category-awesome-copilot"],"categories":["awesome-copilot"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/github/awesome-copilot/arize-annotation","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"install_from":"skills.sh"}},"qualityScore":"0.300","qualityRationale":"deterministic score 0.30 from registry signals: · indexed on skills.sh · published under github/awesome-copilot","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill:v1","enrichmentVersion":1,"enrichedAt":"2026-04-22T03:40:38.224Z","embedding":null,"createdAt":"2026-04-18T20:36:12.853Z","updatedAt":"2026-04-22T03:40:38.224Z","lastSeenAt":"2026-04-22T03:40:38.224Z","tsv":"'/admin':192,839 '0':477,529 '001':717 '002':726 '1':473 '10':533 '20':438 '31':810 '401':136,824 '409':856 'access':831 'addit':549 'alongsid':355,947 'alreadi':662,861 'also':626 'alway':384 'and/or':334 
'annot':2,9,15,25,39,88,96,219,222,238,332,369,388,401,407,417,429,457,513,560,577,580,585,588,597,600,605,608,619,643,656,698,706,709,711,734,747,763,764,772,778,782,787,804,842,848,891,949,966 'annotation-config':87,406,416,428,456,512,559,576,584,596,604,847 'annotation-queu':890 'annotation.correctness.label':718,727 'annotation.correctness.updated':720,729 'annotation.notes':796 'anyon':236 'api':139,157,165,188,193,688,692,826,840 'app':884 'app.arize.com':191,838 'app.arize.com/admin':190,837 'appli':322,642,655,805 'ariz':1,8,36,61,164,187,201,336,340,364,367,372,638,681,691,738,883,911,919,931,940,951,961,972 'arize-dataset':366,930 'arize-evalu':939 'arize-experi':363,950 'arize-link':960 'arize-trac':910,918 'arizecli':683,687 'arizeclient.spans.update':95 'array':288 'ask':182,212 'associ':621,635 'attach':62 'autom':942 'awesom':3 'ax':52,86,105,120,142,206,378,405,415,427,455,511,558,575,583,595,603,818,846,889 'base':124 'bash':404,454,510,557,574,594 'basic':399 'better':307 'beyond':552 'binari':483 'bound':300 'build':694 'bulk':91,654 'bulk-appli':653 'categor':273,287,440,441,468,768 'category-awesome-copilot' 'check':111,161,198 'choos':452 'cli':53,893 'client':686 'client.spans.update':733 'collect':542 'column':699,752,754 'command':106,121,128,380,819 'common':482 'concept':215 'config':16,48,66,89,220,223,249,345,374,389,402,408,418,430,442,458,499,514,541,561,578,581,586,589,598,601,606,609,624,843,849,874,886,967 'confirm':613,901 'conflict':857 'context.span':701,714,723,757 'continu':278,298,497,498,525 'copilot':4 'correct':264,461,471,486,719 'creat':439,459,496,515,538,562,859 'create/update':171 'credenti':975,982 'crud':400 'current':148 'datafram':696,746,751 'dataset':41,69,241,338,356,368,932,934,956 'day':811 'deep':963 'defin':64,224,508 'delet':593,599,607,615 'depend':45 'descript':259,261,756 'df':712,748 'differ':868 'direct':57,99,302,480,536 'document':84,381 'e.g':263,665 'either':181 'email':784 'end':545 
'ensur':385,885 'enter':502 'env':113,162,177,199 'epoch':795 'error':127,133,897 'evalu':941,958 'exampl':70,339,937 'exist':251,347,376,390,862,873,887 'expect':395 'experi':43,72,243,351,365,952,953 'experiment-rel':71 'export':669,914,921 'extern':672 'fail':122,495 'feedback':21,233,547 'field':258 'find':924,933 'fix':445,634 'flag':550 'flow':344 'focus':13 'forc':611 'found':130,821,845 'free':283 'freeform':282,539,540,572,770,798 'futur':977,984 'get':321,573,579,587,871 'github':7 'help':265,488 'higher':304 'human':20,32,58,232,342,876,948 'id':196,203,412,414,422,424,434,436,464,466,521,523,568,570,582,590,602,610,702,715,724,736,740,758,853,855,875,905,908,926,935,938 'identifi':262,783 'import':676,682,684 'includ':38 'incorrect':475,487,728 'inspect':146 'irrelev':493 'irrevers':617 'item':77,247,371 'json':210,426,592 'judg':946 'key':140,158,166,180,189,194,689,693,827,841 'label':59,234,254,290,320,343,396,448,470,474,484,664,673,707,765,771 'least':704 'let':500 'limit':437,803 'link':962,964 'list':208,277,403,409,419,431,850 'llm':944 'llm-as-judg':943 'maxim':308,481,537 'maximum':531 'maximum-scor':530 'may':632,828 'millisecond':793 'min/max':295 'minim':311 'minimum':527 'minimum-scor':526 'miss':138,154,899 'must':250,266,346,375 'name':260,460,516,553,563,742,786,860,869,903 'need':108,551,641 'note':565,614,799 'number':293 'numer':279,299,504,776 'o':209,425,591 'often':353 'one':705,766,774 'open':544 'open-end':543 'optim':301,479,535 'optimization-direct':478,534 'os':685 'os.environ':690,737 'output':244,352 'pair':294,485 'panda':677 'pass':494 'path':326 'pd':679 'pd.dataframe':713 'persist':398 'pick':274 'prerequisit':97 'present':443 'prior':812 'problem':816 'proceed':98 'product':80,630 'profil':116,143,149,152,173 'programmat':24 'project':26,327,658,741,745,902 'python':30,329,646,650,675 'qualiti':517 'queue':40,55,76,246,370,379,620,631,878,892,969 'rang':280,509,929 'record':74,242 'references/ax-profiles.md':175,980 
'references/ax-setup.md':135,823 'relat':73,916 'relev':387,492 'remain':633 'remov':627 'render':314 'requir':700,755 'respons':732 'review':33,354,450,501,564,668,877 'reviewer@example.com':722,731 'run':103,141,205 'safe':490 'save':974,981 'schema':18,226,753 'score':292,296,305,472,476,505,518,528,532,710,773,777 'sdk':31,330,647,651,896 'see':134,362,822,979 'set':446 'show':144 'sinc':794 'singl':229 'skill':5,10,12,917 'skip':612 'solut':817 'source-github' 'space':195,202,207,257,271,350,393,411,413,421,423,433,435,463,465,520,522,554,567,569,735,739,834,852,854,865,904 'space-id':410,420,432,462,519,566,851 'span':27,68,92,240,328,645,659,716,725,761,802,808,895,900,907,915,922,925 'spans.update':331 'still':44 'str':291 'submiss':814 'surfac':323,324 'task':102 'text':284,546 'tie':954 'time':928 'timestamp':791 'tool':674 'trace':358,912,920 'trend':315 'troubleshoot':123,815 'true':750 'type':230,272,467,524,556,571 'typic':325 'ui':37,81,318,337,341,361,373,639,880,973 'unauthor':137,825 'unhelp':489 'uniqu':268 'unknown':197 'unsaf':491 'updat':93,779,788 'upfront':117 'use':168,312,648,866,881,909,978,985 'user':184,214 'valid':749 'valu':63,285,469 'var':114 'verifi':835 'version':112,132 'via':28,174 'whether':303 'within':269,506,809 'workflow':959 'wors':310 'wrong':160 'yes':759 'yet':56,383,894 
'your-project':743","prices":[{"id":"fe62a155-e107-421b-a346-7ca72a9ece69","listingId":"71a682a6-a288-4da8-9951-6327e072da8f","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"github","category":"awesome-copilot","install_from":"skills.sh"},"createdAt":"2026-04-18T20:36:12.853Z"}],"sources":[{"listingId":"71a682a6-a288-4da8-9951-6327e072da8f","source":"github","sourceId":"github/awesome-copilot/arize-annotation","sourceUrl":"https://github.com/github/awesome-copilot/tree/main/skills/arize-annotation","isPrimary":false,"firstSeenAt":"2026-04-18T21:48:13.334Z","lastSeenAt":"2026-04-22T00:52:03.595Z"},{"listingId":"71a682a6-a288-4da8-9951-6327e072da8f","source":"skills_sh","sourceId":"github/awesome-copilot/arize-annotation","sourceUrl":"https://skills.sh/github/awesome-copilot/arize-annotation","isPrimary":true,"firstSeenAt":"2026-04-18T20:36:12.853Z","lastSeenAt":"2026-04-22T03:40:38.224Z"}],"details":{"listingId":"71a682a6-a288-4da8-9951-6327e072da8f","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"github","slug":"arize-annotation","source":"skills_sh","category":"awesome-copilot","skills_sh_url":"https://skills.sh/github/awesome-copilot/arize-annotation"},"updatedAt":"2026-04-22T03:40:38.224Z"}}