asta-skill
Domain expertise for Ai2 Asta MCP tools (Semantic Scholar corpus). Intent-to-tool routing, safe defaults, workflow patterns, and pitfall warnings for academic paper search, citation traversal, and author discovery.
What it does
Asta MCP — Academic Paper Search
Asta is Ai2's Scientific Corpus Tool, exposing the Semantic Scholar academic graph over MCP (streamable HTTP transport). This skill tells agents which Asta tool to call for which intent, and how to compose them into useful workflows.
- MCP endpoint: https://asta-tools.allen.ai/mcp/v1
- Auth: x-api-key header (request key at https://share.hsforms.com/1L4hUh20oT3mu8iXJQMV77w3ioxm)
- Transport: streamable HTTP
Update Notifications
Once per day, check for updates without modifying any skill files:
- Read the timestamp in <skill_directory>/.last_version_check (treat missing as expired).
- If older than 24 hours:
  - Fetch https://raw.githubusercontent.com/Agents365-ai/asta-skill/main/SKILL.md
  - Parse the version field from its frontmatter
  - Compare with the local version in this file
  - Write the current Unix timestamp to .last_version_check
- If a newer version exists, tell the user once: 📢 asta-skill has a new version (local X.Y.Z → remote A.B.C). Run git -C <skill_directory> pull to update.
- On any failure (offline, fetch error, parse error), silently skip until the next day.
Never run git pull automatically. The check is read-only over the network and only writes a timestamp file locally — the user must explicitly update.
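The local half of this check can be sketched in Python as follows. This is a minimal sketch, not the skill's actual implementation: the helper names are hypothetical, the network fetch is left to the host agent, and the version comparison assumes purely numeric dotted versions such as X.Y.Z.

```python
import re
import time
from pathlib import Path
from typing import Optional

CHECK_INTERVAL_SECONDS = 24 * 60 * 60  # "once per day"


def read_stamp(stamp_file: Path) -> Optional[str]:
    """Return the raw timestamp text, or None when the file is missing or unreadable."""
    try:
        return stamp_file.read_text().strip()
    except OSError:
        return None


def check_is_due(raw_stamp: Optional[str], now: float) -> bool:
    """True when the last check is missing, corrupt, or more than 24 hours old."""
    try:
        last = float(raw_stamp)
    except (TypeError, ValueError):
        return True  # treat missing or unparsable as expired
    return now - last > CHECK_INTERVAL_SECONDS


def write_stamp(stamp_file: Path, now: Optional[float] = None) -> None:
    """Record the current Unix timestamp; this is the only local write."""
    stamp_file.write_text(str(int(time.time() if now is None else now)))


def parse_frontmatter_version(skill_md: str) -> Optional[str]:
    """Extract `version: X.Y.Z` from a leading `---` frontmatter block."""
    block = re.match(r"\A---\s*\n(.*?)\n---", skill_md, re.DOTALL)
    if not block:
        return None
    found = re.search(
        r"^[ \t]*version:\s*['\"]?([0-9]+(?:\.[0-9]+)*)", block.group(1), re.MULTILINE
    )
    return found.group(1) if found else None


def is_newer(remote: str, local: str) -> bool:
    """Compare dotted numeric versions component by component."""
    def as_tuple(version: str):
        return tuple(int(part) for part in version.split("."))
    return as_tuple(remote) > as_tuple(local)
```

Comparing integer tuples rather than strings keeps 1.10.0 ahead of 1.9.3, which a plain string comparison would get wrong.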
Prerequisite Check
Before invoking any tool, verify the Asta MCP server is registered in the host agent. Tool names will be prefixed by the MCP server name chosen at install time (commonly asta__<tool> or mcp__asta__<tool>). If no Asta tools are visible, direct the user to the Installation section below.
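One way to perform this check, assuming the host agent can list the names of its visible tools, is to strip whatever prefix precedes the last double underscore and compare against the tool names from the Tool Map. The function name and the input list are hypothetical; because the prefix is chosen at install time, the sketch does not insist on any particular one.

```python
from typing import Dict, Iterable

# Base tool names as listed in the Tool Map of this document.
ASTA_TOOLS = {
    "search_papers_by_relevance",
    "search_paper_by_title",
    "get_paper",
    "get_paper_batch",
    "get_citations",
    "search_authors_by_name",
    "get_author_papers",
    "snippet_search",
}


def find_asta_tools(visible_tool_names: Iterable[str]) -> Dict[str, str]:
    """Map each base Asta tool name to the prefixed name the host exposes."""
    found: Dict[str, str] = {}
    for name in visible_tool_names:
        base = name.rsplit("__", 1)[-1]  # drop "asta__", "mcp__asta__", etc.
        if base in ASTA_TOOLS:
            found[base] = name
    return found
```

An empty result means no Asta tools are visible and the user should be pointed to the installation instructions.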
Tool Map — Intent → Asta Tool
| User intent | Asta tool | Notes |
|---|---|---|
| Broad topic search | search_papers_by_relevance | Supports venue + date filters |
| Known paper title | search_paper_by_title | Optional venue restriction |
| Known DOI / arXiv / PMID / CorpusId / MAG / ACL / SHA / URL | get_paper | Single-paper lookup |
| Multiple known IDs at once | get_paper_batch | Batch lookup — prefer over N sequential get_paper calls |
| Who cited paper X | get_citations | Citation traversal with filters, paginated |
| Find author by name | search_authors_by_name | Returns profile info |
| An author's publications | get_author_papers | Pass author id from previous call |
| Find passages mentioning X | snippet_search | ~500-word excerpts from paper bodies |
All tools accept date-range filters and field selection — pass them whenever the user's intent constrains scope (e.g., "recent", "since 2022", "at NeurIPS").
⚠️ fields parameter — avoid context blowups
get_paper / get_paper_batch accept a fields string. Never request citations or references via fields — a single highly-cited paper (e.g. Attention Is All You Need) returns 200k+ characters and will overflow the agent's context window. Use the dedicated get_citations tool for forward citations (it paginates). Asta does not provide a dedicated get_references tool — to retrieve a paper's reference list, use get_paper with fields=references only for papers you know have a small reference list (typically < 100).
Safe default fields for get_paper:
title,year,authors,venue,tldr,url,abstract
Add journal, publicationDate, fieldsOfStudy, isOpenAccess only when needed.
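A small guard can enforce these defaults before a fields string is sent. The sketch below is illustrative: build_fields and its allow_references flag are hypothetical names, and it assumes nested selectors such as citations.title should be blocked along with the bare field.

```python
from typing import Iterable

SAFE_DEFAULT_FIELDS = ["title", "year", "authors", "venue", "tldr", "url", "abstract"]


def build_fields(extra: Iterable[str] = (), allow_references: bool = False) -> str:
    """Return a comma-separated fields string, refusing unbounded fields."""
    fields = list(SAFE_DEFAULT_FIELDS)
    for field in extra:
        root = field.split(".", 1)[0]  # "citations.title" -> "citations"
        if root == "citations":
            raise ValueError("use get_citations instead of fields=citations")
        if root == "references" and not allow_references:
            raise ValueError("references is only safe for small reference lists")
        if field not in fields:
            fields.append(field)
    return ",".join(fields)
```

Passing allow_references=True is meant as an explicit opt-in for the small-reference-list case described above.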
Workflow Patterns
Pattern 1 — Topic Discovery
- search_papers_by_relevance(query, year="<current_year-5>-", venue=?) → initial hits (compute the lower bound from today's date, e.g. in 2026 pass year="2021-"; adjust or drop the filter if the user asks for older work)
- Rank/present top N by citationCount + recency
- Offer follow-ups: get_citations on the most influential, or snippet_search for specific claims
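The year bound and the ranking step might look like the following sketch. The scoring formula (citations divided by years since publication plus one) is an assumption chosen for illustration, since the skill only says to combine citationCount with recency; the paper dictionaries are assumed to carry year and citationCount keys.

```python
from datetime import date
from typing import Any, Dict, List


def recent_year_filter(today: date, window_years: int = 5) -> str:
    """Build an open-ended year filter such as "2021-" from today's date."""
    return f"{today.year - window_years}-"


def rank_hits(hits: List[Dict[str, Any]], today: date, top_n: int = 10) -> List[Dict[str, Any]]:
    """Order hits by citations per year since publication (assumed heuristic)."""
    def score(paper: Dict[str, Any]) -> float:
        citations = paper.get("citationCount") or 0
        year = paper.get("year") or today.year  # missing year: treated as current
        age = max(today.year - year, 0)
        return citations / (age + 1)

    return sorted(hits, key=score, reverse=True)[:top_n]
```

Dividing by age keeps a recent paper with moderate citations from being buried under older, heavily cited work.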
Pattern 2 — Seed-Paper Expansion
- get_paper(DOI|arXiv|...) → verify seed
- get_citations(paperId) → forward expansion
- Optionally search_papers_by_relevance with seed title terms for sideways discovery
- Deduplicate by paperId before presenting
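The deduplication step can be sketched as a first-seen merge across result lists. This assumes each result is a flat dictionary with a paperId key; if get_citations wraps papers in another object, unwrap them first.

```python
from typing import Any, Dict, List


def dedupe_by_paper_id(*result_lists: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Merge result lists, keeping the first occurrence of each paperId."""
    seen = set()
    merged: List[Dict[str, Any]] = []
    for results in result_lists:
        for paper in results:
            paper_id = paper.get("paperId")
            if paper_id is not None:
                if paper_id in seen:
                    continue
                seen.add(paper_id)
            merged.append(paper)  # entries without an id cannot be deduplicated
    return merged
```

Listing the citation results first keeps their ordering ahead of the sideways-discovery hits.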
Pattern 3 — Author Deep-Dive
- search_authors_by_name(name) → pick correct profile (disambiguate by affiliation)
- get_author_papers(authorId) → full publication list
- Filter client-side by topic keywords or date
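The client-side filter in the last step could be as simple as the sketch below. It assumes each paper dictionary may carry title, abstract, and year keys; the function name and parameters are illustrative.

```python
from typing import Any, Dict, Iterable, List, Optional


def filter_papers(
    papers: List[Dict[str, Any]],
    keywords: Iterable[str] = (),
    since_year: Optional[int] = None,
) -> List[Dict[str, Any]]:
    """Keep papers matching any keyword (case-insensitive) and the year bound."""
    lowered = [keyword.lower() for keyword in keywords]
    kept = []
    for paper in papers:
        text = f"{paper.get('title') or ''} {paper.get('abstract') or ''}".lower()
        if lowered and not any(keyword in text for keyword in lowered):
            continue
        if since_year is not None and (paper.get("year") or 0) < since_year:
            continue  # papers with no year are dropped when a bound is set
        kept.append(paper)
    return kept
```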
Pattern 4 — Evidence Retrieval
- snippet_search(claim_query) → find passages making/supporting a claim
- For each hit, optionally get_paper(id) for full metadata
Output & Interaction Rules
- Always report total count and which tool was used.
- Present top 10 as a table (title, year, venue, citations), then details for the most relevant.
- If the user writes in Chinese, present summaries in Chinese; keep titles in original language.
- After results, offer: Details / Refine / Citations / Snippet / Export / Done.
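The count line and top-10 table can be produced with a small formatter. The sketch assumes paper dictionaries with title, year, venue, and citationCount keys, and prints a placeholder for null values rather than inventing them; the placeholder wording is a choice made here, not something the skill prescribes.

```python
from typing import Any, Dict, List

MISSING = "not provided"  # shown instead of inventing a value


def render_results(
    papers: List[Dict[str, Any]], total: int, tool_name: str, limit: int = 10
) -> str:
    """Build the count line plus a markdown table of the top `limit` papers."""
    def cell(value: Any) -> str:
        if value is None or value == "":
            return MISSING
        return str(value).replace("|", "\\|")  # keep pipes from breaking the table

    lines = [
        f"{total} results via {tool_name}",
        "",
        "| Title | Year | Venue | Citations |",
        "|---|---|---|---|",
    ]
    for paper in papers[:limit]:
        lines.append(
            f"| {cell(paper.get('title'))} | {cell(paper.get('year'))} "
            f"| {cell(paper.get('venue'))} | {cell(paper.get('citationCount'))} |"
        )
    return "\n".join(lines)
```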
Critical Rules
- Prefer batched intent over ping-pong. If the user's question needs two independent lookups, issue them as parallel MCP tool calls in one turn, not sequentially.
- Never guess IDs. If a user gives a fuzzy title, use search_paper_by_title before get_paper.
- Respect rate limits. An API key buys higher limits but not unlimited ones, so stop expanding citation graphs beyond what the user asked for.
- Do not fabricate fields. If Asta returns null abstract or venue, say so rather than inventing.
Handling Asta responses
| Situation | What to do |
|---|---|
| Empty abstract | Not all corpus papers have full text; use snippet_search, or fall back to title + TLDR |
| Author disambiguation uncertain | Inspect affiliations in search_authors_by_name results before calling get_author_papers |
| 429 Too Many Requests | Back off; batch with get_paper_batch instead of sequential get_paper calls |
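The 429 row can be sketched as exponential backoff with jitter around any tool call, plus a helper that groups IDs for get_paper_batch. RateLimited is a hypothetical exception standing in for however the host surfaces a 429, and the batch size is a caller-supplied parameter because this document does not state Asta's limit.

```python
import random
import time
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical signal that the server answered 429 Too Many Requests."""


def with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on RateLimited, doubling the wait each time and adding jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
    raise ValueError("max_attempts must be at least 1")


def chunked(ids: Sequence[str], size: int) -> List[List[str]]:
    """Split IDs into groups so each group becomes one get_paper_batch call."""
    return [list(ids[start:start + size]) for start in range(0, len(ids), size)]
```

Injecting sleep as a parameter keeps the retry logic testable without real waiting.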