# sentry-instrumentation

Rules and examples for adding Sentry metrics the right way. Covers how to name a counter, gauge, or duration metric; which tags are safe versus which will blow up your Sentry bill; and how to track failures with a small fixed list of error types instead of raw exception strings.
## What it does
Scope: Sentry (system behavior) only. Product analytics metrics (button clicks, funnel events, feature-flag exposure) belong in your product-analytics tool — never mix them into Sentry instrumentation. If the change touches a user-facing funnel event, stop and use the right tool for it.
Canonical reference: Python, under `examples/python/`. The rules in `references/` are language-neutral — ports to TypeScript, Go, Ruby, Java, etc. keep the same shapes (same constructors, same 13 CI checks, same `FailureClass` taxonomy) under idiomatic names. When no reference implementation exists for the target language yet, use Python as the architectural spec and port the shapes.
## Where to start (5 bullets for the agent)

- Identify the project's language and workflow conventions. Look for `pyproject.toml`/`package.json`/`go.mod`/`Gemfile`/`pom.xml`. Note the test runner, linter, and CI wiring — they're how the instrumentation gate will be enforced.
- Grep for an existing observability layer first. Extend it rather than creating a parallel one. A file named `observability.py`/`observability.ts`/`observability.go`, or a `metrics/` package, is the prime signal.
- Pick the constructor that matches the metric's purpose (see `references/signal-model.md` + `references/metric-classes.md`). Do not open-code a counter / distribution / gauge literal.
- Use the matching surface pattern (middleware, decorator, base class) from `references/surface-patterns.md` — don't hand-roll emissions at call sites. Using the pattern is always less code.
- Register in the project's `MetricDef` registry. The CI gate enforces identity + lifecycle rules on the registry contents.
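The registration step above can be sketched as follows. This is a hypothetical, minimal reconstruction — the real `MetricDef` schema and registry live in `examples/python/metric_def.py` and carry more fields (unit, tags policy, cost settings); only the names `MetricDef`, `MetricDef.counter`, and the single-registration identity rule come from this skill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDef:
    """Sketch of a metric definition: one immutable identity per name."""
    name: str
    kind: str
    unit: str
    allowed_tags: tuple = ()

    @classmethod
    def counter(cls, name: str, unit: str = "event", allowed_tags=()):
        # Classmethod constructor: callers pick a purpose, never a raw kind.
        return cls(name=name, kind="counter", unit=unit,
                   allowed_tags=tuple(allowed_tags))


REGISTRY: dict[str, MetricDef] = {}


def register(metric: MetricDef) -> MetricDef:
    # Identity rule: a name maps to exactly one immutable definition.
    if metric.name in REGISTRY:
        raise ValueError(f"duplicate metric name: {metric.name}")
    REGISTRY[metric.name] = metric
    return metric


# Example registration (hypothetical metric name and tags):
CHECKOUT_PAYMENT_FAILED = register(MetricDef.counter(
    "checkout.payment.failed",
    allowed_tags=("failure_class", "provider"),
))
```

Because the registry is the single source of truth, the CI gate can diff emitted metrics against it without executing the service.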
## Charter (read before writing any metric)
Every metric emitted by this service must be:
- Semantically precise — exactly one of five purposes (outcome / latency / load / resource / correctness), exactly one kind (counter / gauge / distribution), a mandatory unit.
- Bounded — tag values come from a small enumerated set or an approved bucket function. No raw user ids, URLs, exception strings, or timestamps.
- Enforceable — defined once as a `MetricDef`, emitted via the validating helper API, checked by CI against the registry before merge.
- Versioned — identity tuple is immutable under the same name. Meaning change = new name with a `.v2` suffix, 14-day overlap, `retired_at` date.
- Cost-aware — declares its `emit_frequency`, `sampling_rate`, `max_rate_hz`, and `loop_policy`. Distributions on hot paths sample; counters in loops aggregate.
- Ergonomic — the correct path is the easiest path. Build `MetricDef`s through `.counter`/`.latency`/`.gauge`/`.resource`/`.failure_counter`. Emit through `emit_counter`/`emit_latency`/`emit_failure`/`time_latency`. Use the surface patterns (`ObservabilityMiddleware`, `InstrumentedHttpClient`, `@instrumented_step`, `retry_with_instrumentation`, `record_fallback`) — they bake in the right emissions so call sites never hand-roll them.
Does NOT define: dashboards, alert rules, SLO thresholds, on-call policy, or product analytics. Those depend on this layer being clean.
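The "Bounded" rule above is easiest to see with a bucket function. A hypothetical sketch — the approved bucket functions and their exact names live in `references/tagging-and-cardinality.md` and `examples/python/metric_tags.py`; the thresholds here are illustrative only:

```python
def latency_bucket(ms: float) -> str:
    """Map a raw latency onto one of four fixed bucket labels.

    Every possible input collapses onto a small enumerable set, so the
    tag's cardinality stays constant no matter what the service does
    at runtime — unlike tagging with the raw millisecond value.
    """
    if ms < 100:
        return "lt_100ms"
    if ms < 500:
        return "100ms_500ms"
    if ms < 2000:
        return "500ms_2s"
    return "gte_2s"
```

A tag value like `latency_bucket(elapsed_ms)` passes the cardinality policy; `str(elapsed_ms)` never does.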
## When this skill applies (auto-invocation triggers)
Invoke for any change that:
- Adds or modifies code that emits a Sentry metric.
- Measures a duration, counts failures, or reports a resource amount (tokens, bytes, API units).
- Wraps a workflow step (Hatchet, Celery, Temporal, Sidekiq, Inngest, BullMQ, or equivalent).
- Adds a route, middleware, or external-API client.
- Adds a retry loop, fallback path, or degradation branch.
- Contains the words "instrument", "emit a metric", "add a gauge/counter/distribution", "add a span", "observe", "track system behavior", "record timing", or "count failures".
Do not invoke for product-analytics changes (button click counts, funnel events, feature-flag exposure). Those belong in your product-analytics tool, not in Sentry instrumentation.
## Decision rules

- New metric? Read `references/signal-model.md` + pick a classmethod constructor (`MetricDef.counter|latency|gauge|resource|failure_counter`). Register in the project's metric registry. Never call an emission helper with a raw string or a dynamically-assembled name.
- Tag values? Either enumerate them in `MetricDef.tag_constraints` or route through a bucket function from `references/tagging-and-cardinality.md`.
- Inside a loop? Use `AggregatingCounter` or `DurationAccumulator` (see `references/cost-model.md`). If the metric's `loop_policy` is `"forbidden"`, the CI gate refuses any emission inside a `for`/`while` body for that metric.
- New surface (HTTP route / external API / workflow step / retry / fallback)? Use the matching reusable pattern from `references/surface-patterns.md`. Don't hand-roll the emissions.
- Changing a metric's meaning, unit, or tag shape? It's a new versioned metric. See `references/naming-and-lifecycle.md`.
- Failure counter? Build with `MetricDef.failure_counter(...)` and emit with `emit_failure(metric, failure=classify(exc), tags=...)`. Never pass `str(exc)` as a tag. See `references/failure-taxonomy.md`.
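The last rule — classify, never stringify — might look like the sketch below. This is a hypothetical reconstruction: the authoritative `FailureClass` values and `classify` logic are in `references/failure-taxonomy.md` and `examples/python/failure_taxonomy.py`; the enum members and exception mappings here are illustrative.

```python
from enum import Enum


class FailureClass(Enum):
    """Small fixed taxonomy — the only values a failure tag may carry."""
    TIMEOUT = "timeout"
    RATE_LIMITED = "rate_limited"
    UPSTREAM_ERROR = "upstream_error"
    VALIDATION = "validation"
    UNKNOWN = "unknown"


def classify(exc: Exception) -> FailureClass:
    """Collapse an arbitrary exception onto one FailureClass value.

    Order matters: check the most specific exception types first, and
    always fall through to UNKNOWN so cardinality stays bounded even
    for exception types nobody anticipated.
    """
    if isinstance(exc, TimeoutError):
        return FailureClass.TIMEOUT
    if isinstance(exc, ConnectionError):
        return FailureClass.UPSTREAM_ERROR
    if isinstance(exc, (ValueError, TypeError)):
        return FailureClass.VALIDATION
    return FailureClass.UNKNOWN
```

The tag value is then `classify(exc).value` — a member of a five-element set — while the raw `str(exc)` (which may embed ids, URLs, or payload fragments) stays out of Sentry entirely.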
## Detect-or-create

Detect the project language first, then look for an existing observability layer matching that language's conventions. If you find one, extend it. If not, scaffold from the matching example under `examples/<language>/` and rename `yourapp` to the project's package root.
- `pyproject.toml` / `setup.py` → Python. Use `examples/python/`.
- `package.json` (TS or JS) → TypeScript/JavaScript. v0.2 — port from `examples/python/` shapes.
- `go.mod` → Go. v0.2 — port from `examples/python/` shapes.
- `Gemfile` → Ruby. Port from `examples/python/` shapes.
- `pom.xml` / `build.gradle` → Java/Kotlin. Port from `examples/python/` shapes.
For ports: preserve the five constructors, the `FailureClass` taxonomy values, the 13 CI gate checks, and the emission-boundary rules. Names become idiomatic (`emit_counter` → `emitCounter`, `@instrumented_step` → `instrumentedStep(fn)` higher-order fn, etc.).
## Sections (detailed references)
| Topic | Reference | Example |
|---|---|---|
| Charter & scope | references/charter.md | — |
| MetricDef schema + constructors | references/signal-model.md | examples/python/metric_def.py |
| Five metric classes by purpose | references/metric-classes.md | — |
| Kind semantic rules (counter/gauge/distribution) | references/semantic-rules.md | — |
| Naming + lifecycle (version suffix, retired_at) | references/naming-and-lifecycle.md | — |
| Tagging + cardinality policy + bucket fns | references/tagging-and-cardinality.md | examples/python/metric_tags.py |
| Cost model (sampling, rate limit, aggregation) | references/cost-model.md | examples/python/emission_module.py |
| Emission boundaries (where to emit) | references/emission-boundaries.md | — |
| Failure taxonomy (FailureClass + classify) | references/failure-taxonomy.md | examples/python/failure_taxonomy.py |
| Reusable surface patterns | references/surface-patterns.md | examples/python/http_middleware.py, examples/python/external_api_client.py, examples/python/workflow_decorator.py, examples/python/retry_loop.py, examples/python/fallback_path.py |
| Emission helpers + validators | — | examples/python/emission_module.py |
| CI enforcement gate (13 AST checks) | references/enforcement.md | examples/python/ci_gate.py |
| Test gates | references/enforcement.md | examples/python/test_gates.py |
| PR review rubric | references/review-rubric.md | — |
## Project-specific overrides

On first use in a new project, fill these in once so subsequent invocations know where to land code. The skill's example files use `yourapp` placeholders; replace with the actual package root.
### Python (canonical reference — v0.1)

- Emission module: `yourapp/observability.py`
- Registry: `yourapp/shared/metrics.py`
- Tag buckets: `yourapp/shared/metric_tags.py`
- Failure taxonomy: `yourapp/shared/failure_taxonomy.py`
- HTTP middleware: `yourapp/middleware/observability.py`
- Workflow decorator: `yourapp/services/<workflow>/instrumentation.py`
- External API base: `yourapp/services/providers/instrumented_http_client.py`
- Retry helper: `yourapp/services/retry.py`
- Fallback helper: `yourapp/observability.py` (or `yourapp/shared/fallback.py`)
- CI gate: `scripts/check_metrics.py`
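To make the CI gate concrete: one of its checks is the `loop_policy="forbidden"` rule from the decision rules above. A hypothetical fragment of how such a check could work — the real gate (`scripts/check_metrics.py`, per `examples/python/ci_gate.py`) runs 13 checks; this sketch shows only the loop-emission one, with a simplified call-name heuristic:

```python
import ast


def emissions_inside_loops(source: str) -> list[int]:
    """Return line numbers of emit_* calls nested inside a for/while body.

    Simplification for illustration: only bare-name calls (emit_counter(...))
    are matched, not attribute calls (obs.emit_counter(...)).
    """
    tree = ast.parse(source)
    hits = []
    for loop in ast.walk(tree):
        if isinstance(loop, (ast.For, ast.While)):
            for node in ast.walk(loop):
                if (isinstance(node, ast.Call)
                        and isinstance(node.func, ast.Name)
                        and node.func.id.startswith("emit_")):
                    hits.append(node.lineno)
    return hits
```

In the gate, a non-empty result for a metric whose `loop_policy` is `"forbidden"` fails the build, pointing the author at `AggregatingCounter`/`DurationAccumulator` instead.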
### TypeScript / Node (v0.2 — port from Python shapes)

- Emission module: `src/observability.ts`
- Registry: `src/shared/metrics.ts`
- Tag buckets: `src/shared/metricTags.ts`
- Failure taxonomy: `src/shared/failureTaxonomy.ts`
- HTTP middleware: `src/middleware/observability.ts` (Express/Koa) or `src/fastify-plugins/observability.ts` (Fastify)
- Workflow pattern: `src/workflows/<workflow>/instrumentation.ts`
- External API base: `src/providers/instrumentedHttpClient.ts`
- CI gate: `scripts/check-metrics.ts` (ts-morph / ast-grep)
### Go (v0.2 — port from Python shapes)

- Emission package: `internal/observability/metrics.go`
- Registry: `internal/metrics/registry.go`
- Tag buckets: `internal/metrics/tags.go`
- Failure taxonomy: `internal/metrics/failure.go`
- HTTP middleware: `internal/middleware/observability.go` (net/http / chi / echo)
- Worker pattern: `internal/workers/<worker>/instrumentation.go`
- External API: `internal/providers/roundtripper.go` (http.RoundTripper wrapper)
- CI gate: `scripts/check_metrics.go` (go/ast)
## Quality-gate checklist
Before finalizing a PR that touches instrumentation, walk the review
rubric (full version in references/review-rubric.md):
- Right `kind` for the meaning (counter / gauge / distribution)?
- Name matches `<domain>.<object>.<action>[.<type-suffix>]` and fits `purpose`?
- All tag keys in `MetricDef.allowed_tags`; values enumerated or from an approved bucket function?
- Emission at a documented boundary / uses a canonical surface pattern?
- Duplicative with an existing `MetricDef`? (Search the registry.)
- `operational_meaning` unambiguous; will it be interpretable in six months?
- `cardinality="medium"` justified in `means=`?
- Failure metric uses `emit_failure(...)` + a `FailureClass` value?
- Inside a loop → uses `AggregatingCounter`/`DurationAccumulator`?
- Hot path → `sampling_rate`/`max_rate_hz` set?
- Changing an existing metric → is it actually a new versioned entry, with `retired_at` on the old one?
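The last checklist item encodes the versioning arithmetic from the charter: the `.v2` entry and the old definition coexist for 14 days, after which the old one's `retired_at` takes effect. A hypothetical sketch of that relationship (`introduced_at` and `overlap_days` are names invented here for illustration; only `retired_at`, the `.v2` suffix, and the 14-day overlap come from this skill):

```python
from datetime import date, timedelta

# Old definition: meaning/unit changed (ms → s), so it gets a retirement date.
OLD = {"name": "jobs.render.duration", "unit": "ms",
       "retired_at": date(2025, 7, 1)}

# New definition: same concept, new identity, introduced 14 days earlier
# so dashboards can be migrated while both names still emit.
NEW = {"name": "jobs.render.duration.v2", "unit": "s",
       "introduced_at": OLD["retired_at"] - timedelta(days=14)}


def overlap_days(old: dict, new: dict) -> int:
    """Days during which both the old and the .v2 metric emit."""
    return (old["retired_at"] - new["introduced_at"]).days
```

A gate check over the registry can then assert `overlap_days(...) >= 14` for every retired/`.v2` pair.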