Skill (quality 0.46)

twin-test

GAN-style identity verification -- tests clone fidelity by comparing clone responses against real user messages. Run /twin-test to start a blind taste test, or /twin-test score to see your fidelity score over time.

Price

free

Protocol

skill

Verified

Endpoint

https://skills.sh/project-nomos/nomos/twin-test

What it does

Twin Test -- Adversarial Clone Fidelity Check

A blind taste test: the clone generates responses to the same contexts that prompted real user messages, then a discriminator tries to tell each real message from the clone's version. Style corrections drawn from the discriminator's reasoning feed back into the user model.

How It Works

Sample -- Pull 3-5 real sent messages from memory (the "ground truth")
Generate -- For each message, generate a clone response to the same context
Discriminate -- A separate agent compares pairs and identifies the real message
Score -- Calculate fidelity (% of times the discriminator is fooled)
Correct -- Extract specific style corrections from discriminator feedback
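The Score step can be sketched in a few lines of Python. This is only an illustration of the definition given above (percentage of pairs where the discriminator is fooled); the `PairResult` record and its field names are hypothetical and do not come from the skill itself.

```python
from dataclasses import dataclass


@dataclass
class PairResult:
    """Outcome of one real-vs-clone comparison (hypothetical record shape)."""
    context: str
    discriminator_correct: bool  # True if the discriminator picked the real message


def fidelity_score(results: list[PairResult]) -> float:
    """Return the percentage of pairs where the discriminator was fooled."""
    if not results:
        raise ValueError("need at least one pair to score")
    fooled = sum(1 for r in results if not r.discriminator_correct)
    return 100.0 * fooled / len(results)


# Made-up session: the discriminator was fooled on 2 of 4 pairs.
results = [
    PairResult("work email", discriminator_correct=True),
    PairResult("casual chat", discriminator_correct=False),
    PairResult("code review", discriminator_correct=True),
    PairResult("weekend plans", discriminator_correct=False),
]
score = fidelity_score(results)
```

A higher score means the clone is harder to tell apart from the user; how a "can't tell" verdict is counted is left open here and would need to be decided by the implementation.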

Commands

/twin-test -- Run a full twin test session (3-5 message pairs)
/twin-test score -- Show fidelity score history
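As a rough illustration of what /twin-test score might print, the sketch below formats a score history with the change since the previous run. The storage format (a list of date and score tuples) and the sample values are assumptions made for the example; the listing does not say how scores are persisted.

```python
def score_history_lines(history: list[tuple[str, float]]) -> list[str]:
    """Format (date, fidelity %) entries, oldest first, with change vs. the prior run."""
    lines = []
    previous = None
    for date, score in history:
        if previous is None:
            change = "first run"
        else:
            change = f"{score - previous:+.0f} pts"
        lines.append(f"{date}  {score:.0f}%  ({change})")
        previous = score
    return lines


# Made-up history, purely illustrative.
history = [("2026-04-01", 40.0), ("2026-04-10", 60.0), ("2026-04-21", 50.0)]
report_lines = score_history_lines(history)
```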

Session Protocol

When the user invokes /twin-test, follow this exact protocol:

Phase 1: Sample Selection

Call memory_search with category "exemplar" to find high-quality real messages
If no exemplars, search for sent messages in memory (source: "sent", direction: "outgoing")
Select 3-5 diverse messages that cover different contexts (work, casual, technical)
For each message, extract the conversational context (what was the user replying to?)
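A minimal sketch of this selection logic, assuming the memory lookup is exposed as a callable that accepts keyword filters. The real signature of memory_search is not documented here, so the `search` parameter, the message dictionary shape, and the `context` label are all hypothetical stand-ins.

```python
from collections import defaultdict
from typing import Callable

Message = dict  # hypothetical shape: {"text": ..., "context": ...}


def select_samples(search: Callable[..., list], max_samples: int = 5) -> list:
    """Pick up to max_samples messages, spreading picks across contexts."""
    # Prefer exemplars; fall back to outgoing sent messages if there are none.
    candidates = search(category="exemplar")
    if not candidates:
        candidates = search(source="sent", direction="outgoing")

    # Group by context label so no single context dominates the sample.
    by_context = defaultdict(list)
    for msg in candidates:
        by_context[msg.get("context", "unknown")].append(msg)

    # Round-robin: take one message per context until the quota is met.
    picked = []
    while len(picked) < max_samples and any(by_context.values()):
        for bucket in by_context.values():
            if bucket and len(picked) < max_samples:
                picked.append(bucket.pop(0))
    return picked


def fake_search(**query):
    """Stand-in for the memory tool: no exemplars, four sent messages."""
    if query.get("category") == "exemplar":
        return []
    return [
        {"text": "Shipping Friday.", "context": "work"},
        {"text": "Will do, thanks.", "context": "work"},
        {"text": "lol same", "context": "casual"},
        {"text": "Try rebasing first.", "context": "technical"},
    ]


samples = select_samples(fake_search, max_samples=3)
```

The round-robin pass is one simple way to get the "diverse messages" the protocol asks for; other diversity heuristics would work as well.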

Phase 2: Clone Generation

For each sampled message:

Reconstruct the context: who was the user talking to, what was the conversation about
Generate YOUR response to the same context, using all personality data (user model, style profiles, exemplars, values)
Try your absolute best to match the user's voice -- this is the test

Important: generate responses BEFORE showing the user any results. Do not look at the real message while generating.

Phase 3: Discrimination

For each message pair (real + clone):

Present both messages to yourself in randomized order (A and B)
Act as a discriminator: which message is the real user and which is the clone?
Explain your reasoning: what specific markers distinguish the messages?
Note the confidence of your assessment
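The randomized A/B presentation can be sketched as follows. The function names are hypothetical; the point is that the answer key is kept apart from the labelled messages, so the discriminator sees only "A" and "B".

```python
import random


def blind_pair(real: str, clone: str, rng: random.Random) -> tuple:
    """Return ({"A": ..., "B": ...}, label_of_real) with the order chosen at random."""
    if rng.random() < 0.5:
        return {"A": real, "B": clone}, "A"
    return {"A": clone, "B": real}, "B"


def judge(guess: str, answer: str) -> bool:
    """True when the discriminator correctly picked the real message."""
    return guess == answer


rng = random.Random(7)  # seeded only to make the example repeatable
shown, answer = blind_pair(
    "sounds good, ship it",
    "That sounds good. Please ship it.",
    rng,
)
# `shown` goes to the discriminator; `answer` is revealed only after it has guessed.
```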

Phase 4: Results

Present a summary:

Twin Test Results
=================

Fidelity Score: XX% (X/Y pairs where discriminator was fooled)

Pair 1: [context summary]
  Real: "..." (correctly/incorrectly identified)
  Clone: "..."
  Discriminator notes: [what gave it away]

Pair 2: ...

Style Corrections:
- [specific corrections based on discriminator feedback]
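The template above could be filled in programmatically along these lines. This is a sketch only: the pair dictionary keys (`context`, `real`, `clone`, `identified`, `notes`) are hypothetical, and the sample pairs are made up.

```python
def render_results(pairs: list) -> str:
    """Render a results summary in the layout of the template."""
    total = len(pairs)
    # A pair counts as "fooled" when the discriminator did not identify the real message.
    fooled = sum(1 for p in pairs if not p["identified"])
    pct = round(100 * fooled / total) if total else 0
    lines = [
        "Twin Test Results",
        "=================",
        "",
        f"Fidelity Score: {pct}% ({fooled}/{total} pairs where discriminator was fooled)",
    ]
    for i, p in enumerate(pairs, start=1):
        verdict = "correctly" if p["identified"] else "incorrectly"
        real, clone = p["real"], p["clone"]
        lines += [
            "",
            f"Pair {i}: {p['context']}",
            f'  Real: "{real}" ({verdict} identified)',
            f'  Clone: "{clone}"',
            f"  Discriminator notes: {p['notes']}",
        ]
    # Only pairs where the clone was caught yield style corrections.
    lines += ["", "Style Corrections:"]
    lines += [f"- {p['notes']}" for p in pairs if p["identified"]]
    return "\n".join(lines)


pairs = [
    {"context": "reply to a teammate about a release", "real": "ship it",
     "clone": "Let's ship it.", "identified": True,
     "notes": "clone added a capital and a period"},
    {"context": "weekend plans with a friend", "real": "down for saturday",
     "clone": "down for sat", "identified": False,
     "notes": "no reliable tell"},
]
report = render_results(pairs)
```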

Phase 5: Corrections

For each pair where the discriminator correctly identified the clone:

Extract what was different (tone, word choice, length, punctuation, emoji usage, formality)
Store these as corrections in the user model at confidence 0.8
Update style profiles if specific style markers were identified
Tell the user what you learned
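One way to represent these corrections before storing them is sketched below. The 0.8 confidence comes from the protocol; the `Correction` record, the `differences` mapping, and the pair dictionary shape are hypothetical, and the actual user-model storage call is not shown because its API is not described here.

```python
from dataclasses import dataclass

CORRECTION_CONFIDENCE = 0.8  # value stated in the protocol


@dataclass
class Correction:
    dimension: str  # e.g. tone, word choice, length, punctuation, emoji usage, formality
    note: str
    confidence: float = CORRECTION_CONFIDENCE


def extract_corrections(pairs: list) -> list:
    """Build corrections only from pairs where the clone was correctly identified."""
    corrections = []
    for pair in pairs:
        if not pair["identified"]:
            continue  # discriminator was fooled, so there is nothing to correct
        for dimension, note in pair["differences"].items():
            corrections.append(Correction(dimension, note))
    return corrections


# Made-up discriminator output for two pairs.
pairs = [
    {"identified": True,
     "differences": {"formality": "slightly too formal",
                     "punctuation": "user rarely ends messages with a period"}},
    {"identified": False, "differences": {}},
]
corrections = extract_corrections(pairs)
```

Keeping one record per dimension makes it easier to report back to the user exactly what was learned.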

Phase 6: User Review

Ask the user:

"Were my assessments accurate? Did I identify the right messages as real?"
"Were any of these clone responses actually close to what you'd say?"
Accept corrections and store them

Important Rules

Blind generation -- generate clone responses BEFORE comparing to real messages
Honest assessment -- if you can't tell which is real, say so (that's a good fidelity sign)
Specific corrections -- "slightly too formal" is better than "didn't match"
Store corrections at 0.8 -- adversarial testing is high-quality signal
Track over time -- compare against previous twin-test scores
Diverse samples -- try to test across different contexts and platforms
Don't game it -- the goal is honest assessment of where the clone falls short

Capabilities

skill, source-project-nomos, skill-twin-test, topic-agent-memory, topic-agent-skills, topic-agentic-ai, topic-ai-agents, topic-ai-assistant, topic-autonomous-agents, topic-claude, topic-claude-ai, topic-claude-code, topic-claude-skills, topic-digital-clone, topic-llm

Install

Install: npx skills add project-nomos/nomos

Source: https://github.com/project-nomos/nomos/tree/main/skills/twin-test

skills.sh: https://skills.sh/project-nomos/nomos/twin-test

Transport: skills-sh

Protocol: skill

Quality

0.46 / 1.00

Deterministic score 0.46 from registry signals: indexed on GitHub topic:agent-skills · 14 GitHub stars · SKILL.md body (3,656 chars)

Provenance

Indexed from: github

Enriched: 2026-04-21 19:04:09Z · deterministic:skill-github:v1 · v1

First seen: 2026-04-21

Last seen: 2026-04-21

Agent access

JSON: https://clawmart.sh/api/listings/ZjjWpp