Skill quality: 0.45

recipe-eval-prompt

Compares original and optimized prompts by parallel execution in git worktrees. Use when evaluating prompt improvement effects or learning prompt engineering through concrete examples.

Price: free
Protocol: skill
Verified: no

What it does

Prompt Evaluation

Orchestrator Definition

Purpose: Provide accurate feedback on prompt optimization effects, enabling users to learn effective prompting through concrete comparison results.

Core Identity: "I route information between specialized agents. I pass user input to analyzers. I present agent outputs to users."

Pass-through Principle: User requests flow directly to agents. Agent outputs flow directly to users. Both prompts execute under identical conditions.

Execution Protocol:

  1. Delegate all work to sub-agents (orchestrator role only)
  2. Register all steps via TaskCreate before starting, update status via TaskUpdate upon completion

Phase Boundaries

No user confirmation required between phases unless explicitly requested. Each phase must complete all required outputs before proceeding.

Input

The user provides a natural language request. Pass it directly to prompt-analyzer.

Exception: If the request lacks any identifiable target (no file, function, or scope mentioned at all), ask ONE question to establish scope, then pass through.

Extended timeout: If the user mentions needing more time, use up to 1800 seconds (default: 300 seconds)

Execution Flow

Task Registration: Register execution steps via TaskCreate and proceed systematically

Step 1. Run Required Skills

Run worktree-execution skill.

Step 2. Prompt Analysis and Optimization

Invoke: prompt-analyzer agent

Input:

  • User's exact request text

Output:

  • Analysis results (detected patterns)
  • Optimized prompt
  • Applied optimizations list

Quality Gate:

  • Input contains user's request text only
  • Output presented to user matches agent's output

Step 3. Execution Environment Setup

Execute environment setup per worktree-execution skill "Creation" section.
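The "Creation" step can be illustrated with plain git commands. This is a minimal sketch, assuming one worktree per prompt variant; the paths and branch names (`eval-original`, `eval-optimized`) are illustrative, not values defined by the worktree-execution skill:

```shell
# Per-variant isolation: one worktree per prompt, both branched
# from the same HEAD so execution conditions are identical.
set -eu
base=$(mktemp -d)
git init -q "$base/repo"
cd "$base/repo"
git -c user.name=eval -c user.email=eval@example.com \
  commit -q --allow-empty -m "baseline"

git worktree add -q -b eval-original  "$base/eval-original"  HEAD
git worktree add -q -b eval-optimized "$base/eval-optimized" HEAD

git worktree list
```

Because both worktrees branch from the same commit, any difference in the results is attributable to the prompts rather than the starting state.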

Step 4. Parallel Execution

Invoke: Two prompt-executor agents simultaneously (single message, parallel Task calls)

Subagent 1:
  agent: prompt-executor
  working_directory: {worktree_original_path}
  prompt: {original_request}

Subagent 2:
  agent: prompt-executor
  working_directory: {worktree_optimized_path}
  prompt: {optimized_request}

Each subagent executes the prompt as a development task within its isolated worktree.

CRITICAL: Both Task tool calls MUST be in the same message to achieve true parallel execution.
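The parallelism requirement above can be sketched with background jobs. `run_executor` is a hypothetical stand-in for invoking a prompt-executor agent, not a real CLI:

```shell
# Both executors start before either is awaited, mirroring the
# "single message, parallel Task calls" requirement.
set -eu
run_executor() {
  # $1 = worktree path, $2 = prompt text (stand-in for the real agent call)
  printf '%s\n' "$2" > "$1/result.txt"
}

wt_original=$(mktemp -d)
wt_optimized=$(mktemp -d)

run_executor "$wt_original"  "original prompt"  &
run_executor "$wt_optimized" "optimized prompt" &
wait  # report generation only starts after both complete
```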

Step 5. Environment Cleanup

Execute worktree cleanup per worktree-execution skill "Cleanup" section.
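The corresponding cleanup, again sketched with plain git under the same assumed paths and branch names (the worktree-execution skill defines the actual values):

```shell
# Setup: a throwaway repo with one worktree, so cleanup has something to remove.
set -eu
base=$(mktemp -d)
git init -q "$base/repo"
cd "$base/repo"
git -c user.name=eval -c user.email=eval@example.com \
  commit -q --allow-empty -m "baseline"
git worktree add -q -b eval-tmp "$base/eval-tmp" HEAD

# Cleanup: remove the worktree checkout, prune stale metadata,
# then delete the evaluation branch.
git worktree remove --force "$base/eval-tmp"
git worktree prune
git branch -q -D eval-tmp
```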

Step 6. Report Generation

Invoke: report-generator agent

Input:

  • Original and optimized prompts
  • Execution results from both subagents
  • Applied optimizations list

Output:

  • Comparison report (markdown)
  • Improvement classification (structural / context addition / expressive / variance)

Quality Gate:

  • Output presented to user matches agent's output

Step 7. Retrospective

Trigger: Report generation completes

Action: Ask the user for feedback on the comparison results, then delegate to the knowledge-optimizer agent

Improvement Classification

Apply the execution quality criteria from the prompt-optimization skill.

| Classification | Definition | Interpretation |
| --- | --- | --- |
| Structural | Prompt structure, clarity, specificity improvements | Prompt writing technique |
| Context Addition | Project-specific information added from codebase investigation | Information advantage |
| Expressive | Different phrasing, equivalent substance | Neutral |
| Variance | Within LLM probabilistic variance | Original prompt sufficient |

Key Principle: Distinguish between prompt writing improvements (Structural) and information additions (Context Addition).

Final Output to User

Present report-generator's complete output to the user. The optimized prompt must appear in full; this is the core learning value of the report.

The report includes (defined in report-generator):

  • Input Prompts (original and optimized full text)
  • Optimizations Applied
  • Execution Results
  • Comparison Analysis
  • Learning Points

Error Handling

| Scenario | Behavior |
| --- | --- |
| One subagent fails | Continue with successful result, report as "partial" |
| Both subagents fail | Report full failure with diagnostics |
| Timeout | Terminate, capture partial results, cleanup |
| Worktree creation fails | Report git error, suggest checking repository state |
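The first two rows reduce to classifying two exit statuses. A minimal sketch, where `fake_run` is a hypothetical simulator of an executor's exit code:

```shell
# One executor succeeds, one fails: the run is classified "partial".
fake_run() { return "$1"; }

fake_run 0 & pid_ok=$!
fake_run 1 & pid_bad=$!

fail_a=0; fail_b=0
wait "$pid_ok"  || fail_a=1
wait "$pid_bad" || fail_b=1

if   [ $fail_a -eq 0 ] && [ $fail_b -eq 0 ]; then status=success
elif [ $fail_a -eq 1 ] && [ $fail_b -eq 1 ]; then status=failure
else                                              status=partial
fi
echo "$status"   # prints "partial"
```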

Prerequisites

  • Git repository (git 2.5+ for worktree support)
  • Claude Code subagent execution permissions
  • Sufficient disk space for worktree copies
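A quick preflight check for the git prerequisite; the version parsing assumes the usual `git version X.Y.Z` output format:

```shell
# Worktrees require git >= 2.5; fail fast with a clear message otherwise.
set -eu
ver=$(git --version | awk '{print $3}')
major=${ver%%.*}
rest=${ver#*.}
minor=${rest%%.*}
if [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 5 ]; }; then
  echo "git $ver: worktree support OK"
else
  echo "git $ver is too old for worktrees (need 2.5+)" >&2
  exit 1
fi
```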

Usage Examples

/recipe-eval-prompt
Add error handling to generateResponse in geminiService.ts. Handle 429, timeout, and invalid responses.
/recipe-eval-prompt
Generate code following this skill: .claude/skills/my-skill/SKILL.md

For complex tasks:

/recipe-eval-prompt
Refactor the message pipeline for readability. This may take a while.

Capabilities

skill · source-shinpr · skill-recipe-eval-prompt · topic-agent-skills · topic-ai-tools · topic-claude-code · topic-claude-code-plugin · topic-developer-tools · topic-evaluation · topic-llm · topic-prompt-engineering · topic-prompt-evaluation · topic-prompt-optimization · topic-skills

Install

Install: npx skills add shinpr/rashomon
Transport: skills-sh
Protocol: skill

Quality

0.45 / 1.00

Deterministic score 0.45 from registry signals: indexed on github topic:agent-skills · 9 github stars · SKILL.md body (4,961 chars)

Provenance

Indexed from: github
Enriched: 2026-04-24 07:03:39Z · deterministic:skill-github:v1 · v1
First seen: 2026-04-23
Last seen: 2026-04-24