recipe-eval-prompt
Compares an original prompt and its optimized version by executing both in parallel in separate git worktrees. Use it to evaluate the effect of prompt improvements or to learn prompt engineering through concrete, side-by-side examples.
What it does
Prompt Evaluation
Orchestrator Definition
Purpose: Provide accurate feedback on prompt optimization effects, enabling users to learn effective prompting through concrete comparison results.
Core Identity: "I route information between specialized agents. I pass user input to analyzers. I present agent outputs to users."
Pass-through Principle: User requests flow directly to agents. Agent outputs flow directly to users. Both prompts execute under identical conditions.
Execution Protocol:
- Delegate all work to subagents (orchestrator role only)
- Register all steps via TaskCreate before starting, and update their status via TaskUpdate as each completes
Phase Boundaries
No user confirmation required between phases unless explicitly requested. Each phase must complete all required outputs before proceeding.
Input
The user provides a natural language request. Pass it directly to prompt-analyzer.
Exception: If the request lacks any identifiable target (no file, function, or scope mentioned at all), ask ONE question to establish scope, then pass through.
Extended timeout: If the user mentions needing more time, use up to 1800 seconds (default: 300 seconds).
Execution Flow
Task Registration: Register execution steps via TaskCreate and proceed systematically
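The exact TaskCreate/TaskUpdate call shapes come from the host tooling, not from this document. A minimal sketch, assuming a hypothetical in-memory step list that mirrors Steps 1-7 below:

```typescript
// Hypothetical stand-ins for TaskCreate / TaskUpdate; the real tool call
// shapes are defined by the host tooling, not by this sketch.
type TaskStatus = "pending" | "in_progress" | "completed";

interface TrackedStep {
  id: number;
  description: string;
  status: TaskStatus;
}

const stepDescriptions = [
  "Run required skills (worktree-execution)",
  "Prompt analysis and optimization (prompt-analyzer)",
  "Execution environment setup (worktrees)",
  "Parallel execution (two prompt-executor agents)",
  "Environment cleanup",
  "Report generation (report-generator)",
  "Retrospective (knowledge-optimizer)",
];

// Register all steps up front, then mark each one as it completes.
const steps: TrackedStep[] = stepDescriptions.map((description, i) => ({
  id: i + 1,
  description,
  status: "pending" as const,
}));

function updateTask(id: number, status: TaskStatus): void {
  const step = steps.find((s) => s.id === id);
  if (step) step.status = status;
}

updateTask(1, "completed"); // e.g. after the worktree-execution skill has run
```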
Step 1. Run Required Skills
Run the worktree-execution skill.
Step 2. Prompt Analysis and Optimization
Invoke: prompt-analyzer agent
Input:
- User's exact request text
Output:
- Analysis results (detected patterns)
- Optimized prompt
- Applied optimizations list
Quality Gate:
- Input contains user's request text only
- Output presented to user matches agent's output
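A minimal sketch of the data exchanged with prompt-analyzer, assuming field names that are not specified anywhere in this document; the agent's real output is free-form text shaped by its own definition.

```typescript
// Assumed field names; the prompt-analyzer agent defines its actual output format.
interface AnalyzerInput {
  userRequest: string; // the user's exact request text, unmodified
}

interface AnalyzerOutput {
  detectedPatterns: string[];     // analysis results
  optimizedPrompt: string;        // the rewritten prompt
  appliedOptimizations: string[]; // list of optimizations applied
}

// Quality gate: the input is the user's text only, and the output shown to
// the user is the agent's output verbatim.
function passThrough(output: AnalyzerOutput): AnalyzerOutput {
  return output; // no rewriting by the orchestrator
}
```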
Step 3. Execution Environment Setup
Execute environment setup per the "Creation" section of the worktree-execution skill.
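The worktree-execution skill owns the actual procedure; the following is only an illustrative sketch of what its "Creation" section amounts to, with branch and directory names invented for the example.

```typescript
// Illustrative only: the worktree-execution skill defines the real procedure.
// Branch and directory names below are invented for this example.
import { execSync } from "node:child_process";

function createWorktree(path: string, branch: string): void {
  // One isolated working copy per prompt, branched off the current HEAD.
  execSync(`git worktree add -b ${branch} ${path}`, { stdio: "inherit" });
}

const worktreeOriginalPath = ".worktrees/eval-original";
const worktreeOptimizedPath = ".worktrees/eval-optimized";

createWorktree(worktreeOriginalPath, "eval/original");
createWorktree(worktreeOptimizedPath, "eval/optimized");
```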
Step 4. Parallel Execution
Invoke: Two prompt-executor agents simultaneously (single message, parallel Task calls)
Subagent 1:
agent: prompt-executor
working_directory: {worktree_original_path}
prompt: {original_request}
Subagent 2:
agent: prompt-executor
working_directory: {worktree_optimized_path}
prompt: {optimized_request}
Each subagent executes the prompt as a development task within its isolated worktree.
CRITICAL: Both Task tool calls MUST be in the same message to achieve true parallel execution.
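In Claude Code this means two Task tool calls issued in a single message. The sketch below models the same idea as two promises started before either is awaited, with a hypothetical `runSubagent` helper standing in for the Task tool.

```typescript
// Hypothetical helper standing in for the Task tool. The point is that both
// executions are started before either is awaited, i.e. they run in parallel.
interface SubagentCall {
  agent: "prompt-executor";
  workingDirectory: string;
  prompt: string;
}

declare function runSubagent(call: SubagentCall): Promise<string>;

async function runBoth(
  originalRequest: string,
  optimizedRequest: string,
  worktreeOriginalPath: string,
  worktreeOptimizedPath: string,
): Promise<[string, string]> {
  // Start both first (the "same message" requirement), then await together.
  const original = runSubagent({
    agent: "prompt-executor",
    workingDirectory: worktreeOriginalPath,
    prompt: originalRequest,
  });
  const optimized = runSubagent({
    agent: "prompt-executor",
    workingDirectory: worktreeOptimizedPath,
    prompt: optimizedRequest,
  });
  return Promise.all([original, optimized]);
}
```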
Step 5. Environment Cleanup
Execute worktree cleanup per the "Cleanup" section of the worktree-execution skill.
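Again, the skill's "Cleanup" section is authoritative; a minimal sketch of what git worktree cleanup typically looks like, reusing the invented paths from the setup sketch above:

```typescript
// Illustrative only; the worktree-execution skill's "Cleanup" section is
// authoritative. Paths match the invented ones from the setup sketch.
import { execSync } from "node:child_process";

function removeWorktree(path: string): void {
  execSync(`git worktree remove --force ${path}`, { stdio: "inherit" });
}

[".worktrees/eval-original", ".worktrees/eval-optimized"].forEach(removeWorktree);
execSync("git worktree prune", { stdio: "inherit" });
```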
Step 6. Report Generation
Invoke: report-generator agent
Input:
- Original and optimized prompts
- Execution results from both subagents
- Applied optimizations list
Output:
- Comparison report (markdown)
- Improvement classification (structural / context addition / expressive / variance)
Quality Gate:
- Output presented to user matches agent's output
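A sketch of the report-generator contract under assumed field names; the real deliverable is a markdown report whose sections are listed under "Final Output to User" below.

```typescript
// Assumed field names; the actual report is markdown produced by the
// report-generator agent and presented to the user unchanged.
type Classification = "structural" | "context-addition" | "expressive" | "variance";

interface ReportInput {
  originalPrompt: string;
  optimizedPrompt: string;
  originalResult: string;         // execution result from subagent 1
  optimizedResult: string;        // execution result from subagent 2
  appliedOptimizations: string[];
}

interface ReportOutput {
  markdown: string;               // the comparison report shown to the user
  classification: Classification; // structural / context addition / expressive / variance
}
```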
Step 7. Retrospective
Trigger: Report generation completes
Action: Ask the user for feedback on the comparison results, then delegate to the knowledge-optimizer agent
Improvement Classification
Apply the execution quality criteria from the prompt-optimization skill.
| Classification | Definition | Interpretation |
|---|---|---|
| Structural | Prompt structure, clarity, specificity improvements | Prompt writing technique |
| Context Addition | Project-specific information added from codebase investigation | Information advantage |
| Expressive | Different phrasing, equivalent substance | Neutral |
| Variance | Within LLM probabilistic variance | Original prompt sufficient |
Key Principle: Distinguish between prompt writing improvements (Structural) and information additions (Context Addition).
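As a reading aid, the table above can be reduced to a small lookup; the interpretations below mirror the table and imply no criteria beyond it.

```typescript
// Mirrors the classification table above; no additional criteria are implied.
type Classification = "structural" | "context-addition" | "expressive" | "variance";

const interpretation: Record<Classification, string> = {
  "structural": "Prompt writing technique improved the result",
  "context-addition": "Information advantage from codebase investigation, not writing technique",
  "expressive": "Different phrasing, equivalent substance (neutral)",
  "variance": "Within LLM probabilistic variance; the original prompt was sufficient",
};

function interpret(c: Classification): string {
  return interpretation[c];
}
```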
Final Output to User
Present the report-generator's complete output to the user. The optimized prompt must appear in full; it is the core learning value of the report.
The report includes (defined in report-generator):
- Input Prompts (original and optimized full text)
- Optimizations Applied
- Execution Results
- Comparison Analysis
- Learning Points
Error Handling
| Scenario | Behavior |
|---|---|
| One subagent fails | Continue with the successful result; report the run as "partial" |
| Both subagents fail | Report full failure with diagnostics |
| Timeout | Terminate, capture partial results, then clean up |
| Worktree creation fails | Report git error, suggest checking repository state |
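The first three rows reduce to a simple status decision; the sketch below uses invented status labels, and the table above remains the authoritative behavior.

```typescript
// Invented status labels; the error-handling table above is authoritative.
type SubagentOutcome = { ok: true; result: string } | { ok: false; error: string };

type RunStatus = "success" | "partial" | "failure";

function classifyRun(a: SubagentOutcome, b: SubagentOutcome): RunStatus {
  if (a.ok && b.ok) return "success";
  if (a.ok || b.ok) return "partial"; // continue with the successful result
  return "failure";                   // report full failure with diagnostics
}
```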
Prerequisites
- Git repository (git 2.5+ for worktree support)
- Claude Code subagent execution permissions
- Sufficient disk space for worktree copies
Usage Examples
/recipe-eval-prompt
Add error handling to generateResponse in geminiService.ts. Handle 429, timeout, and invalid responses.
/recipe-eval-prompt
Generate code following this skill: .claude/skills/my-skill/SKILL.md
For complex tasks:
/recipe-eval-prompt
Refactor the message pipeline for readability. This may take a while.