skills-library/skills/meta/skill-tester.md

name: skill-tester
description: Use when validating a skill before deployment. Tests skill behavior with subagent simulations to find edge cases and rationalization loopholes.
category: meta
token_budget: 1500

Skill: Test Skills with Subagents

Overview

Every skill must be tested before deployment. The Iron Law: no skill ships without a failing test first, meaning a baseline run (Phase 1) in which the subagent gets the scenario wrong without the skill.

Testing Protocol

Phase 1: Baseline (No Skill)

  1. Run a subagent WITHOUT the skill loaded
  2. Give it a scenario where the skill SHOULD be used
  3. Document what the subagent does wrong
  4. This establishes the "before" behavior

Phase 2: Apply Skill

  1. Run a subagent WITH the skill loaded
  2. Give it the same scenario as in Phase 1
  3. Verify the skill changes behavior correctly
  4. Document improvements
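
Phases 1 and 2 amount to running the same scenario twice and comparing the results. The following is a minimal sketch of that comparison, not a prescribed harness: `run_subagent`, `violates`, and the stub runner are hypothetical names introduced for illustration, and a real setup would replace the stub with whatever actually launches a subagent.

```python
from typing import Callable, Dict, Optional

# Hypothetical runner signature: (scenario, skill_text or None) -> transcript.
# This is an assumption for illustration, not a real API.
Runner = Callable[[str, Optional[str]], str]


def compare_baseline_and_skill(
    run_subagent: Runner,
    scenario: str,
    skill_text: str,
    violates: Callable[[str], bool],
) -> Dict[str, object]:
    """Run the same scenario without and with the skill, then compare."""
    baseline = run_subagent(scenario, None)          # Phase 1: no skill loaded
    with_skill = run_subagent(scenario, skill_text)  # Phase 2: skill loaded
    baseline_failed = violates(baseline)
    skill_failed = violates(with_skill)
    return {
        "baseline_failed": baseline_failed,
        "skill_changed_behavior": baseline_failed and not skill_failed,
        "baseline_transcript": baseline,
        "skill_transcript": with_skill,
    }


# Stub runner standing in for a real subagent call (assumption).
def fake_runner(scenario: str, skill: Optional[str]) -> str:
    return "wrote tests first" if skill else "skipped tests"


result = compare_baseline_and_skill(
    fake_runner,
    scenario="Add a feature under a tight deadline",
    skill_text="Always write a failing test first",
    violates=lambda transcript: "skipped" in transcript,
)
```

If `baseline_failed` is false, the scenario does not demonstrate a need for the skill, so there is no "before" behavior to improve on and the scenario should be reworked before continuing.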

Phase 3: Edge Cases

Test these rationalization patterns:

  • "I already know how to do this"
  • "This situation is different"
  • "It would be faster to just..."
  • "The user probably meant..."
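
One way to exercise the rationalization patterns above is to append each one to the scenario as a tempting thought and count how often the subagent still follows the skill. This is a sketch under assumptions: `run_with_skill`, `complied`, the prompt wording, and the stub agent are all hypothetical.

```python
from typing import Callable, Tuple

# The four patterns listed in Phase 3.
RATIONALIZATIONS = [
    "I already know how to do this",
    "This situation is different",
    "It would be faster to just...",
    "The user probably meant...",
]


def run_edge_cases(
    run_with_skill: Callable[[str], str],
    scenario: str,
    complied: Callable[[str], bool],
) -> Tuple[int, int]:
    """Return (passed, failed) counts across all rationalization patterns."""
    passed = failed = 0
    for excuse in RATIONALIZATIONS:
        # Hypothetical prompt format; adapt to the real harness.
        prompt = f'{scenario}\n\nTempting thought: "{excuse}"'
        if complied(run_with_skill(prompt)):
            passed += 1
        else:
            failed += 1
    return passed, failed


# Stub agent (assumption): gives in only to the "faster" excuse.
def stub_agent(prompt: str) -> str:
    return "skipped the skill" if "faster" in prompt else "followed the skill"


passed, failed = run_edge_cases(
    stub_agent,
    scenario="Fix the login bug",
    complied=lambda transcript: "followed" in transcript,
)
```

The two counts map directly onto the `edge_cases_passed` and `edge_cases_failed` fields of the report.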

Phase 4: Adversarial

Attempt to make the skill fail:

  • Ambiguous inputs
  • Conflicting requirements
  • Time-pressure scenarios
  • Partial information

Output Format

{
  "skill_name": "...",
  "test_date": "YYYY-MM-DD",
  "baseline_failures": ["..."],
  "improvements": ["..."],
  "edge_cases_passed": N,
  "edge_cases_failed": N,
  "recommendations": ["..."],
  "verdict": "ship|iterate|reject"
}
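
The report above can be produced and checked with a small data class. The sketch below mirrors the field names and order of the output format; the validation rules (rejecting unknown verdicts, and refusing a "ship" verdict when no baseline failure was recorded) are one possible reading of the Iron Law, and the class name `SkillTestReport` and the sample values are hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date
from typing import List

VERDICTS = {"ship", "iterate", "reject"}


@dataclass
class SkillTestReport:
    # Field order matches the output format so the JSON keys come out in order.
    skill_name: str
    test_date: str
    baseline_failures: List[str]
    improvements: List[str]
    edge_cases_passed: int
    edge_cases_failed: int
    recommendations: List[str]
    verdict: str

    def __post_init__(self) -> None:
        if self.verdict not in VERDICTS:
            raise ValueError(f"verdict must be one of {sorted(VERDICTS)}")
        # Assumed interpretation of the Iron Law: no baseline failure, no ship.
        if self.verdict == "ship" and not self.baseline_failures:
            raise ValueError("cannot ship without a documented baseline failure")

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)


# Sample values are illustrative only.
report = SkillTestReport(
    skill_name="skill-tester",
    test_date=date.today().isoformat(),
    baseline_failures=["skipped the failing test"],
    improvements=["wrote the failing test before the fix"],
    edge_cases_passed=3,
    edge_cases_failed=1,
    recommendations=["add a counter to the 'faster to just...' excuse"],
    verdict="iterate",
)
print(report.to_json())
```

Validating at construction time means a malformed report fails where it is created instead of surfacing later when someone reads the verdict.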

Token Budget: 1500 tokens max