Skill: Design Thinking - Test

Description

Validate prototypes with real users to gather feedback, measure success, and decide next steps.

Input

  • prototype: Prototype to test (required)
  • test_type: usability|ab_test|interview|analytics (optional, default: usability)
  • user_segments: Which users to test with (required)
  • success_criteria: What defines success (required)

Testing Frameworks

1. Usability Testing

5-User Rule: Testing with 5 users typically surfaces about 85% of usability issues.
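
Where the 85% figure comes from: Nielsen and Landauer model problem discovery as 1 - (1 - p)^n, where p ≈ 0.31 is the average chance that a single user hits a given issue. A quick sanity check of that formula:

```python
# Problem-discovery model: P(found) = 1 - (1 - p)^n, with p the chance
# that one user encounters a given issue (~0.31 in Nielsen & Landauer's
# data) and n the number of test users.
p = 0.31
for n in (1, 3, 5, 10):
    print(f"{n} users -> {1 - (1 - p) ** n:.1%} of issues found")
# 5 users -> 84.4%, i.e. roughly the 85% cited above
```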

Test Script Template:

INTRO (2 mins)
- Thanks for participating
- Testing product, not you
- Think aloud encouraged
- No wrong answers

CONTEXT (1 min)
- Scenario: "You need to create a new campaign..."
- Goal: "Complete the task as you normally would"

TASKS (10-15 mins)
Task 1: [Specific scenario]
- Observe: Where do they pause?
- Note: What do they say?
- Ask: "What are you thinking?"

Task 2: [Next scenario]
- Same observation process

DEBRIEF (5 mins)
- What was easy?
- What was confusing?
- What would you change?
- Would you use this?

What to Observe:

  • Time to complete task
  • Number of errors/retries
  • Hesitation points
  • Verbal confusion
  • Emotional reactions
  • Workarounds attempted
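
A lightweight way to log these observations consistently is one record per user per task. The schema below is illustrative only, not part of this skill's input/output contract:

```python
from dataclasses import dataclass, field

@dataclass
class TaskObservation:
    """One user's attempt at one task (hypothetical logging schema)."""
    user_id: str
    task: str
    seconds_to_complete: float | None  # None = did not finish
    errors: int = 0
    hesitation_points: list[str] = field(default_factory=list)
    quotes: list[str] = field(default_factory=list)
    workarounds: list[str] = field(default_factory=list)

obs = TaskObservation(
    user_id="u1",
    task="Create a new campaign",
    seconds_to_complete=42.0,
    errors=1,
    hesitation_points=["template dropdown"],
    quotes=["not sure what to pick"],
)
```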

2. Feedback Collection Templates

Quick Rating Scale:

After each task:
"How easy was that? (1-5)"
1 = Very difficult
2 = Difficult
3 = Okay
4 = Easy
5 = Very easy

System Usability Scale (SUS):

Rate 1-5 (1=Strongly Disagree, 5=Strongly Agree):
1. I would use this frequently
2. I found it unnecessarily complex
3. I found it easy to use
4. I would need support to use this
5. Features were well integrated
6. There was too much inconsistency
7. Most people would learn quickly
8. I found it cumbersome
9. I felt confident using it
10. I needed to learn a lot first

Score: ((sum of odd items - 5) + (25 - sum of even items)) * 2.5
Score > 68 = above average
Score < 68 = below average
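
The same arithmetic as a small helper, assuming responses are recorded in item order on the 1-5 scale:

```python
def sus_score(responses: list[int]) -> float:
    """Compute the System Usability Scale score from the 10 item ratings."""
    assert len(responses) == 10 and all(1 <= r <= 5 for r in responses)
    odd = sum(responses[0::2])   # items 1,3,5,7,9 (positively worded)
    even = sum(responses[1::2])  # items 2,4,6,8,10 (negatively worded)
    return ((odd - 5) + (25 - even)) * 2.5

print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # 75.0 -> above average
```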

Post-Task Interview:

1. What did you like most?
2. What frustrated you?
3. What would you change?
4. Would you use this over current solution?
5. What's missing?

3. A/B Testing Patterns

Simple A/B Test:

{
  "test_name": "Campaign Creation Flow",
  "hypothesis": "1-step flow will increase completion by 30%",
  "variant_a": {
    "name": "Control (3-step)",
    "users": 50,
    "metric": "completion_rate"
  },
  "variant_b": {
    "name": "Treatment (1-step)",
    "users": 50,
    "metric": "completion_rate"
  },
  "duration": "7 days",
  "success_threshold": "30% improvement"
}
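
To judge whether an observed lift is more than noise, one common check is a two-proportion z-test on the completion rates. The sketch below uses only the standard library; the conversion counts are invented for illustration, and with only 50 users per arm, only large lifts will reach significance:

```python
from math import erf, sqrt

def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-sided z-test for a difference in completion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pool = (conv_a + conv_b) / (n_a + n_b)          # pooled rate under H0
    se = sqrt(pool * (1 - pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # normal CDF via erf
    return z, p_value

# Hypothetical result: control 30/50 complete, treatment 42/50 complete.
z, p = two_proportion_z(conv_a=30, n_a=50, conv_b=42, n_b=50)
print(f"z={z:.2f}, p={p:.3f}")  # z=2.67, p=0.008 -> likely a real lift
```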

Metrics to Track:

  • Completion rate
  • Time to complete
  • Error rate
  • Drop-off points
  • User satisfaction
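
Drop-off points in particular fall out of a simple funnel count: for each step, how many users got at least that far. Step names and data below are made up; the largest percentage drop between adjacent steps marks the drop-off point:

```python
from collections import Counter

FUNNEL = ["open_form", "pick_template", "fill_details", "submit"]
# Last step each user reached (hypothetical session data).
last_step = {"u1": "submit", "u2": "pick_template", "u3": "submit",
             "u4": "fill_details", "u5": "pick_template"}

reached = Counter()
for step in last_step.values():
    # A user who reached step i also passed through every earlier step.
    for s in FUNNEL[: FUNNEL.index(step) + 1]:
        reached[s] += 1

n = len(last_step)
for step in FUNNEL:
    print(f"{step}: {reached[step]}/{n} ({reached[step] / n:.0%})")
# pick_template -> fill_details loses 40% here: that's the drop-off point
```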

4. Iteration Decision Criteria

Go/No-Go Framework:

SHIP IT if:
- [ ] 80%+ users complete task successfully
- [ ] Average task time meets target
- [ ] SUS score > 68
- [ ] No critical usability issues
- [ ] Users prefer it to current solution

ITERATE if:
- [ ] 50-80% success rate
- [ ] Task time 1-3x target
- [ ] SUS score 50-68
- [ ] 2+ moderate issues
- [ ] Mixed user preference

PIVOT if:
- [ ] <50% success rate
- [ ] Task time >3x target
- [ ] SUS score <50
- [ ] Critical blocking issues
- [ ] Users prefer current solution
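
One way to encode the framework in code, using the thresholds above; how the checklists combine (ship only if everything passes, pivot if any red flag fires) is an assumption, so adjust to taste:

```python
def verdict(success_rate: float, time_ratio: float, sus: float,
            critical_issues: int, prefers_new: bool) -> str:
    """Map test results to ship/iterate/pivot (thresholds from the checklist)."""
    if success_rate < 0.50 or time_ratio > 3.0 or sus < 50 or critical_issues > 0:
        return "pivot"
    if (success_rate >= 0.80 and time_ratio <= 1.0 and sus > 68
            and prefers_new):
        return "ship"
    return "iterate"

print(verdict(0.80, 1.4, sus=72, critical_issues=0, prefers_new=False))  # iterate
```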

Output Format

{
  "status": "success",
  "test_summary": {
    "type": "usability_test",
    "users_tested": 5,
    "date": "2024-12-14",
    "duration": "45 minutes total"
  },
  "results": {
    "completion_rate": "80%",
    "avg_time": "8 seconds",
    "sus_score": 72,
    "user_satisfaction": "4.2/5"
  },
  "key_findings": [
    {
      "issue": "Users confused by template dropdown",
      "severity": "moderate",
      "users_affected": 3,
      "evidence": "Hesitated 5+ seconds, said 'not sure what to pick'"
    },
    {
      "issue": "Success message not clear",
      "severity": "low",
      "users_affected": 2,
      "evidence": "Asked 'did it work?'"
    }
  ],
  "positive_feedback": [
    "Much faster than current process",
    "Templates are helpful",
    "Clean and simple interface"
  ],
  "improvement_suggestions": [
    "Add template preview on hover",
    "Show success confirmation clearly",
    "Save last-used template as default"
  ],
  "decision": {
    "verdict": "iterate",
    "reasoning": "80% success rate is good but template confusion needs fix",
    "next_actions": [
      "Add template previews",
      "Improve success feedback",
      "Test again with 3 users"
    ]
  },
  "next_step": "Implement improvements and /dt loop back to testing"
}

Quality Gates

  • Tested with 5+ users per segment
  • Clear success/failure criteria defined
  • Quantitative data collected (time, completion, SUS)
  • Qualitative feedback captured (quotes, observations)
  • Decision made (ship/iterate/pivot)
  • Next actions defined

Token Budget

  • Max input: 800 tokens
  • Max output: 1800 tokens

Model

  • Recommended: sonnet (pattern analysis)

Philosophy

"In God we trust. All others must bring data." Test with users, not assumptions.

Keep it simple:

  • 5 users find 85% of issues
  • Watch what they do, not what they say
  • Quantify where possible
  • Act on feedback fast
  • Test early, test often