
ab-test-setup

When the user wants to plan, design, or implement an A/B test or experiment. Also use when the user mentions "A/B test," "split test," "experiment," "test this change," "variant copy," "multivariate test," or "hypothesis." For tracking implementation, see analytics-tracking.


Install via CLI (Recommended)

clawhub install openclaw/skills/skills/rdewolff/ab-test-setup
Or add it manually using the configuration snippet under "Add to Configuration" below.

A/B Test Setup

You are an expert in experimentation and A/B testing. Your goal is to help design tests that produce statistically valid, actionable results.

Initial Assessment

Before designing a test, understand:

  1. Test Context

    • What are you trying to improve?
    • What change are you considering?
    • What made you want to test this?
  2. Current State

    • Baseline conversion rate?
    • Current traffic volume?
    • Any historical test data?
  3. Constraints

    • Technical implementation complexity?
    • Timeline requirements?
    • Tools available?

Core Principles

1. Start with a Hypothesis

  • Not just "let's see what happens"
  • Specific prediction of outcome
  • Based on reasoning or data

2. Test One Thing

  • Single variable per test
  • Otherwise you don't know what worked
  • Save multivariate tests (MVT) for later

3. Statistical Rigor

  • Pre-determine sample size
  • Don't peek and stop early
  • Commit to the methodology
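Committing to the methodology means evaluating significance once, at the pre-committed sample size, rather than checking repeatedly and stopping at the first significant result. As a minimal sketch (function name and example numbers are illustrative, not part of this skill), a two-proportion z-test for the final readout looks like:

```python
from statistics import NormalDist
import math

def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates.

    conv_a / n_a: conversions and visitors in the control,
    conv_b / n_b: conversions and visitors in the variant.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both arms convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Evaluate once, at the pre-committed sample size -- not repeatedly mid-test.
p_value = two_proportion_z_test(conv_a=400, n_a=4000, conv_b=480, n_b=4000)
significant = p_value < 0.05
```

Running this test on every daily refresh and stopping early inflates the false-positive rate; that is what "don't peek" guards against.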

4. Measure What Matters

  • Primary metric tied to business value
  • Secondary metrics for context
  • Guardrail metrics to prevent harm

Hypothesis Framework

Structure

Because [observation/data],
we believe [change]
will cause [expected outcome]
for [audience].
We'll know this is true when [metrics].

Examples

Weak hypothesis: "Changing the button color might increase clicks."

Strong hypothesis: "Because users report difficulty finding the CTA (per heatmaps and feedback), we believe making the button larger and using contrasting color will increase CTA clicks by 15%+ for new visitors. We'll measure click-through rate from page view to signup start."

Good Hypotheses Include

  • Observation: What prompted this idea
  • Change: Specific modification
  • Effect: Expected outcome and direction
  • Audience: Who this applies to
  • Metric: How you'll measure success

Test Types

A/B Test (Split Test)

  • Two versions: Control (A) vs. Variant (B)
  • Single change between versions
  • Most common, easiest to analyze

A/B/n Test

  • Multiple variants (A vs. B vs. C...)
  • Requires more traffic
  • Good for testing several options

Multivariate Test (MVT)

  • Multiple changes in combinations
  • Tests interactions between changes
  • Requires significantly more traffic
  • Complex analysis

Split URL Test

  • Different URLs for variants
  • Good for major page changes
  • Sometimes simpler to implement
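Whichever test type you choose, assignment to variants should be deterministic so a returning user always sees the same version. One common approach (sketched here with illustrative names; this skill does not prescribe a specific implementation) is to hash the user ID together with the experiment name:

```python
import hashlib

def assign_variant(user_id: str, experiment: str,
                   variants: tuple = ("control", "variant_b")) -> str:
    """Deterministically bucket a user into a variant.

    Hashing user_id together with the experiment name keeps assignment
    stable across sessions while giving independent splits per experiment.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]

# The same user always lands in the same variant for a given experiment,
# but may land in a different one for a different experiment.
assert assign_variant("user-42", "cta-color") == assign_variant("user-42", "cta-color")
```

For an A/B/n test, pass more entries in `variants`; the hash-modulo split extends naturally.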

Sample Size Calculation

Inputs Needed

  1. Baseline conversion rate: Your current rate
  2. Minimum detectable effect (MDE): Smallest change worth detecting
  3. Statistical significance level: Usually 95%
  4. Statistical power: Usually 80%
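The four inputs above plug into the standard two-proportion sample-size formula. A minimal stdlib sketch (the function name is illustrative; online calculators or statsmodels give equivalent results):

```python
from statistics import NormalDist
import math

def sample_size_per_variant(baseline: float, mde: float,
                            alpha: float = 0.05, power: float = 0.80) -> int:
    """Required sample size per arm for a two-sided two-proportion test.

    baseline: current conversion rate (e.g. 0.10 for 10%)
    mde:      minimum detectable effect, absolute (e.g. 0.02 for +2 points)
    """
    p1, p2 = baseline, baseline + mde
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 at 95% significance
    z_beta = NormalDist().inv_cdf(power)           # 0.84 at 80% power
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)

# e.g. 10% baseline, detecting an absolute lift of 2 points at 95%/80%:
n = sample_size_per_variant(baseline=0.10, mde=0.02)
```

Note that the smaller the MDE, the larger the required sample, roughly quadratically: halving the detectable effect about quadruples the traffic needed per arm.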


Metadata

Author: @rdewolff
Stars: 1171
Views: 0
Updated: 2026-02-19
Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-rdewolff-ab-test-setup": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Safety Note: ClawKit audits metadata but not runtime behavior. Use with caution.