Marketing TestingA/BExperiments

A/B Test Setup

Plan, design, and implement A/B tests and experiments.

Install for

Real Conversations

See how the A/B Test Setup skill responds to actual requests. These are representative examples of what your AI agent will produce.

You

We have a 'Start Free Trial' CTA on our pricing page converting at 3.2%. I want to test it against 'Get Started Free' and 'Try Pipedream Free'. We get about 4,000 unique visitors/month to that page. How do I set this up properly?

Agent

Test Design: 3-way CTA copy test

Sample size calculation:

Baseline: 3.2% conversion, MDE: 20% relative lift (0.64pp), α=0.05, power=80%
Required per variant: ~7,400 visitors
At 4,000/month split 3 ways (1,333/variant/month): **5.5 months to significance**

Recommendation: Reduce to 2 variants (control vs. best challenger) to reach significance in ~2.8 months.

GTM setup:

Trigger: Page URL contains /pricing
Variable: Custom JS → Math.random() < 0.5 ? 'variant' : 'control'
Tag: DOM manipulation → document.querySelector('.cta-primary').innerText
GA4 Event: experiment_assigned {experiment_id, variant}

What to track:

Primary: CTA click → trial signup (full funnel, not just click)
Secondary: scroll depth, time on page (detect sample pollution)

Guard rails:

Run minimum 2 full business cycles before calling
Check for novelty effect spike in week 1
Segment results by traffic source — paid vs. organic often behave differently

You

Our trial-to-paid conversion is 18%. We think the onboarding is too long — currently 7 steps. We want to test a 3-step version. How do we set this up without breaking our funnel tracking?

Agent

Test: Onboarding length (7-step control vs. 3-step variant)

Assignment strategy: Assign at signup, persist in user DB — not cookies. This prevents variant switching if users log in from multiple devices.

Instrumentation checklist:

GA4 Events to fire:
  onboarding_started        {variant, step_count}
  onboarding_step_complete  {variant, step_number, step_name}
  onboarding_completed      {variant, time_to_complete_seconds}
  trial_converted           {variant, days_since_signup}

Segment gating: Only new signups after launch date. Never expose existing users mid-onboarding.

Primary metric: Trial-to-paid at day 14 and day 30 Guardrail metrics: Support ticket rate, feature adoption depth (users who use ≥3 features)

Risk: A shorter onboarding may boost activation speed but reduce feature discovery. Track 60-day retention, not just conversion.

Timeline: At ~200 signups/month, you need ~370/variant → ~4 months. Consider running paid acquisition during the test to accelerate.

You

We want to A/B test showing annual pricing by default vs. monthly. Currently monthly is default and annual/monthly toggle is above the fold. Our annual plan attach rate is only 22%.

Agent

Test: Default billing toggle position (monthly vs. annual)

Hypothesis: Showing annual pricing by default anchors users to the lower per-month price and increases annual attach rate without hurting overall conversion.

Implementation via GTM:

// On DOMContentLoaded, variant flips toggle state
if (experimentVariant === 'annual-default') {
  document.querySelector('[data-billing="annual"]').click();
  // Fire before paint to avoid flicker — use GTM 'Consent Initialization' trigger
}

Metrics to track:

Metric	Control target	Variant goal
Annual attach rate	22%	35%+
Overall signup CVR	3.2%	≥3.0% (guardrail)
Avg contract value	baseline	+30%+

Anti-flicker: Inject via <head> synchronous tag in GTM, not async. Async causes visible toggle flash.

Call it when: Annual attach rate difference is statistically significant AND overall CVR is not degraded below -5% relative. Run for minimum 3 weeks.

Get Started

How to install and use this skill in your preferred environment.

Skills are designed for AI coding agents (Claude Code, Cursor, Windsurf) and IDE-based workflows where the agent can read files, run scripts, and act on your codebase.

Models & Context

Which AI models and context windows work best with this skill.

Recommended Models

Best

Claude Opus 4.8 Claude Sonnet 4 GPT-4.1 Gemini 2.5 Pro Grok 3 Kimi K2

Good

Claude Haiku 4.5 GPT-4.1 mini Gemini 2.5 Flash Grok 3 mini

Larger models produce more detailed, production-ready outputs.

Context Window

This skill's SKILL.md is typically 3–10 KB — fits in any modern context window.

8K Skill only

32K+ Skill + conversation

100K+ Skill + references + codebase

All current frontier models support 100K+ context.

Pro tips for best results

1

Be specific

Include numbers — users, budget, RPS — so the skill can size the architecture.

2

Share constraints

Compliance needs, team size, and existing stack all improve the output.

3

Iterate

Start with a high-level design, then ask follow-ups for IaC, cost analysis, or security review.

4

Combine skills

Pair with companion skills below for end-to-end coverage.

Ready to try A/B Test Setup?

Install the skill and start getting expert-level guidance in your workflow — any agent, any IDE.

Install for

← Browse all skills

A/B Test Setup

Real Conversations

Get Started

Claude Code, OpenCode, or any CLI agent

Cursor, Windsurf, or IDE-based agents

Claude.ai, ChatGPT, Gemini, or any web AI

Models & Context

Recommended Models

Context Window

Pro tips for best results

Ready to try A/B Test Setup?