Week one: the audit
The audit is the artifact that decides whether the sprint is going to work. It is a single document that maps the funnel by step, attaches a baseline conversion rate to each step, and flags the leaks ranked by impact and effort.
We pull from Shopify analytics, GA4, and session replay. Session replay is where the real findings come from. Numbers tell you something is wrong. Replays tell you why. Plan to watch 30 to 50 sessions in week one, weighted toward the highest-traffic step with the biggest gap to benchmark.
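As a rough illustration of the funnel map, here is a minimal Python sketch. The step names, counts, and benchmark rates are made-up placeholders, and `FunnelStep` and `rank_leaks` are hypothetical names, not part of any analytics tool. It ranks leaks by impact only, estimated as the sessions lost relative to benchmark. Effort still has to be judged by hand.

```python
from dataclasses import dataclass


@dataclass
class FunnelStep:
    name: str
    entered: int       # sessions that reached this step
    completed: int     # sessions that continued to the next step
    benchmark: float   # assumed benchmark conversion rate for this step

    @property
    def rate(self) -> float:
        """Baseline conversion rate for the step."""
        return self.completed / self.entered if self.entered else 0.0

    @property
    def lost_sessions(self) -> float:
        """Sessions that would have continued if the step hit benchmark."""
        return max(0.0, self.benchmark - self.rate) * self.entered


def rank_leaks(steps):
    """Order steps by estimated sessions lost versus benchmark, largest first."""
    return sorted(steps, key=lambda s: s.lost_sessions, reverse=True)


# Illustrative numbers only.
funnel = [
    FunnelStep("product page to cart", 10_000, 800, 0.10),
    FunnelStep("cart to checkout", 800, 480, 0.65),
    FunnelStep("checkout to purchase", 480, 264, 0.60),
]

for step in rank_leaks(funnel):
    print(f"{step.name}: {step.rate:.1%} vs {step.benchmark:.1%} benchmark, "
          f"about {step.lost_sessions:.0f} sessions lost")
```

Weighting the gap by traffic is what keeps a small gap on a busy step from being buried under a large gap on a quiet one.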
Week two: the hypothesis backlog
Every observation from week one becomes a hypothesis. The hypothesis is structured: if we change X on Y page for Z audience, then this metric will move by this amount, because this is the friction we observed.
We rank by impact and effort. Impact is potential lift on the metric that matters. Effort is engineering time. The top three to five hypotheses go into the sprint. The rest go into a parking lot the customer keeps after the engagement.
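The hypothesis structure and the ranking can be sketched in a few lines of Python. This is one possible shape, not a prescribed one: the 1-to-5 scales, the impact-divided-by-effort score, and the names `Hypothesis` and `split_backlog` are all assumptions made for illustration, and the example hypotheses are invented.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    change: str           # X: what we change
    page: str             # Y: where we change it
    audience: str         # Z: who sees it
    metric: str           # the metric expected to move
    expected_lift: float  # relative lift, e.g. 0.05 means +5 percent
    friction: str         # the friction observed in the audit
    impact: int           # assumed scale: 1 (low) to 5 (high)
    effort: int           # assumed scale: 1 (low) to 5 (high)

    def statement(self) -> str:
        return (f"If we {self.change} on the {self.page} for {self.audience}, "
                f"then {self.metric} will move by {self.expected_lift:+.0%}, "
                f"because {self.friction}.")

    @property
    def score(self) -> float:
        # Assumed scoring rule: high impact and low effort rank first.
        return self.impact / self.effort


def split_backlog(hypotheses, sprint_size=5):
    """Return (sprint, parking_lot) after ranking by score."""
    ranked = sorted(hypotheses, key=lambda h: h.score, reverse=True)
    return ranked[:sprint_size], ranked[sprint_size:]


# Invented examples.
backlog = [
    Hypothesis("show the shipping cost", "product page", "mobile visitors",
               "add-to-cart rate", 0.05,
               "shoppers left when shipping first appeared at checkout", 4, 1),
    Hypothesis("rebuild the checkout as one page", "checkout", "all visitors",
               "checkout completion", 0.08,
               "sessions stalled between checkout steps", 5, 4),
    Hypothesis("move reviews above the fold", "product page", "new visitors",
               "add-to-cart rate", 0.02,
               "visitors scrolled past the buy button looking for reviews", 2, 1),
]

sprint, parking_lot = split_backlog(backlog, sprint_size=2)
for h in sprint:
    print(h.statement())
```

Nothing is discarded: whatever falls below the cut comes back as the parking lot.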
Week three: ship to traffic
Every test runs on at least 50 percent of relevant traffic, never on a vanity audience. We use the customer's existing A/B platform if they have one, or we wire up a lightweight server-side split if they do not.
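For teams without an A/B platform, a lightweight server-side split can be as small as a deterministic hash. The sketch below is one common way to do it, not a description of any particular platform. The function name `assign_variant`, the `traffic_share` parameter, and the even split between arms are assumptions. It reads the 50 percent floor as the share of relevant traffic enrolled in the test.

```python
import hashlib


def assign_variant(user_id: str, test_name: str, traffic_share: float = 0.5) -> str:
    """Return 'variant', 'control', or 'excluded' for a user.

    traffic_share is the fraction of relevant traffic enrolled in the test.
    Enrolled users are split evenly between the two arms.
    """
    if not 0 < traffic_share <= 1:
        raise ValueError("traffic_share must be in (0, 1]")
    # Hash the test name with the user id so each test gets its own split.
    digest = hashlib.sha256(f"{test_name}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform in [0, 1)
    if bucket >= traffic_share:
        return "excluded"
    return "variant" if bucket < traffic_share / 2 else "control"


print(assign_variant("user-123", "free-shipping-banner"))
```

Because the assignment depends only on the user id and the test name, a returning visitor lands in the same arm on every request without any stored state.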
The test spec is short. One hypothesis, one variant, one primary metric, one secondary metric, one guardrail. Anything more than that and the readout becomes a debate instead of a decision.
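One way to hold a spec to that shape is to give it exactly five single-valued fields. This is a hypothetical sketch: `ExperimentSpec` and the example values are illustrative, not a format the sprint requires.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ExperimentSpec:
    hypothesis: str
    variant: str
    primary_metric: str
    secondary_metric: str
    guardrail: str

    def __post_init__(self):
        # Reject lists and empty values so each slot holds exactly one item.
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"{f.name} must be one non-empty string")


# Invented example values.
spec = ExperimentSpec(
    hypothesis="Showing shipping cost on the product page lifts add-to-cart rate",
    variant="shipping cost shown under the price",
    primary_metric="add-to-cart rate",
    secondary_metric="checkout completion rate",
    guardrail="average order value",
)
print(spec.primary_metric)
```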
Week four: the readout
The readout is a single page per test. Hypothesis at the top. Result in the middle, expressed as a percentage lift with a 95 percent confidence interval. Decision at the bottom: ship, kill, or iterate.
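For a conversion-rate metric, the lift and its interval can be computed from four counts. The sketch below uses the standard log-ratio normal approximation for a ratio of two proportions, which is one reasonable choice among several and not necessarily what a given A/B platform reports. The function name and the example counts are illustrative.

```python
import math


def relative_lift_ci(control_conv, control_n, variant_conv, variant_n, z=1.96):
    """Relative lift of variant over control with an approximate 95% CI.

    Uses the log-ratio normal approximation. z=1.96 gives a 95% interval.
    Returns (lift, low, high) as fractions, e.g. 0.15 means +15 percent.
    """
    if min(control_conv, variant_conv) <= 0:
        raise ValueError("both arms need at least one conversion")
    ratio = (variant_conv / variant_n) / (control_conv / control_n)
    # Standard error of log(ratio) for two independent proportions.
    se_log = math.sqrt(1 / variant_conv - 1 / variant_n
                       + 1 / control_conv - 1 / control_n)
    low = math.exp(math.log(ratio) - z * se_log) - 1
    high = math.exp(math.log(ratio) + z * se_log) - 1
    return ratio - 1, low, high


# Illustrative counts: 400 of 10,000 in control, 460 of 10,000 in variant.
lift, low, high = relative_lift_ci(400, 10_000, 460, 10_000)
print(f"Lift: {lift:+.1%} (95% CI {low:+.1%} to {high:+.1%})")
```

An interval that includes zero means the data are consistent with no effect at all, which is worth stating plainly in the result line before the decision.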
Most readouts are too long. The team does not need the methodology spelled out for every test. They need to know what shipped, what did not, and what is going into the next round. Keep it tight, keep it honest, and keep the parking lot updated.
After the sprint
The sprint ends with a closed test backlog, a parking lot, a list of wins to roll forward, and a list of patterns to monitor. The customer owns all of it. If they want a second sprint, the parking lot is the starting point and the audit is already done.
We have seen customers run two or three sprints back to back and others go quiet for six months. Both are fine. The sprint is a tool, not a subscription.