
HELP MY NEWSLETTER

Triage for sick newsletters that should be printing money

Causal Truth · Testing · Beehiiv Ops


Your growth metrics are lying to you. Here is how to prove what actually works.

Most newsletter operators scale tactics based on vibes. They change something, a number moves, and they call it a win. Then they build an entire 2026 plan on a false signal.

This edition is your upgrade from “I think this works” to “I can defend this with evidence.”

If you cannot answer “what would have happened if I did nothing,” you do not know the impact. You have a story.

This edition covers: the most common measurement lies in newsletter land, the one question that fixes it, and a simple Beehiiv-ready testing setup that gives you real answers in 30 days.

Part 1

The most dangerous sentence in newsletter land

“Ever since we did X, our clicks went up.”

That sentence sounds smart. It feels like momentum. It also destroys decision-making. Because when you “did the thing,” other things changed too.

The 4 usual suspects

Seasonality
People behave differently in January than they do in August.

Platform noise
A social algorithm bump looks like a “tactic win.”

Audience quality
High-intent subscribers click more no matter what you send.

You changed multiple things
Subject line, onboarding, send time, format. Which one worked?

If you are honest, most “wins” are a cocktail. You cannot scale a cocktail with confidence.

Part 2

The one question that fixes almost everything

Stop asking: “What happened after I did this?”

Start asking:

“What would have happened if I had not done it?”

That is the shadow baseline (what would have happened anyway). It is the missing twin world you never see. And it is the only way to know if your tactic caused the result.

Translate this into newsletter language

Bad: “We rewrote onboarding and clicks went up.”

Good: “Among new subscribers from the same source, what is the difference in 30-day click rate between those who get Welcome A and those who get Welcome B?”

If you cannot name the comparison group, you are not measuring a causal effect. You are watching a chart.
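The good question above reduces to simple arithmetic once the comparison group exists. A minimal sketch, using made-up subscriber counts (not real benchmarks):

```python
# Hypothetical numbers: among new subscribers from the same source,
# compare 30-day click rates between Welcome A and Welcome B.
welcome_a = {"subscribers": 400, "clicked_in_30d": 92}
welcome_b = {"subscribers": 410, "clicked_in_30d": 131}

rate_a = welcome_a["clicked_in_30d"] / welcome_a["subscribers"]
rate_b = welcome_b["clicked_in_30d"] / welcome_b["subscribers"]

# The causal claim lives in this difference, not in either number alone.
lift = rate_b - rate_a
print(f"A: {rate_a:.1%}  B: {rate_b:.1%}  lift: {lift:+.1%}")
```

Note what is missing without the comparison: rate_b alone going up tells you nothing, because seasonality, platform noise, and audience quality move it too.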

Free, private email that puts your privacy first

A private inbox doesn’t have to come with a price tag—or a catch. Proton Mail’s free plan gives you the privacy and security you expect, without selling your data or showing you ads.

Built by scientists and privacy advocates, Proton Mail uses end-to-end encryption to keep your conversations secure. No scanning. No targeting. No creepy promotions.

With Proton, you’re not the product — you’re in control.

Start for free. Upgrade anytime. Stay private always.

Part 3

The Beehiiv-ready testing setup (no PhD required)

You can run clean causal tests in Beehiiv. The trick is to stop treating your newsletter like a diary and start treating it like an experiment.

Step 1: Create a holdout
Some new subscribers must not receive the new sequence. No holdout means no shadow baseline.

Step 2: Change one thing
If you rewrite onboarding, do not also change send time, topic mix, and CTAs.

Step 3: Lock a 30-day window
Decide upfront what “success” means and measure it the same way for everyone.
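The three steps above can be sketched as one assignment rule plus two locked constants. This is a hedged illustration, not a Beehiiv feature: the split percentages and tag names are placeholders, and the stable-hash trick just guarantees the same subscriber always lands in the same bucket.

```python
import hashlib

# Step 1: assign every new subscriber at signup, including a small holdout
# that receives nothing new. Split percentages here are illustrative.
def assign_variant(email: str) -> str:
    """Stable assignment: the same email always lands in the same bucket."""
    bucket = int(hashlib.sha256(email.lower().encode()).hexdigest(), 16) % 100
    if bucket < 45:
        return "welcome_a"
    if bucket < 90:
        return "welcome_b"
    return "welcome_holdout"  # the shadow baseline group (roughly 10%)

# Steps 2 and 3: change one thing, and lock the window and metric up front.
TEST_WINDOW_DAYS = 30
PRIMARY_METRIC = "any_click_in_first_30_days"

for email in ["a@example.com", "b@example.com", "c@example.com"]:
    print(email, "->", assign_variant(email))
```

Hash-based assignment beats "every other signup" because it survives retries, re-imports, and duplicate form submissions without flipping anyone between groups.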

 

Industry Reality Check

Why most teams avoid experiments (and why that hurts growth)

Harvard Business Review documented something most operators feel but rarely say out loud: teams love experimenting on products, but they avoid experimenting on acquisition.

  • 90%+ of growth teams test websites and apps.
  • Only about 1 in 8 run randomized ad experiments.
  • Teams that do experiment outperform peers year over year.

The reason is not ignorance.
It is fear. Holdouts feel like wasted reach. Evidence threatens opinions. Legacy dashboards feel safer than admitting uncertainty.

What high-performing teams do differently

  • They normalize holdouts as the cost of truth.
  • They give experimentation executive cover so data can override opinions.
  • They treat testing as a system, not a one-off tactic.

Newsletter translation:
If you are not testing your acquisition engine, you are guessing at the most expensive part of your business.

Start small. Allocate 10% of your growth effort to experiments you trust. Scale only what survives contact with evidence.

 

A simple wiring plan

[ ] Create two signup pages or forms: A and B

[ ] Route traffic 50/50 to A or B (landing page tool or manual rotation)

[ ] Tag subscribers automatically: welcome_a or welcome_b

[ ] Create a tiny holdout: welcome_holdout (5% to 10%)

[ ] Compare outcomes after 30 days: clicks, replies, paid conversions, churn
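The last checklist item, the 30-day comparison, is a per-tag rollup. A sketch with invented field names (map them to whatever your Beehiiv export actually calls them):

```python
from collections import defaultdict

# Illustrative 30-day scorecard: one row per subscriber, tagged at signup.
# Field names are made up for this sketch, not a Beehiiv export schema.
subscribers = [
    {"tag": "welcome_a",       "clicked": True,  "converted": False},
    {"tag": "welcome_a",       "clicked": False, "converted": False},
    {"tag": "welcome_b",       "clicked": True,  "converted": True},
    {"tag": "welcome_b",       "clicked": True,  "converted": False},
    {"tag": "welcome_holdout", "clicked": False, "converted": False},
]

totals = defaultdict(lambda: {"n": 0, "clicked": 0, "converted": 0})
for s in subscribers:
    t = totals[s["tag"]]
    t["n"] += 1
    t["clicked"] += s["clicked"]      # True counts as 1
    t["converted"] += s["converted"]

for tag, t in totals.items():
    print(f'{tag}: click rate {t["clicked"] / t["n"]:.0%}, '
          f'paid conversion {t["converted"] / t["n"]:.0%}')
```

The holdout row is the whole point: its rates are the shadow baseline you compare A and B against.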

Do not do this

Do not “adjust” your results by filtering for people who opened Email 1 or clicked Email 2. Those actions happen after treatment and will trick you into deleting the very effect you are trying to measure.

Part 4

When you cannot run a clean holdout, use a shadow baseline forecast

Sometimes you cannot split people perfectly in Beehiiv. That does not mean you are stuck guessing. You can build a shadow baseline for your metric using other signals that usually move with it, then compare that baseline to what actually happened.

The idea in one line:
Build a shadow timeline for what your metric would have been if you did nothing. The gap between that prediction and reality is your best estimate of impact.

The operator version of the method

You pick one metric, then you pick a few control signals that usually move with it but should not be directly changed by your intervention. The model uses those signals to predict the no-change world.

  • Pick a flow metric: daily clicks, daily signups, or daily revenue. Do not start with total subscribers.
  • Choose 3 to 10 control signals: site sessions, organic search impressions, baseline clicks on evergreen links, YouTube views, referral traffic, or platform-wide trend proxies.
  • Compare observed vs predicted: if the observed line sits inside the wiggle room range, you do not have a result yet. You have a hypothesis.
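The operator method above can be sketched in a few lines. This is a deliberately simplified linear stand-in for the real thing (CausalImpact fits a Bayesian structural time-series model; here it is plain least squares on synthetic data with a known planted lift):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic flow metric: daily clicks driven by two control signals,
# plus a true lift of +30 clicks/day after a launch on day 60.
days = 90
sessions = 1000 + rng.normal(0, 50, days)   # control signal 1: site sessions
search = 400 + rng.normal(0, 20, days)      # control signal 2: search impressions
clicks = 0.1 * sessions + 0.2 * search + rng.normal(0, 5, days)
clicks[60:] += 30                           # the intervention effect

pre, post = slice(0, 60), slice(60, days)
X = np.column_stack([np.ones(days), sessions, search])

# Fit on the pre-period only, then predict the "do nothing" world after launch.
coef, *_ = np.linalg.lstsq(X[pre], clicks[pre], rcond=None)
shadow = X[post] @ coef

# The gap between observed and predicted is the estimated impact.
lift_per_day = (clicks[post] - shadow).mean()
print(f"estimated lift: {lift_per_day:.1f} clicks/day (true: 30)")
```

The control signals only work as a shadow baseline because the intervention does not touch them; if your launch also drove site sessions, this estimate would be biased low.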

A warning most operators miss

Cumulative impact is great for flows, but it gets weird for stock metrics like total subscribers. If you want to talk about subscriber count, use a running average lift or a per-day lift instead.

The tool most people use for this is called CausalImpact. You do not need to code to understand the logic. You need to stop pretending a spike proves a cause.

A simple power reality check

Pick a fake launch date in the past and run the same analysis. If the placebo analysis shows big swings, you need a bigger real effect before you can confidently declare a win. This saves you from scaling a mirage.
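The placebo check is mechanical: run your lift calculation at several fake launch dates where nothing changed, and treat the biggest phantom "lift" as your noise floor. A sketch on synthetic data with no intervention at all:

```python
import random
import statistics

random.seed(1)

# Synthetic daily clicks with NO intervention anywhere in the series.
clicks = [200 + random.gauss(0, 15) for _ in range(120)]

def naive_lift(series, launch_day, window=30):
    """Mean of the 30 days after a (fake) launch minus the 30 days before."""
    pre = series[launch_day - window:launch_day]
    post = series[launch_day:launch_day + window]
    return statistics.mean(post) - statistics.mean(pre)

# Run the same analysis at five fake launch dates.
placebo_lifts = [abs(naive_lift(clicks, d)) for d in range(40, 90, 10)]
noise_floor = max(placebo_lifts)
print(f"placebo lifts ranged up to {noise_floor:.1f} clicks/day")
# A real "win" smaller than this number is indistinguishable from noise.
```

Any lift you later celebrate should clear this floor comfortably, not just peek over it.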

Part 5

Your 7-day sprint: turn guessing into knowing

This takes less time than most operators spend arguing about open rates.

Day 1: Write the “what if”

[ ] “What is the 30-day click rate difference between Welcome A and Welcome B?”

Day 2: Define time zero

[ ] Time zero is the moment a subscriber joins. Not the moment you notice a dip.

Day 3: Lock the metric

[ ] Pick one primary outcome. Example: “Any click in the first 30 days.”

Days 4 to 30: Do nothing

[ ] No mid-test edits. No “quick improvements.” Let the test speak.

Day 31: Make the call

[ ] Scale the winner. Kill the loser. Write down what you learned.

The operators who win in 2026 are not the loudest. They are the ones who can explain why something worked.

Need me to set this up with you?

I will design your holdout test and measurement plan

One 60-minute session. We pick the one lever that matters, define the shadow baseline, set up tags and cohorts, and lock a clean 30-day scorecard.

Book Causal Audit

Bring your last 90 days of metrics and one hypothesis you want to prove

Last thing

Forward this to the operator who keeps “winning” without proof

Know someone who changes three things, sees a spike, and declares a new strategy? Send them this. Know someone who says “testing does not work for me”? Send them this.

Truth compounds faster than tactics. The goal is not a “win.” The goal is a believable shadow baseline.

Hit reply and tell me: what is one thing you are currently assuming works?

Help My Newsletter · Inbox triage for people who would rather know than guess
