How to Read a Backtest

Every Soomario product page publishes backtest figures. This page exists to make sure those figures inform your decision instead of misleading it.

What a Backtest Actually Is

A backtest simulates how a strategy would have performed if it had been running on historical market data. The simulator steps through historical prices one bar at a time, applies the strategy's rules, opens and closes positions, and records the results. The output is a track record: trade-by-trade P&L, equity curve, summary statistics.

A backtest is not a guarantee of future results; it is a description of how the rules would have behaved in the past. The relationship between past simulation and future live performance is the core question every reader of a backtest must wrestle with, and the answer is rarely "they'll be the same."

The Core Metrics, Honestly

Win Rate

The fraction of trades that close profitably. A 70% win rate sounds great until you realise a strategy can have a 90% win rate and still lose money — if the 10% of losing trades are very large and the 90% of winners are tiny. Win rate alone is meaningless. Always read it alongside profit factor and average win/loss size.
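The point is easy to verify with made-up numbers. A minimal sketch (the trade P&Ls below are hypothetical, chosen only to show the shape):

```python
# Hypothetical trade P&Ls: nine small wins, one large loss.
trades = [10.0] * 9 + [-150.0]

win_rate = sum(1 for t in trades if t > 0) / len(trades)
net_pnl = sum(trades)

print(f"win rate: {win_rate:.0%}")  # 90%
print(f"net P&L:  {net_pnl:+.2f}")  # -60.00
```

A 90% win rate, and the account still loses money, because the single loss is larger than all nine wins combined.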

Mean-reversion strategies (Max Pain, Zones) tend to have high win rates because they enter at exhaustion points where small bounces are common; their profitability depends on the rare large losses being smaller than the cumulative small wins. Momentum strategies often have lower win rates with larger average wins. Neither shape is "better" — they are different shapes.

Profit Factor

Total gross profit divided by total gross loss across all trades.

Profit Factor = Gross profit (sum of all winning trades) / Gross loss (sum of all losing trades, taken as a positive number)

A profit factor of 1.0 is breakeven before fees. 1.5 is acceptable. 2.0 is strong. 3.0+ is rare and worth a sceptical look: either the sample is small, the strategy is genuinely excellent, or the backtest is overfit. Profit factor is the most useful single number on most backtests because it captures the relationship between average win size and average loss size, weighted by how often each occurs, in one figure.
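The definition translates directly into code. A minimal sketch (the trade list is hypothetical):

```python
def profit_factor(trades):
    """Gross profit divided by gross loss (loss taken as a positive number)."""
    gross_profit = sum(t for t in trades if t > 0)
    gross_loss = abs(sum(t for t in trades if t < 0))
    return gross_profit / gross_loss if gross_loss else float("inf")

# Hypothetical closed-trade P&Ls.
trades = [120, -80, 45, -30, 200, -95]
print(round(profit_factor(trades), 2))  # 365 / 205 = 1.78
```

Note the guard for a sample with no losing trades: a finite profit factor is undefined there, which is itself a warning sign that the sample is too small.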

Sharpe Ratio

Annualised return in excess of the risk-free rate, divided by the annualised standard deviation of returns. Roughly: how much excess return are you earning per unit of volatility?

Above 1.0 is a real strategy. Above 2.0 is excellent. Above 3.0 is exceptional. Above 5.0 deserves scepticism — either the strategy is genuinely great, the sample is too short, or the backtest excludes important risks. The market regime matters: a strategy with Sharpe 4 over a one-year sample that covers only a bull market may have Sharpe 0.5 over a five-year sample that covers a crash.
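A minimal sketch of the calculation, assuming daily returns and a 365-day year (crypto markets trade every day; both the return series and the annualisation factor below are assumptions, not figures from this page):

```python
import math
import statistics

def annualised_sharpe(daily_returns, risk_free_daily=0.0, periods_per_year=365):
    """Annualised Sharpe from daily returns; 365 periods assumes a 24/7 crypto market."""
    excess = [r - risk_free_daily for r in daily_returns]
    return (statistics.mean(excess) / statistics.stdev(excess)) * math.sqrt(periods_per_year)

# Hypothetical daily returns. Note how a five-day sample produces an absurd figure:
# short samples inflate Sharpe, which is exactly the "sample too short" warning above.
print(round(annualised_sharpe([0.01, -0.005, 0.012, 0.003, -0.002]), 1))  # 9.3
```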

Sortino Ratio

Like Sharpe, but it only penalises downside volatility. Upside volatility (winning a lot) doesn't hurt your Sortino the way it hurts your Sharpe. Sortino is usually higher than Sharpe for the same strategy. Read both: a wide gap between Sharpe and Sortino tells you the strategy's volatility is mostly upside, which is generally a good sign.
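The "only penalises downside" part can be made concrete. One common convention is sketched below (conventions vary; some definitions divide by the count of losing periods only, so treat this as illustrative, not canonical):

```python
import math
import statistics

def annualised_sortino(daily_returns, target=0.0, periods_per_year=365):
    """Sortino under one common convention: downside deviation over all periods."""
    # Only returns below the target contribute to the deviation; wins count as zero.
    downside = [min(r - target, 0.0) for r in daily_returns]
    downside_dev = math.sqrt(sum(d * d for d in downside) / len(daily_returns))
    mean_excess = statistics.mean(daily_returns) - target
    return (mean_excess / downside_dev) * math.sqrt(periods_per_year)

# Hypothetical returns: the two small losses contribute far less to downside
# deviation than they do to total volatility, so Sortino exceeds Sharpe here.
print(round(annualised_sortino([0.01, -0.005, 0.012, 0.003, -0.002]), 1))  # 28.6
```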

Maximum Drawdown

The largest peak-to-trough loss in the equity curve. If your account hits $10,000 and then declines to $7,800 before recovering, your max drawdown is 22%. Max drawdown is the most psychologically important metric: it tells you how bad it can get in real time, and whether you'll panic-withdraw before recovery.

A strategy with 50% annualised return and 30% max drawdown is not the same as a strategy with 50% return and 8% drawdown. The first one will test your conviction. The second one won't. Always assume live drawdowns can be 1.5×–2× backtested drawdowns — slippage, regime changes, and out-of-sample volatility usually make live worse than backtested.

Max drawdown is the metric that matters most when sizing your deposit. If a 30% drawdown would force you to withdraw, don't allocate capital that you would withdraw at 30% down. Sizing your deposit such that the worst-case drawdown is tolerable is the single highest-leverage risk control available to a depositor.
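Max drawdown is also the simplest metric to compute yourself: track the running peak and the worst decline from it. A minimal sketch using the $10,000 → $7,800 example from the text (the rest of the equity curve is made up):

```python
def max_drawdown(equity_curve):
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        peak = max(peak, value)              # running high-water mark
        worst = max(worst, (peak - value) / peak)
    return worst

# Peaks at 10,000, troughs at 7,800: a 22% drawdown, as in the text.
print(max_drawdown([9000, 10000, 9200, 7800, 8500, 11000]))  # 0.22
```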

The Biases That Make Backtests Lie

Survivorship Bias

If a backtest tests a strategy on the current top-50 cryptocurrencies, it has secretly told the strategy which coins survived. The coins that flamed out and got delisted aren't in the test. A long-only strategy on "current top-50" looks fantastic; the same strategy on "top-50 as of three years ago, including the ones that died" looks much worse. Honest backtests use the asset universe as it existed at the time, including the failures.

Look-Ahead Bias

The strategy uses information that wouldn't have been available at the time of the trade. Common forms: using a daily close to make an intraday decision, using restated financial data instead of the figures as originally reported, using indicators calculated on the full dataset rather than walked forward. Look-ahead bias makes backtests look much better than they are. The fix is to ensure every signal at time T is computed only from data available at time T.
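The "only data available at time T" rule can be made concrete. A minimal sketch with made-up prices, assuming the trade executes on bar T so the signal may only see bars before T (the strictly-before slice is the whole point):

```python
# Hypothetical closes and a trailing moving-average signal.
closes = [100, 102, 101, 105, 107, 106]
window = 3

signals = []
for t in range(len(closes)):
    history = closes[:t]  # strictly before bar t: nothing the trader couldn't see
    if len(history) >= window:
        ma = sum(history[-window:]) / window
        signals.append(closes[t] > ma)  # long signal if price is above trailing MA
    else:
        signals.append(None)            # not enough history yet: no signal
```

The biased version would compute the average over `closes[:t + 1]` (or the whole series), quietly letting each bar's own close influence the decision made on that bar.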

Overfitting

Tuning the strategy's parameters until they fit the historical data perfectly. A backtest with 200 parameters tuned to one specific historical period will look amazing on that period and fail completely out-of-sample. The defence is parameter parsimony (fewer knobs), out-of-sample testing (hold back data, tune on the rest, then test on the held-back portion), and walk-forward validation (re-tune as you walk forward through time, simulating the ongoing recalibration a real strategy would do).
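Walk-forward validation can be sketched in a few lines. Everything here is illustrative: `fit` and `score` are stand-ins for whatever tuning and evaluation a real strategy uses, and the window lengths are arbitrary:

```python
def walk_forward(data, train_len, test_len, fit, score):
    """Tune on a trailing window, then evaluate on the next unseen slice."""
    results = []
    start = 0
    while start + train_len + test_len <= len(data):
        train = data[start : start + train_len]
        test = data[start + train_len : start + train_len + test_len]
        params = fit(train)                   # tune only on past data
        results.append(score(test, params))   # score strictly out-of-sample
        start += test_len                     # roll the window forward
    return results

# Toy usage: "fit" returns the training mean, "score" compares it to the test max.
oos = walk_forward(list(range(10)), train_len=4, test_len=2,
                   fit=lambda tr: sum(tr) / len(tr),
                   score=lambda te, p: max(te) - p)
print(len(oos))  # 3 out-of-sample windows
```

Every number in `results` comes from data the tuning step never saw, which is what makes the aggregate an honest estimate rather than a curve-fit.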

Regime Bias

Even an honest, well-constructed backtest is only as informative as the range of market regimes it covers. A backtest of a long-bias crypto strategy that covers 2020–2024 (mostly bullish) will overstate the strategy's robustness; a backtest of the same strategy that includes 2018, 2022, and the 2025 corrections will give a more sober picture. Look for multi-cycle coverage: at least one bull market, one bear market, and one extended chop period.

Slippage and Fee Realism

Backtests that don't account for taker fees, funding payments, slippage, and partial fills will overstate live results. Soomario's published backtests include taker fees (commission-inclusive), funding rates where applicable, and conservative slippage assumptions — but verify this on any backtest you read elsewhere. A backtest that doesn't disclose its fee assumptions probably doesn't have honest ones.

Sample Size Matters

A backtest with 50 trades is anecdotal. With 200 trades, it is suggestive. With 500+ trades, the summary statistics start to be reasonably stable. With 2,000+ trades, you can begin to trust the win rate and profit factor as estimates of the underlying distribution.

Soomario's Max Pain backtest, for example, uses 693 trades over 1,184 days across 18 coins. Statistics drawn from a sample that size are meaningfully more reliable than statistics drawn from a 90-day paper-trading sample of 30 trades — even if the 30-trade sample looks better. Always read sample size before reading the headline numbers.
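The intuition can be quantified with the binomial standard error of an observed win rate. A minimal sketch (the 60% win rate is a made-up figure; 693 is the Max Pain sample size quoted above):

```python
import math

def win_rate_std_error(win_rate, n_trades):
    """Approximate standard error of an observed win rate (binomial approximation)."""
    return math.sqrt(win_rate * (1 - win_rate) / n_trades)

# A 60% win rate measured over 30 trades vs over 693 trades.
print(round(win_rate_std_error(0.6, 30), 3))   # 0.089 -> roughly +/-9 points per SE
print(round(win_rate_std_error(0.6, 693), 3))  # 0.019 -> a far tighter estimate
```

At 30 trades, the true win rate could plausibly sit anywhere in a ±18-point band around the measurement; at 693 trades the band shrinks to about ±4 points.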

Live Performance vs Backtest

Live performance tends to underperform backtests. The honest expectation is that headline metrics will degrade in roughly this pattern: returns 60–80% of backtest, drawdowns 1.5×–2× backtest, Sharpe 60–75% of backtest. Strategies that hold up close to their backtest in live trading are the rare ones; assume your strategy will be in the typical group, not the rare one.

This is why Soomario publishes both backtested results and live vault metrics on every product page. The gap between them tells you something the backtest alone can't: whether the strategy survives the messy reality of live execution.

A Practical Framework

When you read a backtest on any strategy — Soomario's or anyone else's — answer these in order before you trust the headline number:

1. How many trades? Below 200, treat as anecdotal. Below 500, treat as preliminary.

2. What time period? Does it cover at least one bull, one bear, and one chop regime? If not, the strategy is untested in the regime it hasn't seen.

3. What's the max drawdown? Could you tolerate 1.5× that drawdown live without withdrawing? If not, size down or pick a different strategy.

4. What's the profit factor? If win rate is high but profit factor is below 1.3, the strategy is fragile. If profit factor is above 3 with a small sample, be sceptical.

5. Are fees included? If the disclosure doesn't say "commission-inclusive" or "after fees," assume the headline number is overstated.

6. Is there live data? A 6-month live track record from a 5-year backtest is more informative than the backtest alone. Ratio them: if live is 70% of backtest, that's normal; if live is 30% of backtest, the strategy is degrading.

The bottom line: backtests are useful as an evidence base, not a promise. Read them with the same scepticism you'd bring to any historical claim — and weight live data more heavily once it exists.
