All results are out-of-sample (OOS). The OOS window is never used for model training or Optuna hyperparameter tuning; it is an evaluation on data the models have not seen.

Current results

Phase 16 — April 2026
| Metric | Value |
| --- | --- |
| OOS period | 38 days (2026-03-17 to 2026-04-24) |
| Win rate | 62.8% |
| Profit factor | 2.26 |
| Max drawdown | 14.5% |
| Sharpe ratio | 37.88 |
| Return | +1154.6% |
| Trades | 156 |
| Score | 46.80 |
| Config | conf=0.633, risk=3.0%, TP=1.2×ATR, SL=1.0×ATR |

Performance history

| Date | Score | WR | PF | MaxDD | Return | Trades | Config summary |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2026-04-24 | 46.80 | 62.8% | 2.26 | 14.5% | +1154.6% | 156 | conf=0.633, risk=3.0%, TP=1.2×ATR |
| 2026-04-24 | 21.34 | 65.8% | 3.03 | 7.1% | +390.9% | 114 | Phase 16 fixes (RF weight, TP<SL) |
| 2026-04-24 | 7.77 | 50.3% | 1.71 | 3.6% | +71.5% | 344 | conf=0.633, risk=2.0%, TP=0.8×ATR |
| 2026-04-24 | 6.70 | 50.7% | 1.61 | 4.1% | +66.5% | 343 | conf=0.633, risk=2.5%, TP=0.8×ATR |
| 2026-04-20 | 21.34 | 78.5% | 2.43 | 1.8% | +50.7% | 274 | conf=0.6, risk=2.5%, TP=0.8×ATR |
| 2026-04-19 | 4.84 | 67.4% | 1.38 | 3.7% | +51.6% | 717 | conf=0.6, risk=2.5%, TP=0.8×ATR |
| 2026-04-18 | 4.82 | 71.2% | 1.87 | 7.6% | +323.0% | 483 | conf=0.6, risk=1.0%, TP=0.8×ATR |

Score formula

The optimizer maximizes a single composite score that balances return against risk:
score = profit_factor × sharpe × (1 - max_drawdown) × log(1 + trades)
A score improvement of ≥ 2% over baseline is required to commit new models. This prevents the pipeline from committing marginal or noise-driven improvements. A new week’s baseline is the score recorded at the end of the previous week’s run.
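The formula and the 2% commit gate can be sketched in a few lines of Python. This is an illustration only, not the pipeline's code: the function names `composite_score` and `should_commit` are hypothetical, and the sketch assumes a natural logarithm, `max_drawdown` expressed as a fraction (0.145 rather than 14.5), and a 2% threshold measured relative to a positive baseline. The source states none of these conventions, so values from this sketch need not match the Score column above.

```python
import math


def composite_score(profit_factor, sharpe, max_drawdown, trades):
    """score = profit_factor * sharpe * (1 - max_drawdown) * log(1 + trades)."""
    # Assumption: max_drawdown is a fraction in [0, 1] and log is the natural log.
    return profit_factor * sharpe * (1.0 - max_drawdown) * math.log(1 + trades)


def should_commit(new_score, baseline_score, min_improvement=0.02):
    """Return True when new_score beats the baseline by at least min_improvement."""
    # Assumption: the threshold is relative, and the baseline score is positive.
    return new_score >= baseline_score * (1.0 + min_improvement)
```

The `log(1 + trades)` factor rewards larger samples with diminishing returns, and `(1 - max_drawdown)` scales the score down as drawdown grows.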

OOS validation methodology

1. Data split

The dataset is split into in-sample (IS) and out-of-sample (OOS) segments. The IS window is used for training and Optuna tuning. The OOS window is never touched during training.

2. Sweep on first 70%

The config sweep (scripts/sweep.py) runs only on the first 70% of the OOS window. This is the “search” OOS.

3. Final eval on full OOS

After the best config is identified, the full OOS window is backtested. The last 30% of OOS data (the true holdout) is never seen during the sweep.

4. Rollback check

If the full-OOS score does not beat the stored baseline by ≥ 2%, all models and configs roll back to the previous version.
Always use python backtest.py --oos-only for realistic validation. Avoid generic lookback backtests, which are subject to in-sample contamination.
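The 70/30 division of the OOS window can be illustrated with a small date-based sketch. It is not taken from scripts/sweep.py: the helper name `split_oos_window` is hypothetical, and the sketch assumes the cut is made by calendar days with the fractional day rounded down, whereas the real sweep may split by bars or trades instead.

```python
from datetime import date, timedelta


def split_oos_window(oos_start, oos_end, search_fraction=0.70):
    """Split an OOS window into a search segment and a holdout segment.

    Returns ((search_start, cutoff), (cutoff, oos_end)). The search segment
    is treated as half-open, so the cutoff day belongs to the holdout.
    """
    total_days = (oos_end - oos_start).days
    # Assumption: round the search length down to whole calendar days.
    search_days = int(total_days * search_fraction)
    cutoff = oos_start + timedelta(days=search_days)
    return (oos_start, cutoff), (cutoff, oos_end)


# Example with the OOS period reported on this page.
search, holdout = split_oos_window(date(2026, 3, 17), date(2026, 4, 24))
print("search :", search[0], "to", search[1])
print("holdout:", holdout[0], "to", holdout[1])
```

Only the search segment would be passed to the config sweep. The final evaluation then runs over the full window, so the holdout segment is the part of the result the sweep could not have influenced.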

Reading the backtest output

python backtest.py --balance 500 --no-swap --leverage 500 --spread 16.95 --oos-only --no-chart
Key output fields:
| Field | What it means |
| --- | --- |
| win_rate | % of trades closed profitably |
| profit_factor | Gross profit ÷ gross loss; above 1.5 is good |
| max_drawdown | Largest peak-to-trough equity drop during the OOS period |
| sharpe | Risk-adjusted return (annualized) |
| return | Total % return over the OOS period |
| score | Composite score used by the optimizer |
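To make the field definitions concrete, the following sketch derives win_rate, profit_factor, max_drawdown, and return from a list of per-trade profit and loss values. It is a simplified illustration, not the logic in backtest.py: the function name `summarize_trades` is hypothetical, and it assumes drawdown is measured on closed-trade equity, which may differ from how the real backtest tracks equity.

```python
def summarize_trades(pnls, starting_balance):
    """Compute basic backtest metrics from per-trade PnL values."""
    wins = [p for p in pnls if p > 0]
    losses = [p for p in pnls if p < 0]

    # win_rate: percentage of trades closed profitably.
    win_rate = 100.0 * len(wins) / len(pnls) if pnls else 0.0

    # profit_factor: gross profit divided by gross loss.
    gross_profit = sum(wins)
    gross_loss = -sum(losses)
    profit_factor = gross_profit / gross_loss if gross_loss > 0 else float("inf")

    # max_drawdown: largest peak-to-trough drop of the equity curve, in percent.
    # Assumption: equity is updated only when a trade closes.
    equity = peak = starting_balance
    max_dd = 0.0
    for p in pnls:
        equity += p
        peak = max(peak, equity)
        max_dd = max(max_dd, (peak - equity) / peak)

    # return: total percentage change in equity over the period.
    total_return = 100.0 * (equity - starting_balance) / starting_balance

    return {
        "win_rate": win_rate,
        "profit_factor": profit_factor,
        "max_drawdown": 100.0 * max_dd,
        "return": total_return,
    }


# Example with made-up trades on a 500-unit starting balance.
print(summarize_trades([50, -25, 100, -50], starting_balance=500))
```

The sharpe field is left out because its annualization convention is not specified in this page.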

Backtest settings note

Backtest results shown here used:
  • Spread: 16.95 pts (VT Markets BTCUSD typical)
  • Leverage: 500
  • Starting balance: 500 USC
  • Swap: disabled (VT Markets cent account)
Results on other brokers or with swap enabled will differ.