Performance

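All results are out-of-sample (OOS). The OOS window is never used to train models or tune their hyperparameters, so the models are evaluated on data they have not seen. The trading config is selected on the first 70% of the OOS window (see OOS validation methodology), so only the final 30% is untouched by any selection step.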

Current results

Phase 16 — April 2026

Metric	Value
OOS period	38 days (2026-03-17 to 2026-04-24)
Win rate	62.8%
Profit factor	2.26
Max drawdown	14.5%
Sharpe ratio	37.88
Return	+1154.6%
Trades	156
Score	46.80
Config	conf=0.633, risk=3.0%, TP=1.2×ATR, SL=1.0×ATR

Performance history

Date	Score	Win rate	Profit factor	Max drawdown	Return	Trades	Config summary
2026-04-24	46.80	62.8%	2.26	14.5%	+1154.6%	156	conf=0.633, risk=3.0%, TP=1.2×ATR
2026-04-24	21.34	65.8%	3.03	7.1%	+390.9%	114	Phase 16 fixes (RF weight, TP<SL)
2026-04-24	7.77	50.3%	1.71	3.6%	+71.5%	344	conf=0.633, risk=2.0%, TP=0.8×ATR
2026-04-24	6.70	50.7%	1.61	4.1%	+66.5%	343	conf=0.633, risk=2.5%, TP=0.8×ATR
2026-04-20	21.34	78.5%	2.43	1.8%	+50.7%	274	conf=0.6, risk=2.5%, TP=0.8×ATR
2026-04-19	4.84	67.4%	1.38	3.7%	+51.6%	717	conf=0.6, risk=2.5%, TP=0.8×ATR
2026-04-18	4.82	71.2%	1.87	7.6%	+323.0%	483	conf=0.6, risk=1.0%, TP=0.8×ATR

Score formula

The optimizer maximizes a single composite score that balances return against risk:

score = profit_factor × sharpe × (1 - max_drawdown) × log(1 + trades)

A score improvement of ≥ 2% over baseline is required to commit new models. This prevents the pipeline from committing marginal or noise-driven improvements. A new week’s baseline is the score recorded at the end of the previous week’s run.
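The formula above can be sketched in Python as follows. This is a minimal illustration, not the pipeline's own code: the function name composite_score is hypothetical, max_drawdown is assumed to be a fraction (0.145 for 14.5%), and log is assumed to be the natural logarithm. The log base and any additional scaling used by the optimizer are not stated here, so this sketch may not reproduce the tabulated scores exactly.

```python
import math


def composite_score(profit_factor: float, sharpe: float,
                    max_drawdown: float, trades: int) -> float:
    """Composite score following the formula in this section.

    Assumptions (not confirmed by the source):
    - max_drawdown is a fraction in [0, 1], e.g. 0.145 for 14.5%
    - log is the natural logarithm
    """
    if not 0.0 <= max_drawdown <= 1.0:
        raise ValueError("max_drawdown must be a fraction in [0, 1]")
    if trades < 0:
        raise ValueError("trades must be non-negative")
    # log1p(trades) == log(1 + trades); it is 0 when there are no trades
    return profit_factor * sharpe * (1.0 - max_drawdown) * math.log1p(trades)


# Illustrative inputs only, not taken from the results tables
example = composite_score(profit_factor=2.0, sharpe=1.5,
                          max_drawdown=0.10, trades=99)
```

The log(1 + trades) term grows slowly, so the score rewards a larger sample of trades without letting trade count dominate the other factors.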

OOS validation methodology

Data split

The dataset is split into in-sample (IS) and out-of-sample (OOS) segments. The IS window is used for training and Optuna tuning. The OOS window is never touched during training.

Sweep on first 70%

The config sweep (scripts/sweep.py) runs only on the first 70% of the OOS window. This is the “search” OOS.

Final eval on full OOS

After the best config is identified, the full OOS window is backtested. The last 30% of OOS data — the true holdout — is never seen during the sweep.
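The 70/30 partition of the OOS window described above could look like the sketch below. It assumes the OOS data is a chronologically ordered sequence of bars and that the split is made by bar count; the function name split_oos and the search_pct parameter are hypothetical and are not taken from scripts/sweep.py.

```python
from typing import Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_oos(bars: Sequence[T], search_pct: int = 70) -> Tuple[Sequence[T], Sequence[T]]:
    """Split chronologically ordered OOS bars into (search, holdout).

    search  : first search_pct percent of bars, visible to the config sweep
    holdout : remaining bars, only touched by the final full-OOS evaluation
    """
    if not 0 < search_pct < 100:
        raise ValueError("search_pct must be between 1 and 99")
    # Integer arithmetic avoids floating-point rounding at the cut point
    cut = len(bars) * search_pct // 100
    return bars[:cut], bars[cut:]


# Stand-in data: ten bars identified by their index
oos_bars = list(range(10))
search, holdout = split_oos(oos_bars)
```

Because the split is purely positional, the two segments never overlap and concatenating them restores the full OOS window used for the final evaluation.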

Rollback check

If the full-OOS score does not beat the stored baseline by ≥ 2%, all models and configs roll back to the previous version.
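A minimal sketch of this rollback decision, assuming the 2% threshold is relative to the stored baseline score. The names beats_baseline and decide are hypothetical, and the actual pipeline may handle edge cases (such as a missing or negative baseline) differently.

```python
def beats_baseline(new_score: float, baseline_score: float,
                   min_improvement: float = 0.02) -> bool:
    """True when new_score exceeds the baseline by at least min_improvement.

    Assumption: the improvement is measured relative to the baseline.
    abs() keeps the required margin positive if the baseline is negative.
    """
    required = baseline_score + abs(baseline_score) * min_improvement
    return new_score >= required


def decide(new_score: float, baseline_score: float) -> str:
    """Return 'commit' to keep the new models and config, else 'rollback'."""
    return "commit" if beats_baseline(new_score, baseline_score) else "rollback"


# Illustrative scores only, not taken from the results tables
print(decide(10.3, 10.0))  # 3% better than baseline
print(decide(10.1, 10.0))  # only 1% better than baseline
```

Under these assumptions the first call yields a commit and the second a rollback, since a 1% gain falls below the required margin.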

Always use python backtest.py --oos-only for realistic validation. Avoid generic lookback backtests: their window can overlap the training data, so they are subject to in-sample contamination.

Reading the backtest output

python backtest.py --balance 500 --no-swap --leverage 500 --spread 16.95 --oos-only --no-chart

Key output fields:

Field	What it means
`win_rate`	% of trades closed profitably
`profit_factor`	Gross profit ÷ gross loss — above 1.5 is good
`max_drawdown`	Largest peak-to-trough equity drop during the OOS period
`sharpe`	Risk-adjusted return (annualized)
`return`	Total % return over the OOS period
`score`	Composite score used by the optimizer
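To make the field definitions above concrete, the sketch below computes win rate, profit factor, max drawdown, and total return from a list of per-trade profits and the resulting equity curve. The function names and the sample numbers are hypothetical; backtest.py may compute these fields differently, for example in how it treats break-even trades. The annualized Sharpe ratio is left out because its annualization factor is not stated here.

```python
from typing import List


def win_rate(pnls: List[float]) -> float:
    """Percentage of trades closed with a profit (break-even counts as a non-win)."""
    return 100.0 * sum(1 for p in pnls if p > 0) / len(pnls)


def profit_factor(pnls: List[float]) -> float:
    """Gross profit divided by gross loss; infinite when there are no losses."""
    gross_profit = sum(p for p in pnls if p > 0)
    gross_loss = -sum(p for p in pnls if p < 0)
    return float("inf") if gross_loss == 0 else gross_profit / gross_loss


def max_drawdown(equity: List[float]) -> float:
    """Largest peak-to-trough drop, as a fraction of the running peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def total_return(equity: List[float]) -> float:
    """Total percentage return from the first to the last equity value."""
    return 100.0 * (equity[-1] - equity[0]) / equity[0]


# Illustrative trades on a starting balance of 500 (not real results)
pnls = [10.0, -5.0, 20.0, -5.0]
equity = [500.0]
for p in pnls:
    equity.append(equity[-1] + p)
```

With these sample trades, two of four trades win, gross profit is 30 against a gross loss of 10, and the deepest drawdown is the 5-unit dip from the 510 peak.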

Hardware note

Backtest results shown here used:

Spread: 16.95 pts (VT Markets BTCUSD typical)
Leverage: 500
Starting balance: 500 USC
Swap: disabled (VT Markets cent account)

Results on other brokers or with swap enabled will differ.
