NOVOSKY uses one backtest engine for realistic evaluation: backtest_config.py.
| Tool | When to use | Expected WR |
|---|---|---|
| backtest_config.py | Real performance estimate, after retraining or config changes | ~57–86% (ATR-based) |
Data contamination: the default --days 365 window overlaps the training data. Always pass --oos-only for real OOS results. This flag reads train_cutoff_date from ensemble_btcusd-live_metadata.json and tests only on data after that date.
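The out-of-sample filter described above can be sketched in a few lines of Python. This is a minimal illustration, not the code inside backtest_config.py: it assumes train_cutoff_date is a top-level, ISO-formatted string in the metadata JSON, and that bars are dicts with a "time" datetime. The helper names parse_cutoff, load_train_cutoff and oos_only are hypothetical.

```python
import json
from datetime import date, datetime
from pathlib import Path


def parse_cutoff(meta):
    """Extract the cutoff date; assumes a top-level ISO-formatted string."""
    return date.fromisoformat(str(meta["train_cutoff_date"])[:10])


def load_train_cutoff(metadata_path):
    """Read the metadata JSON from disk and return its training cutoff date."""
    return parse_cutoff(json.loads(Path(metadata_path).read_text()))


def oos_only(bars, cutoff):
    """Keep only bars dated strictly after the training cutoff."""
    return [bar for bar in bars if bar["time"].date() > cutoff]


# Illustrative values only; the real cutoff comes from the metadata file.
cutoff = parse_cutoff({"train_cutoff_date": "2026-03-01"})
bars = [
    {"time": datetime(2026, 2, 28, 23, 0), "close": 1.0},
    {"time": datetime(2026, 3, 1, 12, 0), "close": 2.0},
    {"time": datetime(2026, 3, 2, 0, 0), "close": 3.0},
]
oos_bars = oos_only(bars, cutoff)
print(len(oos_bars))
```

Any bar on or before the cutoff date is treated as in-sample here, which matches the "only after that date" wording. Whether the real tool handles the boundary day the same way is an assumption.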
Config-faithful backtest (recommended)
backtest_config.py reads your actual config.json + ml_config.json and simulates the live bot bar-by-bar. This is the realistic live performance estimate.
# True OOS — always use this flag for real results
python backtest_config.py \
--balance 500 --no-swap --leverage 500 \
--spread 16.95 --oos-only --no-chart
# With realistic lot cap (prevents unlimited compounding)
python backtest_config.py \
--balance 500 --no-swap --leverage 500 \
--spread 16.95 --oos-only --max-lot 1.0 --no-chart
# VT Markets Cent account
python backtest_config.py \
--balance 500 --no-swap --leverage 500 \
--spread 16.95 --oos-only --cent-account \
--max-lot 0.1 --no-chart
# IC Markets RAW spread
python backtest_config.py \
--balance 10000 --days 365 --swap-long 20 \
--leverage 200 --spread 3.0 --no-chart
Key flags
| Flag | Why it matters |
|---|---|
| --oos-only | Always use this. Without it, you’re testing on in-sample data and will see ~97% WR. That is not real. |
| --spread | VT Markets BTCUSD = 16.95, IC Markets RAW = 3.0. Wrong value inflates WR by 5–8 percentage points. |
| --cent-account | Only if your broker reports balance in USC. VT Markets Cent uses this. |
| --max-lot | Caps lot size. Without this, compounding hits max_lot fast and inflates returns. Use 1.0 for realistic results. |
| --no-chart | Skip matplotlib output (required for headless/server runs). |
| --oos-end DATE | Hard end date for the OOS window (YYYY-MM-DD). Truncates data after this date. Used by the weekly optimizer’s sweep phase to prevent config overfitting: the sweep only sees the first 70% of the OOS window. |
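To make the 70% split behind --oos-end concrete, here is a rough sketch of how such an end date could be derived. It assumes the split is measured in calendar days between the training cutoff and the last available bar; the weekly optimizer may split by bar count or another rule. The function name sweep_end_date and the dates are hypothetical.

```python
from datetime import date, timedelta


def sweep_end_date(oos_start, oos_end, percent=70):
    """Return the date that closes the first `percent` of the OOS window."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    total_days = (oos_end - oos_start).days
    # Integer arithmetic avoids float rounding at the window boundary.
    return oos_start + timedelta(days=total_days * percent // 100)


# Hypothetical window: cutoff on 2026-03-01, data available to 2026-04-10.
end = sweep_end_date(date(2026, 3, 1), date(2026, 4, 10))
print(f"--oos-end {end.isoformat()}")
```

The remaining 30% of the window stays unseen by the sweep, so a config chosen on the first part can still be checked on data it was not tuned against.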
Preset-style comparisons with backtest_config.py
The old fixed-setup wrapper has been removed. Use backtest_config.py directly and vary the inputs you care about:
# Smaller account / conservative cap
python backtest_config.py --balance 200 --no-swap --leverage 500 --spread 16.95 --oos-only --max-lot 0.1 --no-chart
# Growth account / looser cap
python backtest_config.py --balance 1000 --no-swap --leverage 500 --spread 16.95 --oos-only --max-lot 1.0 --no-chart
# Cent account
python backtest_config.py --balance 500 --no-swap --leverage 500 --spread 16.95 --oos-only --cent-account --max-lot 0.1 --no-chart
Interpreting results
WR interpretation
backtest_config.py is the live-expectation tool. Its WR is lower than what the old preset wrapper reported because it uses ATR-based SL/TP and your real config. Use it as the final gate before deployment.
Phase performance history
| Phase | OOS window | WR | PF | MaxDD | Return |
|---|---|---|---|---|---|
| 11 | 224d (Sep 2025–Apr 2026) | 57.4% | 2.23 | 50.2% | +56,136% |
| 15 | 37d (Mar–Apr 2026) | 78.5% | 2.43 | 1.8% | +50.7% |
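For readers unfamiliar with the column names, the sketch below computes WR, PF, MaxDD and Return from a list of per-trade profits using their standard definitions. It is an assumption that backtest_config.py computes them exactly this way (for example, it may measure drawdown on floating equity instead of closed-trade equity). The function summarize and the sample trades are hypothetical.

```python
def summarize(pnls, start_balance):
    """Compute win rate, profit factor, max drawdown and return from trade PnLs."""
    wins = [p for p in pnls if p > 0]
    losses = [p for p in pnls if p < 0]
    win_rate = len(wins) / len(pnls) if pnls else 0.0
    gross_loss = -sum(losses)
    profit_factor = sum(wins) / gross_loss if gross_loss else float("inf")

    # Walk the closed-trade equity curve and track the deepest peak-to-trough drop.
    equity = peak = start_balance
    max_dd = 0.0
    for p in pnls:
        equity += p
        peak = max(peak, equity)
        max_dd = max(max_dd, (peak - equity) / peak)

    return {
        "wr": win_rate,
        "pf": profit_factor,
        "max_dd": max_dd,
        "return": (equity - start_balance) / start_balance,
    }


# Made-up trades for illustration only.
stats = summarize([10, -5, 20, -10, 5], start_balance=100)
print(stats)
```

Read together, the columns show why WR alone is not enough: a run can have a modest WR with a high PF if winners are larger than losers, and a large Return can come with a deep MaxDD.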
The Phase 15 OOS window is much shorter (37 days): enough to validate post-cutoff behavior, but less statistically robust than Phase 11’s 224-day window. Use both reference points when evaluating retraining results.
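One way to see why a shorter window is less robust is to put a confidence interval around the win rate. The sketch below uses the Wilson score interval. The trade counts are hypothetical and chosen only to show the effect of sample size, since the table does not report how many trades each phase produced.

```python
import math


def wilson_interval(wins, trades, z=1.96):
    """Wilson score interval for a win rate (z=1.96 gives roughly 95% coverage)."""
    if trades <= 0:
        raise ValueError("trades must be positive")
    p = wins / trades
    denom = 1 + z**2 / trades
    center = (p + z**2 / (2 * trades)) / denom
    half = z * math.sqrt(p * (1 - p) / trades + z**2 / (4 * trades**2)) / denom
    return center - half, center + half


# Hypothetical trade counts with the same 78% win rate.
short_lo, short_hi = wilson_interval(wins=39, trades=50)
long_lo, long_hi = wilson_interval(wins=390, trades=500)
print(f"50 trades:  {short_lo:.3f} to {short_hi:.3f}")
print(f"500 trades: {long_lo:.3f} to {long_hi:.3f}")
```

With the same observed WR, the interval from the smaller sample is noticeably wider, so a high WR over a short window should be read as a weaker signal than a similar WR over a long one.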
Score metric
Results are ranked by:
Higher is better. Latest Phase 15 weekly score: 21.34.