Roadmap - NOVOSKY Docs

Current state — Phase 17.3

Dimension	Value
Signal model	62 features · RF+XGB+LGB · 3-class · M15 · Optuna-tuned · recency-weighted
OOS (37d)	WR 71.7% · PF 5.46 · MaxDD 4.5% · Sharpe 59.54 · Return +195.7% · Score 54.47
OOS (224d)	WR 57.4% · PF 2.21 · MaxDD 50.2% (long-range reference)
Position model	71 features · ATR-aware labels · exit_threshold=0.80
Training cutoff	2026-05-06 · Hugging Face Hub tag `v20260506`
Infrastructure	ONNX inference · Hugging Face Hub · Supabase sync
Incremental training	Warmstart enabled by default (LGB+XGB continuation, RF tree-append)

Q2 2026 (Apr – Jun) — Production hardening + quick wins

Items are ordered by implementation impact and dependency.

Priority	Phase	Item
Done	15	Broker-agnostic refactor — dynamic MT5 API, `backtest.py`, `sweep.py`
Critical	15	VM cron — `weekly_optimize.py` (Sunday 02:00 UTC)
Critical	15	Broker safety audit — `python scripts/check_broker_limits.py`
Done	16	RF double-weighting fix — removed `class_weight='balanced'` from constructors
Done	16	TP corrected 0.8→1.2×ATR — R:R now 1.2, breakeven WR=45.5%
Done	16	Label/live alignment — labels now use atr_tp_mult=1.2, atr_sl_mult=1.0
Done	16	Risk model — Kelly log-utility label function + sequence augmentation (~5,000 samples)
High	16	Enable `max_daily_loss` guard — already built; config change only (set to 15 USD)
Done	16	Equity curve filter — 10-trade rolling drawdown pause
Done	16	Walk-forward OOS gate — 3 OOS folds, WR/PF gate in `weekly_optimize.py`
Done	16	Enable Kelly lot sizing — half-Kelly (0.5 fraction) enabled; MaxDD 16.1%→7.5%
Done	21	`ml_sltp` — validated OOS; disabled (degraded Score); re-evaluate with more live data
Done	21	Trailing stop + `min_bars_held` sweep — both tested; trailing disabled, bars4 optimal
Done	18	Calibrated probability outputs — isotonic calibration live for all 3 signal models
Medium	16	Multi-account MVP — `--account` flag, per-account `state_{account}.json`
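
The R:R and half-Kelly rows above rest on two short formulas. The sketch below is illustrative only: the function names are hypothetical, costs such as spread and commission are ignored, and it uses the textbook Kelly fraction f = p - (1 - p) / b, which may differ from the project's log-utility implementation.

```python
def breakeven_win_rate(reward_to_risk: float) -> float:
    """Win rate at which expected P&L is zero for a given reward:risk ratio."""
    if reward_to_risk <= 0:
        raise ValueError("reward_to_risk must be positive")
    return 1.0 / (1.0 + reward_to_risk)


def kelly_fraction(win_rate: float, reward_to_risk: float, scale: float = 0.5) -> float:
    """Textbook Kelly fraction multiplied by `scale` (0.5 = half-Kelly), floored at 0."""
    if reward_to_risk <= 0:
        raise ValueError("reward_to_risk must be positive")
    full = win_rate - (1.0 - win_rate) / reward_to_risk
    return max(0.0, full * scale)


if __name__ == "__main__":
    # TP = 1.2 x ATR and SL = 1.0 x ATR give R:R = 1.2
    print(f"breakeven WR at R:R 1.2: {breakeven_win_rate(1.2):.1%}")  # 45.5%
    # The 60% win rate here is a made-up input, not a project result
    print(f"half-Kelly at WR 60%, R:R 1.2: {kelly_fraction(0.60, 1.2):.3f}")
```

At R:R 1.2 the breakeven win rate is 1 / 2.2, which reproduces the 45.5% quoted in the table.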

Q3 2026 (Jul – Sep) — Feature expansion + model architecture

Priority	Phase	Item
High	17	`funding_rate` + `oi_change` features (Binance REST) — retrain + OOS gate
High	17	`volatility_regime` + `w1_ema_bias` / `w1_rsi_norm` — OHLCV-derived, no external API
High	17	OHLCV data redundancy — yfinance + OKX REST as gap-fill sources
High	18	Stacked meta-learner — replace majority-vote; strict OOS gate: Score ≥ current + 1.0
High	21	M1 intra-candle features for position model — 5 features, requires retrain
High	21	Catastrophic SL / position-as-primary-exit — gated on 21.1/21.2 + EXIT precision ≥ 87%
Medium	18	Regime-switching model — trending vs ranging via ADX; gated on ≥ 20k samples per segment
Done	19	API failover — `SAFE_PAUSE` after 3 consecutive failures
Done	19	Graceful shutdown — finish-cycle exit instead of hard `sys.exit(0)`
Medium	19	Config hot-reload — poll `config.json` mtime per cycle
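
The config hot-reload item can be sketched as a once-per-cycle mtime check. This is a minimal illustration, not the project's implementation: the class name `HotReloadConfig` and its behaviour on invalid JSON (keep the last good config and retry next cycle) are assumptions.

```python
import json
import os


class HotReloadConfig:
    """Reload a JSON config only when the file's modification time changes."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._mtime_ns = -1  # sentinel so the first poll always loads
        self.data: dict = {}

    def poll(self) -> bool:
        """Call once per trading cycle; returns True if the config was reloaded."""
        try:
            mtime_ns = os.stat(self.path).st_mtime_ns
        except FileNotFoundError:
            return False  # file missing: keep the last good config
        if mtime_ns == self._mtime_ns:
            return False  # unchanged since the last successful load
        try:
            with open(self.path, "r", encoding="utf-8") as fh:
                new_data = json.load(fh)
        except (OSError, json.JSONDecodeError):
            # Possibly a half-written file: keep the old config, retry next cycle
            return False
        self.data = new_data
        self._mtime_ns = mtime_ns
        return True
```

A main loop would call `cfg.poll()` at the top of each cycle and read settings from `cfg.data`. Recording the mtime only after a successful parse means a partially written file is retried rather than silently skipped.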

Q4 2026 (Oct – Dec) — Multi-asset expansion

Priority	Phase	Item
High	19	Live Streamlit dashboard — equity curve + open positions from Supabase
High	18	XAUUSD model — separate model stack, gold-specific ATR scaling, OOS PF > 2.0 gate
Medium	16	Multi-account fully wired — at least 2 brokers running simultaneously

Q1 2027 (Jan – Mar) — Multi-asset live + revenue

Priority	Phase	Item
High	18	EURUSD model — pip-based P&L, tighter spreads, forex news calendar
High	18	Multi-instrument live trading (BTCUSD + XAUUSD simultaneously)
Medium	17	`fear_greed_index` — daily sentiment (96 identical M15 values; monitor SHAP weight)
Medium	19	Per-instrument dashboards and Telegram alerts
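
The "96 identical M15 values" note on `fear_greed_index` follows from the bar arithmetic: one daily reading spans 24 × 4 = 96 fifteen-minute bars. The sketch below assumes the daily value is simply repeated across the UTC day; the function name and the sample reading of 55.0 are hypothetical.

```python
from datetime import datetime, timedelta, timezone

BARS_PER_DAY = 24 * 60 // 15  # 96 M15 bars per UTC day


def expand_daily_to_m15(day_start: datetime, value: float) -> list[tuple[datetime, float]]:
    """Repeat one daily reading on every M15 bar of that day."""
    return [
        (day_start + timedelta(minutes=15 * i), value)
        for i in range(BARS_PER_DAY)
    ]


if __name__ == "__main__":
    bars = expand_daily_to_m15(datetime(2026, 5, 6, tzinfo=timezone.utc), 55.0)
    # 96 bars carrying a single distinct value
    print(len(bars), len({v for _, v in bars}))
```

Because the feature is constant within a day, it adds no intraday variance, which is why the roadmap flags its SHAP weight for monitoring.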

Milestones

M1 — Full production hardening (target: May 2026)

Goal: zero-touch live operation with monitoring, alerts, and auto-maintenance.

Score >= 15.0 (achieved: 21.34 in Phase 15, Phase 16 OOS WR=65.8% PF=3.03, Phase 17.2 weekly Score=35.13, Phase 17.3 weekly Score=18.48)
Position model live with conservative thresholds
Weekly auto-retrain cron running on trading VM (code done — VM wiring pending)
PM2 performance monitor running every 4h (code done — VM wiring pending)
Telegram bot commands operational
Broker safety audit completed (docs done — live run pending)
max_daily_loss guard enabled (config change only)
Equity curve filter implemented and live
Walk-forward OOS gate wired into weekly_optimize.py
Weekly Score review cadence: Score < 10 → retrain, Score < 6 → pause

Pass criteria: Bot runs 7d without manual intervention; Telegram alert fires on simulated degradation.

M2 — Dynamic SL/TP + position model upgrades (target: Jun 2026)

Goal: confidence-aware exits that outperform the current static 1:1.2 RR (TP 1.2×ATR, SL 1.0×ATR since Phase 16).

Calibrated probability outputs live (isotonic calibration on all 3 signal models)
ml_sltp validated OOS — result: disabled (best combo Score=0.29 vs baseline 0.30; re-evaluate with more live data)
Trailing stop validated OOS — result: disabled (Score≈0.19; conflicts with partial-close breakeven SL)
min_bars_held sweep completed — optimum reverted to 4 bars after a clean retrain (Score=0.66)
Kelly lot sizing enabled and compared OOS — half-Kelly (0.5) kept; MaxDD 16.1%→7.5%

Pass criteria: OOS Score ≥ 22.0 — not yet met (ml_sltp and trailing stop degraded OOS; re-evaluate after M3 feature upgrade).

M3 — Signal intelligence upgrade (target: Aug 2026)

Goal: measurably better OOS accuracy through new features and a better combiner.

funding_rate + oi_change features retrained and swept
volatility_regime + W1 features retrained and swept
Stacked meta-learner deployed (OOS Score ≥ current + 1.0)
Walk-forward: 3+ OOS folds, all WR > 55%, PF > 1.8
M1 intra-candle features for position model retrained and swept

Pass criteria: OOS Score ≥ 22.0 on 37d window; position model EXIT precision ≥ 86%.
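
The walk-forward item above (3+ OOS folds, all with WR > 55% and PF > 1.8) can be expressed as a simple all-folds check. This is a sketch under stated assumptions, not the gate in `weekly_optimize.py`: `FoldResult` and `passes_walk_forward_gate` are hypothetical names, and win rate is assumed to be stored as a fraction.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class FoldResult:
    win_rate: float       # fraction of winning trades, e.g. 0.62
    profit_factor: float  # gross profit divided by gross loss


def passes_walk_forward_gate(
    folds: Sequence[FoldResult],
    min_folds: int = 3,
    min_wr: float = 0.55,
    min_pf: float = 1.8,
) -> bool:
    """True only if there are enough folds and every fold clears both thresholds."""
    if len(folds) < min_folds:
        return False
    return all(f.win_rate > min_wr and f.profit_factor > min_pf for f in folds)
```

Requiring every fold to pass, rather than the average, prevents one strong window from masking a weak one.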

M4 — Multi-asset live (target: Q4 2026 – Q1 2027)

Goal: second live trading instrument + multi-account operation.

XAUUSD model validated (OOS PF > 2.0) and live
EURUSD model in development (Q1 2027 target)
Regime-switching model deployed (if OOS gates pass)
Multi-account trading operational (at least 2 broker accounts)

Pass criteria: XAUUSD running live alongside BTCUSD; multi-account stable for 30d.

What was removed and why

The following items were in earlier roadmap drafts and have been removed.

Item	Reason
Optuna 150 trials + MedianPruner	`MedianPruner` already active in both tune files. Trials are a `--trials` flag — use `--trials 100` for architecture changes. Not a phase item.
Online/incremental weekly updates	Already built. Warmstart mode enabled by default in `ml/train.py` and `ml/ensemble_trainer.py`.
Confidence-based lot tiers (60/70/80%)	Kelly already does this continuously and mathematically. Discrete tiers are a cruder approximation.
Ensemble weighting by recency	Walk-forward already handles recency more robustly. Recency weighting adds a tuneable parameter with overfit risk.
`corr_spy_20` / `corr_dxy_20`	BTC–SPY/DXY correlation at M15 changes regimes frequently. Would overfit to the 2020–2026 macro correlation period.
`liq_distance_buy/sell` (Coinglass)	Historical liquidation data is non-standardized, exchange-dependent, and not reliably reproducible for backtesting. High lookahead risk.
`dom_sin` / `dom_cos`	No statistically meaningful monthly seasonality in BTC M15 data. Weak signal, overfit risk.
Synthetic OFI	Without level-2 order book data, any OFI proxy is a linear combination of OHLCV features the model already has. Redundant.
`h4_macd_hist` / `d1_macd_hist`	`h4_macd_dir` already in feature set. Histogram adds a fine-grained continuous variable over a direction signal — marginal gain, overfit surface.
Options skew (Deribit 25D RR)	Sparse historical data, evolving market structure, unclear relationship to M15 BTCUSD direction. Very high overfit risk.
Grafana + InfluxDB	Supabase already handles storage and sync. Adding a second metrics pipeline is premature complexity for the current scale.
TradingView as OHLCV source	ToS violation risk; scrapers break silently. Use official REST APIs only (yfinance, OKX, Binance).

Score formula

All retrains and config sweeps are ranked by:

Score = WR × PF / sqrt(MaxDD)
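
A minimal sketch of the formula, paired with the bands from this section's interpretation table. The unit convention is an assumption, since the roadmap does not state it: WR and MaxDD are taken as fractions (71.7% as 0.717). The function names are hypothetical, and scores of exactly 6.0 or 10.0 are assigned to the higher band.

```python
import math


def score(win_rate: float, profit_factor: float, max_drawdown: float) -> float:
    """Score = WR * PF / sqrt(MaxDD), with WR and MaxDD as fractions (assumed)."""
    if max_drawdown <= 0:
        raise ValueError("max_drawdown must be positive")
    return win_rate * profit_factor / math.sqrt(max_drawdown)


def interpret(s: float) -> str:
    """Map a Score onto the roadmap's interpretation bands."""
    if s < 6.0:
        return "pause"
    if s < 10.0:
        return "retrain"
    if s <= 15.0:
        return "production-grade"
    return "excellent"


if __name__ == "__main__":
    s = score(0.717, 5.46, 0.045)  # 37d OOS figures from the current-state table
    print(f"{s:.2f} -> {interpret(s)}")
```

Under this convention the 37d OOS inputs give about 18.45, close to the 18.48 weekly figure quoted under M1 but not the 54.47 headline value, so the exact units and any additional terms should be confirmed against the code.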

Score	Interpretation
< 6.0	Pause live trading — manual review required
6.0–10.0	Operational but below target — trigger retrain
10.0–15.0	Production-grade
> 15.0	Excellent — Phase 15: 21.34 (37d OOS); Phase 16w: 46.80 (WR=62.8%); Phase 17.2w: 35.13 (WR=67.6%, PF=2.83); Phase 17.3w: 54.47 (WR=71.7%, PF=5.46)

Version history

Phase	Date	Key change	OOS Score
1–4	2025	Initial ML model, SHAP, ATR labeling	—
5–8	2025	Phase 8 position model, Kelly sizing	—
9–11	2026-04-11	M15 timeframe, 53 features, MTF ladder	~4.5
12	2026-04-11	ADX filter, spread fix, OOS contamination fix	~5.1
13	2026-04-12	55 features, news_surprise/bars_since_news, Optuna retune	~3.5
14	2026-04-13	Config sweep, TP=0.8xATR, risk=2%, conf=0.60	15.49
15	2026-04-15	Position model ATR-aware retrain, threshold sweep	21.34
16	2026-04-24	RF double-weighting fix, TP 0.8→1.2×ATR, label/live alignment, Kelly risk model	WR=65.8% PF=3.03
16w	2026-04-24	Weekly optimize: Profile 4 Growth, risk=3.0%, cb=7	Score=46.80 WR=62.8% Return=+1154.6%
17.2	2026-04-29	Phase 17 features (volatility_regime, w1_ema_bias, w1_rsi_norm), weekly optimize, conf=0.65	Score=35.13 WR=67.6% PF=2.83
17.3	2026-05-08	Single-source training (revert multi-source), 365d recency decay weights, fresh retrain	Score=54.47 WR=71.7% PF=5.46
21	planned	ml_sltp, trailing stop, M1 features, primary-exit mode	TBD
17	planned	funding_rate, OI (remaining Phase 17 items)	TBD
18	planned	Meta-learner, calibration, regime model, XAUUSD	TBD

Active task list with checkboxes → TODO. Archived phases 1–12 → ARCHIVE.md in the repository root.
