Current state: Phase 17.3 (active)

Signal model

62 features · RF+XGB+LGB · 3-class · M15 · Optuna-tuned · recency-weighted
  • Training cutoff: 2026-05-06 20:04 UTC
  • Hugging Face Hub tag: v20260506

Position model

71 features (62 market + 4 state + 5 M1)
  • ATR-aware labels · exit_threshold=0.80
  • Accuracy: RF=22.4%, XGB=76.1%, LGB=60.2%
  • 181,414 training samples

OOS performance

| Window | WR | PF | MaxDD | Sharpe | Return | Score |
| --- | --- | --- | --- | --- | --- | --- |
| 37d, Phase 17.3 (weekly optimize, active) | 71.7% | 5.46 | 4.5% | 59.54 | +195.7% | 54.47 |
| 38d, Phase 17.2 (weekly optimize) | 67.6% | 2.83 | 7.3% | 42.30 | +394.5% | 35.13 |
| 38d, Phase 16w (weekly optimize) | 62.8% | 2.26 | 14.5% | 37.88 | +1154.6% | 46.80 |
| 37d, Phase 16 (base retrain) | 65.8% | 3.03 | 7.1% | 49.88 | +390.9% | n/a |
| 37d, Phase 15 | 78.5% | 2.43 | 1.8% | 32.99 | +50.7% | n/a |
| 224d (Sep 2025–Apr 2026) | 57.4% | 2.21 | 50.2% | 13.60 | n/a | n/a |

Active config: conf=0.65 · prob_diff=0.08 · risk=2.0% · TP=1.0×ATR · SL=1.0×ATR · CB=7 · Profile 3 Balanced

In progress β€” immediate blockers

  • VM cron (15.2): Wire weekly_optimize.py on trading VM
  • Phase 15.5: Broker safety audit documented; see Broker safety audit

Phase 15 β€” Production transition

15.0 Broker-agnostic refactor

Complete: 2026-04-18
  • config.json no longer requires "broker" key; all scripts load spread/leverage/swap dynamically from MT5 API /account and /symbols
  • backtest.py: config-faithful backtester; use --max-lot/--cent-account for preset-style comparisons
  • scripts/sweep.py: unified sweep replacing sweep_signal.py + sweep_pos_model.py; --target signal|pos|both
  • scripts/check_broker_limits.py: fixed main guard; added --symbol flag

15.1 Position model validation

Complete: 2026-04-15
  • 38d OOS: PF=4.71 vs baseline 4.27 (+10.3%), MaxDD=4.7%, 1 ML_Exit
  • Sweep of 13 configs via scripts/sweep.py --target pos
  • Thresholds updated: exit=0.80, min_prob_diff=0.25, min_bars_held=4

15.2 Automated retrain pipeline

Complete: 2026-04-17
  • scripts/weekly_optimize.py: 13-phase autonomous pipeline (SHAP → tune → retrain → sweep → evaluate → push → commit → notify)
  • Score improvement gate: 2% minimum to keep new models
  • Incremental warmstart training already enabled by default (ml/train.py, ml/ensemble_trainer.py)
VM task pending: Wire cron on trading VM:
0 2 * * 0  cd /path/NOVOSKY && .venv/bin/python scripts/weekly_optimize.py >> logs/weekly.log 2>&1

15.3 Cloud monitoring

Complete: Telegram bot commands cover live status; performance_monitor.py removed
  • Degradation detection is handled by the weekly walk-forward OOS gate in weekly_optimize.py (3 × 4-week folds, WR ≥ 55%, PF ≥ 1.8)
  • Live alerts flow through trading/telegram_commands.py and scripts/notify.py

15.4 Telegram bot commands

Complete: 2026-04-15
/help · /pause · /resume · /status · /positions · /close · /closeall · /news · /pnl · /latency
Auth-gated by TELEGRAM_CHAT_ID. Pause flag wired into main trade entry loop.

15.5 Broker safety audit

Documented: 2026-04-25
Full audit guide at Broker safety audit: tick size, stops level, lot limits, swap status, pass/fail criteria, and per-broker reference values (RoboForex, IC Markets).
  • Run audit against live RoboForex server: python scripts/check_broker_limits.py. Result: PASS (2026-05-10: all limits confirmed, spread=14.59, stops_level=0, lot 0.01–100, contract_size=1.0)
  • Run audit against IC Markets when account is provisioned
  • Run audit against FundingPips when account is provisioned

15.6 Weekly validation cadence

Complete: 2026-04-17
Covered by weekly_optimize.py, which runs a full OOS backtest every Sunday and auto-rolls back if Score regresses.

Phase 16 β€” Risk guards & validation

Phase 16 fixes landed 2026-04-24: RF double-weighting removed (val acc 10.5%→49.4%), TP corrected 0.8→1.2×ATR, labels aligned with live execution, risk model rebuilt with Kelly log-utility label function and sequence augmentation (~5,000 training samples, label std 0.000→0.442, val MAE=0.3084). OOS result: WR=65.8%, PF=3.03, MaxDD=7.1%, Sharpe=49.88, Return=+390.9%, 114 trades over 37 days.

16.1 Enable daily loss guard

Complete: 2026-05-10
max_daily_loss_pct is active in trading/bot.py (check_max_loss_profit()). Set to 3.0% of live equity, recomputed each cycle from account.equity.
Formula (dynamic daily loss limit): The guard scales with current equity, not a hard USD constant. At $500 equity: limit = $15. At $2418: limit = $72.54.
# bot.py guard logic (active):
_loss_pct = config.get("max_daily_loss_pct")  # 3.0
_equity = _api_balance_to_usd(_acct.equity)
max_daily_loss = _equity * (_loss_pct / 100.0)
if daily_pl <= -max_daily_loss:
    send_telegram_message("🚨 [DAILY_LOSS_GUARD] ...")
    time.sleep(3600)
  • Confirmed trading/bot.py:check_max_loss_profit() used fixed USD; upgraded to equity-relative
  • Added max_daily_loss_pct: 3.0 to config.json; removed max_daily_loss: 99999
  • Bot computes limit = equity * pct / 100 dynamically each cycle via _get_account().equity
  • Dry-run: python trading.py --dry (guard logs [DAILY_LOSS_GUARD] on breach)
  • Daily reset uses LOCAL_TZ_OFFSET (+7 WIB) and resets at WIB midnight ✅
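The equity-relative limit and the WIB-midnight reset described above can be sketched as follows. This is a minimal illustration, not the code in trading/bot.py: the function names `daily_loss_limit`, `guard_tripped`, and `trading_day` are hypothetical, and `now_utc` is assumed to be a UTC datetime.

```python
from datetime import date, datetime, timedelta


def daily_loss_limit(equity_usd: float, max_daily_loss_pct: float = 3.0) -> float:
    """USD loss that trips the guard: equity * pct / 100."""
    return equity_usd * (max_daily_loss_pct / 100.0)


def guard_tripped(daily_pl_usd: float, equity_usd: float,
                  max_daily_loss_pct: float = 3.0) -> bool:
    """True once the day's P/L is at or below the negative limit."""
    return daily_pl_usd <= -daily_loss_limit(equity_usd, max_daily_loss_pct)


def trading_day(now_utc: datetime, tz_offset_hours: int = 7) -> date:
    """Calendar day used for the daily reset, shifted to local time (+7 = WIB)."""
    return (now_utc + timedelta(hours=tz_offset_hours)).date()


print(round(daily_loss_limit(500.0), 2))   # 15.0
print(round(daily_loss_limit(2418.0), 2))  # 72.54
```

Because the limit is derived from equity on every call, the same 3.0% setting tightens in USD terms after losses and loosens after gains, which is the behavior the two worked figures illustrate.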

16.2 Equity curve filter

Complete: 2026-04-25
  • equity_curve_filter config block added (enabled: false, lookback_trades: 10, max_drawdown_pct: 5.0)
  • _recent_trade_pnls persisted in state.json and restored on restart
  • Entry guard fires [EC_FILTER] when net loss over last N trades ≥ threshold % of equity
  • Enable after validating OOS backtest with the filter active
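The entry guard rule above (net loss over the last N trades compared with a percentage of equity) can be sketched like this. The function name `ec_filter_blocks_entry` is hypothetical, and the choice to stay open when fewer than N trades are recorded is an assumption, not confirmed behavior of the bot.

```python
def ec_filter_blocks_entry(recent_trade_pnls, equity_usd: float,
                           lookback_trades: int = 10,
                           max_drawdown_pct: float = 5.0) -> bool:
    """Return True when new entries should be blocked.

    recent_trade_pnls: per-trade P/L in USD, oldest first.
    """
    window = list(recent_trade_pnls)[-lookback_trades:]
    # Assumption: with an incomplete window (or no equity reading) the filter stays open.
    if len(window) < lookback_trades or equity_usd <= 0:
        return False
    net = sum(window)
    if net >= 0:
        return False  # net profit over the window never blocks
    loss_pct_of_equity = (-net / equity_usd) * 100.0
    return loss_pct_of_equity >= max_drawdown_pct
```

With the default config values, ten consecutive $5 losses on $500 equity amount to a 10% net loss and would block entries, while ten $1 losses (2%) would not.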

16.3 Extend walk-forward OOS gate

Complete: 2026-04-25
  • Phase 7b added to scripts/weekly_optimize.py between phase 7 and phase 8
  • 3 non-overlapping 4-week OOS backtest folds run after every retrain
  • Gate: all folds WR ≥ 55%, all folds PF ≥ 1.8, median PF ≥ 2.0
  • Failure rolls back the snapshot immediately and skips push/commit
  • Fold results appended to logs/wf_gate.log

16.4 Enable Kelly lot sizing

Kelly lot sizing is fully implemented in trading/bot.py:1193–1257 and is toggled via config.
At average confidence of 61.5%, the Kelly criterion cuts effective risk from 6% to ~2.1%, similar to what the current static risk_percent achieves. The value of enabling it is that it scales up naturally at high-confidence setups and scales down at marginal ones.
  • Enable: ml_active_management.kelly_lot_sizing.enabled = true (already enabled)
  • OOS sweep (3 runs via --mode kelly): disabled Score=0.19 MaxDD=16.1% | half=0.5 Score=0.30 MaxDD=7.5% | full=1.0 Score=0.31 MaxDD=8.7%
  • Kelly must stay on: when disabled, sizing falls back to raw static sizing and MaxDD spikes to 16.1%
  • Keeping max_kelly_fraction: 0.5 (half-Kelly): full-Kelly wins Score by 0.01 but costs 1.2% extra MaxDD, which is not worth it at $500 balance. Re-evaluate at higher equity.

16.5 Broker-Agnostic Multi-Account Architecture

The system is designed to be broker-agnostic: the ML pipeline and trade logic are decoupled from any specific broker (RoboForex, IC Markets, FundingPips, and so on). Training and live execution are intended to support multiple brokers simultaneously, and eventually decentralized exchanges (DEX) such as Hyperliquid.
config/accounts.json schema:
{
  "accounts": {
    "exness_main": {
      "broker_id": "exness",
      "terminal_key": "terminal-one",
      "account_type": "cent",
      "currency_multiplier": 0.01,
      "risk_pct": 3.0,
      "max_lot": 5.0,
      "state_file": "state_exness_main.json"
    },
    "icmarkets_main": {
      "broker_id": "icmarkets",
      "terminal_key": "terminal-two",
      "account_type": "standard",
      "currency_multiplier": 1.0,
      "risk_pct": 2.0,
      "max_lot": 1.0,
      "state_file": "state_icmarkets_main.json"
    }
  }
}
Multi-broker dataset merging (deduplication and normalization): Merging MT5 feeds from different brokers introduces duplicate timestamps (broker server time can differ by ±1 bar) and price divergence (spread differences). Normalization formula:
import pandas as pd

# Normalize close price across brokers to remove spread bias
def normalize_ohlcv(df: pd.DataFrame, broker_id: str) -> pd.DataFrame:
    spread = BROKER_SPREADS[broker_id]          # e.g., 14.59 USD for RoboForex
    df = df.copy()                               # leave the caller's frame untouched
    df["close_norm"] = df["close"] - spread / 2  # midpoint price
    return df

# Merge: outer join on timestamp, ffill broker gaps (up to 2 bars)
# df_vt and df_ic are per-broker frames that have already been passed through normalize_ohlcv()
merged = pd.merge(df_vt, df_ic, on="timestamp", how="outer", suffixes=("_vt", "_ic"))
merged["close"] = merged[["close_norm_vt", "close_norm_ic"]].mean(axis=1)  # average midpoints
merged.ffill(limit=2, inplace=True)  # fill short gaps; drop if both missing
merged.dropna(subset=["close"], inplace=True)
PM2 multi-instance ecosystem config:
// ecosystem.config.js
module.exports = {
  apps: [
    { name: "novosky-roboforex",  script: "trading.py", args: "--account exness_main", interpreter: ".venv/bin/python" },
    { name: "novosky-ic",  script: "trading.py", args: "--account icmarkets_main", interpreter: ".venv/bin/python" },
    { name: "ws-server",   script: "trading/ws_server.py", interpreter: ".venv/bin/python" },
  ]
};
Tasks:
  • Create config/accounts.json with schema above; add RoboForex entry only for now
  • Add --account <account_id> flag to trading.py argument parser; load account config at startup; override risk_pct, max_lot, state_file, and API base URL from account config
  • Update trading/bot.py: replace hardcoded state.json reference with config["state_file"]; replace API_BASE_URL with terminals[account["terminal_key"]]["url"]
  • Update ml/train.py: add --brokers exness,icmarkets flag; when multiple brokers specified, fetch, normalize, and merge their OHLCV datasets; retrain on merged set
  • Unit test multi-broker merge: assert merged DataFrame has no duplicate timestamps, close prices within 0.5% of each broker's midpoint
  • Update ecosystem.config.js with multi-process template (disabled by default; uncomment when IC Markets account is provisioned)
  • config/terminals.json port + domain mapping already implemented: RoboForex on port 6542 / terminal-rf1.novosky.app
  • Draft Hyperliquid adapter spec in docs/architecture/hyperliquid.mdx: REST endpoints for order placement (POST /exchange), position query (POST /info with type: clearinghouseState), and funding rate feed (/info with type: fundingHistory)
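The `--account` task in the list above (load the account entry at startup, then override risk_pct, max_lot, state_file, and the API base URL) could look roughly like this. It is a sketch under assumptions: `parse_account_id`, `resolve_runtime_config`, and the `api_base_url` key are hypothetical names, and terminals.json is assumed to map each terminal_key to an object with a `url` field, as the bot.py task implies.

```python
import argparse


def parse_account_id(argv=None) -> str:
    """Read the --account flag; argv=None falls back to sys.argv."""
    parser = argparse.ArgumentParser(description="trading bot launcher (sketch)")
    parser.add_argument("--account", required=True,
                        help="key under 'accounts' in config/accounts.json")
    return parser.parse_args(argv).account


def resolve_runtime_config(base_config: dict, accounts_doc: dict,
                           terminals: dict, account_id: str) -> dict:
    """Overlay one account's settings on top of the shared config."""
    accounts = accounts_doc["accounts"]
    if account_id not in accounts:
        raise KeyError(f"unknown account id {account_id!r}; known: {sorted(accounts)}")
    account = accounts[account_id]
    resolved = dict(base_config)  # shallow copy so the shared config is not mutated
    for key in ("risk_pct", "max_lot", "state_file"):
        resolved[key] = account[key]
    # Hypothetical key name for the per-terminal API endpoint.
    resolved["api_base_url"] = terminals[account["terminal_key"]]["url"]
    return resolved
```

Keeping the overlay in one pure function means each PM2 process can be started with a different `--account` value while sharing the same config.json, and the merge itself can be unit tested without a running terminal.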

Phase 21 β€” Dynamic SL/TP & position model upgrades

These have the second-highest near-term ROI because the core code is already built. Items 21.1 and 21.2 require no retraining.

21.1 Enable and validate ml_sltp (confidence-scaled TP/SL at entry)

Complete: 2026-04-22
The Dynamic SL/TP regression model (LightGBM) is fully orchestrated in trading/bot.py. It dynamically predicts the ideal SL/TP ATR multipliers for each trade based on market volatility and regime, overriding the static fallback multipliers. It leverages Walk-Forward validation (TimeSeriesSplit) to ensure the predictions generalize well to out-of-sample regimes.
At order placement, ml_sltp re-scales SL and TP using the signal model's entry confidence:
conf_norm = clamp((confidence − threshold) / (1.0 − threshold), 0, 1)
effective_SL = base_sl_atr_mult × ATR × (1 − conf_norm × confidence_sl_adjust)
effective_TP = base_tp_atr_mult × ATR × (1 + conf_norm × confidence_tp_adjust)

| Confidence | SL | TP | RR |
| --- | --- | --- | --- |
| 60% (threshold) | 1.00 × ATR | 1.50 × ATR | 1.50 |
| 80% | 0.85 × ATR | 1.75 × ATR | 2.06 |
| 100% | 0.70 × ATR | 2.25 × ATR | 3.21 |

Without ml_sltp: SL = 1.0 × ATR, TP = 0.8 × ATR (static 1:0.8 RR, currently live).
Config location: config.json → ml_active_management.ml_sltp
  • Enable: set ml_sltp.enabled = true in config.json (already enabled)
  • OOS backtest result: WR=48.8%, PF=1.72, MaxDD=9.0%, Score=0.26 (37d OOS, post-retrain)
  • Score degraded vs disabled baseline (Score=0.30, MaxDD=7.5%); swept confidence_sl_adjust × confidence_tp_adjust (20 combos via --mode ml_sltp)
  • Best enabled combo sladj0.4_tpadj0.5 still lost on Score (0.29) and MaxDD (9.1%), so it was disabled per todo rule. Re-evaluate after more live trades improve confidence signal quality.
  • min_tp floor check already present in the ml_sltp path (bot.py:4104–4105)
  • Removed dead base_sl_atr_mult / base_tp_atr_mult config keys: the SLTP regression model supplies the base multipliers, not static config
atr_tp_mult: 0.8 and atr_sl_mult: 0.8 in ml_config.json → labeling are training-time label parameters only: they determine HOLD/EXIT/ADD ground truth during training. They do not control live execution. Live TP/SL is config.json → dynamic_sltp (and optionally ml_sltp).
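The confidence-scaling formula from this section can be written out as a small function. This is a sketch, not the bot's implementation: `scale_sltp` is a hypothetical name, and the default multipliers and adjust values are assumptions chosen only so that the 60% and 100% confidence cases give 1.00/1.50 and 0.70/2.25 × ATR. In the live path the base multipliers come from the SLTP regression model rather than constants.

```python
def clamp(x: float, lo: float, hi: float) -> float:
    """Limit x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def scale_sltp(confidence: float, atr: float, threshold: float = 0.60,
               base_sl_atr_mult: float = 1.0, base_tp_atr_mult: float = 1.5,
               confidence_sl_adjust: float = 0.3,
               confidence_tp_adjust: float = 0.5):
    """Return (effective_SL, effective_TP) as price distances.

    All default values are illustrative assumptions.
    """
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must be in [0, 1)")
    # 0 at the entry threshold, 1 at full confidence.
    conf_norm = clamp((confidence - threshold) / (1.0 - threshold), 0.0, 1.0)
    effective_sl = base_sl_atr_mult * atr * (1.0 - conf_norm * confidence_sl_adjust)
    effective_tp = base_tp_atr_mult * atr * (1.0 + conf_norm * confidence_tp_adjust)
    return effective_sl, effective_tp
```

Higher confidence tightens the stop and widens the target at the same time, so the reward-to-risk ratio grows faster than either distance changes on its own.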

21.2 Trailing stop + lower min_bars_held

Both are fully implemented in the bot. Both are config-only changes. No retrain required.
Trailing stop (bot.py:2928–2960): ATR trail width scales with model confidence and live momentum_decay. High confidence → tight trail; low confidence / adverse momentum → wider trail. Currently disabled.
  • Enable: set ml_trailing_stop.enabled = true, base_trail_atr_mult = 1.2, min_profit_atr = 0.8
  • Run OOS backtest and compare Score vs baseline
  • OOS result: WR=46.6%, PF=1.45, MaxDD=12.7%, Score≈0.19, worse than baseline (Score=0.30, MaxDD=7.5%). Conflicts with partial_close breakeven SL: the trailing stop cuts winners after the BE move. Disabled. Saved params in config.json._note.
  • Do not enable simultaneously with ml_sltp testing; change one variable at a time
min_bars_held reduction: Currently 4 bars (60 min minimum hold). At 2 bars the position model can act after 30 minutes instead of 60.
  • Set position_optimization.min_bars_held = 2 in config.json
  • Run python scripts/sweep.py --target pos (pre-retrain): Score 0.30→0.34, best was exit0.60/bars2. Applied temporarily.
  • Post clean-retrain sweep (2026-04-26): optimal config reverted to exit_threshold=0.80, min_prob_diff=0.25, min_bars_held=4 (Score=0.66, PF=2.69, DD=6.4%). After --no-warmstart, model EXIT signals are more reliable at the higher threshold. Applied to config.json + ml_config.json.

21.3 M1 intra-candle feature augmentation for position model

The feature cache (_latest_features_cache) is M15-derived. All 59 market features stay stale within a 15-minute candle. Adding 5 M1 scalars gives the position model intra-candle microstructure. This requires a position model retrain (breaking scaler change). Proposed M1 features:
| Feature | Calculation | Signal |
| --- | --- | --- |
| m1_price_accel_5 | (close[0] − close[5]) / close[5] on M1 | Intra-candle momentum |
| m1_vol_ratio_5 | mean(volume[0:5]) / mean(volume[5:10]) | Volume surge |
| m1_rsi_9 | Wilder RSI(9) on M1 close | Fast momentum state |
| m1_atr_3 | ATR(3) normalized by close | Intra-candle volatility |
| m1_body_pct | abs(close − open) / (high − low + ε) | Candle conviction |

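The five proposed M1 scalars can be sketched in plain Python as below. Several choices here are assumptions rather than the project's implementation: inputs are oldest-first lists (so index 0 in the formulas above maps to the last element), ATR(3) is a simple mean of the last three true ranges, RSI is left on a 0–100 scale, and the neutral fallback values for short windows are illustrative.

```python
from statistics import fmean

EPS = 1e-9
# Assumed neutral values returned when the window is too short for a feature.
NEUTRAL = {"m1_price_accel_5": 0.0, "m1_vol_ratio_5": 1.0,
           "m1_rsi_9": 50.0, "m1_atr_3": 0.0, "m1_body_pct": 0.0}


def _wilder_rsi(closes, period=9):
    """Wilder-smoothed RSI on oldest-first closes; 50.0 when there is too little data."""
    if len(closes) < period + 1:
        return 50.0
    deltas = [b - a for a, b in zip(closes, closes[1:])]
    gains = [max(d, 0.0) for d in deltas]
    losses = [max(-d, 0.0) for d in deltas]
    avg_gain, avg_loss = fmean(gains[:period]), fmean(losses[:period])
    for g, l in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + g) / period
        avg_loss = (avg_loss * (period - 1) + l) / period
    if avg_loss == 0.0:
        return 100.0 if avg_gain > 0.0 else 50.0
    return 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)


def compute_m1_fast_features(closes, opens, highs, lows, volumes):
    """Return the 5 M1 scalars for the most recent bar of oldest-first lists."""
    feats = dict(NEUTRAL)
    n = len(closes)
    if n == 0:
        return feats
    if n >= 6 and closes[-6] != 0:
        feats["m1_price_accel_5"] = (closes[-1] - closes[-6]) / closes[-6]
    if len(volumes) >= 10:
        older = fmean(volumes[-10:-5])
        if older > 0:
            feats["m1_vol_ratio_5"] = fmean(volumes[-5:]) / older
    feats["m1_rsi_9"] = _wilder_rsi(closes, 9)
    if n >= 4 and closes[-1] != 0:
        # True range needs the previous close, so 3 ranges require 4 bars.
        true_ranges = [max(highs[i] - lows[i],
                           abs(highs[i] - closes[i - 1]),
                           abs(lows[i] - closes[i - 1]))
                       for i in range(n - 3, n)]
        feats["m1_atr_3"] = fmean(true_ranges) / closes[-1]
    feats["m1_body_pct"] = abs(closes[-1] - opens[-1]) / (highs[-1] - lows[-1] + EPS)
    return feats
```

Each feature degrades independently when history is short, so a window of only a few bars still yields a complete feature dict instead of raising.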
Implementation (requires retrain):
  • Add compute_m1_fast_features(closes, opens, highs, lows, volumes) to ml/feature_engineering.py: standalone function, 5 scalars, handles short windows gracefully
  • Update ml/position_labeling.py: add M1_FEATURES constant; generate_position_samples() accepts df_m1=None; at each sample timestamp, aligns 25 M1 bars via np.searchsorted (no lookahead leak: uses bars at or before M15 timestamp)
  • New constant M1_FEATURES = ['m1_price_accel_5', 'm1_vol_ratio_5', 'm1_rsi_9', 'm1_atr_3', 'm1_body_pct'], separate from POSITION_STATE_FEATURES (keeps 4); total position features 63→68
  • ml/position_predictor.py: detects M1 from position_metadata.json (n_features >= 68); get_position_action() gains m1_features=None param; neutral values substituted if model expects M1 but none provided (backward-compatible)
  • ml/position_trainer.py: generate_training_data(df_m1=None) passes the M1 DataFrame to labeling; updates all_feature_names and metadata when M1 features are included
  • ml/train.py: Step 4 fetches M1 bars (same date range as M15, from API); passes df_m1 to pos_trainer.generate_training_data(); patches model_compat.json with M1 feature list
  • trading/bot.py: in check_ml_active_management(), fetches 30 M1 bars once per cycle (only when pos_predictor._has_m1_features=True); passes computed features to get_position_action()
  • Retrain attempted: M1 API returned empty. The MT5 HTTP server must be running during --refresh to populate the M1 cache. Position model retrained on 63 features (same as before). No regression in val accuracy (XGB=0.753, LGB=0.743).
  • M1 cache solution implemented: datasets/training_data_btcusd_m1.csv (same pattern as M15 CSV). --refresh with API running fetches and saves it; subsequent trains load from cache automatically. Path configurable via ml_config.json paths.m1_training_data.
  • OOS backtest post-retrain (2026-04-26): WR=57.5%, PF=2.86, MaxDD=7.5%, Score=0.60, double the baseline score (was 0.30). SL hits: 24 (was 66–151). ML exits: 89/233 (38%). Re-sweep after retrain identified better config (0.80/0.25/bars4 → Score=0.66); applied.
  • M1 retrain (when API running): python train_ml_model.py --ensemble --position --refresh --no-warmstart
    • --refresh fetches live M1 bars from MT5 API and saves to datasets/training_data_btcusd_m1.csv
    • --no-warmstart forces a clean fit (warmstart would load old 63-feature weights into 68-feature model)
    • Confirm MT5 HTTP server is running before starting: curl http://localhost:6542/health
  • OOS sweep after M1 retrain: python scripts/sweep.py --target pos sweeps the exit_threshold × min_prob_diff × min_bars_held grid; apply best config to config.json
  • Gate: current best is PF=2.86 (Score=0.60), so PF ≥ 4.0 is aspirational, not immediate:
    • Accept retrain if: OOS PF ≥ current PF (2.86) AND EXIT precision ≥ 85% AND MaxDD ≤ current + 2%
    • Aspirational target: PF ≥ 4.0 once M1 cache is fully populated (more M1 bars = better features)
    • If retrain regresses (PF < 2.86): revert via ml/hf_hub.py --pull and investigate which M1 feature adds noise
Do this after 21.2 no-retrain items are validated and live. You want a clean baseline before adding features.

21.4 Catastrophic SL + position model as primary exit

Wide hard SL as safety net only. No hard TP. Position model drives all exits. Stays open longer β€” exits on regime deterioration rather than a fixed ATR multiple. Preconditions (all must be met before enabling):
Precondition 1 originally referenced ml_sltp (Phase 21.1), but ml_sltp was disabled in OOS testing (Score degraded). That precondition is now replaced with a position model precision gate only.
  1. Phase 21.1 (ml_sltp) validated OOS: removed (ml_sltp is disabled; position-as-primary-exit does not depend on confidence-scaled TP/SL)
  2. Position model EXIT precision ≥ 87% on OOS (current: 85%); this is the primary readiness gate
  3. ≥ 200 live dry-run trades with ml_active_management.enabled = true confirming EXIT fires at correct rate (target: ML_EXIT ≥ 60% of closes)
  4. Telemetry already in place: each close logged as SL_HIT, TP_HIT, or ML_EXIT in ml_performance.csv ✅ (implemented)
Implementation:
"ml_active_management": {
  "position_as_primary_exit": {
    "enabled": false,
    "catastrophic_sl_atr_mult": 3.0,
    "emergency_tp_atr_mult": 5.0
  }
}
  • position_as_primary_exit config block added to config.json (enabled: false)
  • bot.py wired: if enabled, overrides effective_sl = ATR × catastrophic_sl_atr_mult, effective_tp = ATR × emergency_tp_atr_mult, logs [PRIMARY_EXIT_MODE]
  • SL_HIT / TP_HIT / ML_EXIT telemetry added to ml_performance.csv via _close_type() helper
OOS test methodology for wide-SL:
# Step 1: Measure wide-SL alone (TP still on) to understand the RR change
python backtest.py --oos-only --no-chart \
    --override "dynamic_sltp.sl_atr_multiplier=3.0"

# Step 2: Full mode, wide SL + no TP (position model drives exits)
python backtest.py --oos-only --no-chart \
    --override "position_as_primary_exit.enabled=true" \
    --override "dynamic_sltp.sl_atr_multiplier=3.0" \
    --override "dynamic_sltp.tp_atr_multiplier=99.0"  # effectively no TP

# Accept only if: WR ≥ 55%, MaxDD ≤ 20%, ML_EXIT rate ≥ 60% in simulation
Expected behavior when enabled: Average trade duration increases from ~2h to ~6–12h. MaxDD can spike during gap events (weekend crypto gaps). The 3.0×ATR hard SL is the last line of defense.
  • Run Step 1 OOS (wide-SL alone): log result to logs/wide_sl_test.log; accept if Score ≥ current − 2.0 (some score loss from wider SL is expected)
  • Run Step 2 OOS (position-as-primary): accept only if all gate conditions above are met
  • Dry-run 100 trades: tail the logs with pm2 logs novosky --lines 200 --nostream after each close; count ML_EXIT vs SL_HIT vs TP_HIT; confirm ML_EXIT ≥ 60%
  • Only enable on live after all preconditions are met; announce in Telegram: [PRIMARY_EXIT_MODE] Enabled, SL=3.0×ATR, no TP
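Counting close types for the ML_EXIT ≥ 60% check can be automated instead of read off the logs by hand. The sketch below parses CSV text and computes each close type's share; the column name `close_type` is an assumption about ml_performance.csv, and both function names are hypothetical.

```python
import csv
import io
from collections import Counter


def close_type_shares(csv_text: str, column: str = "close_type") -> dict:
    """Map each close type (SL_HIT, TP_HIT, ML_EXIT) to its share of all closes."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[column] for row in reader if row.get(column))
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()} if total else {}


def ml_exit_gate(shares: dict, min_share: float = 0.60) -> bool:
    """True when ML_EXIT accounts for at least min_share of closes."""
    return shares.get("ML_EXIT", 0.0) >= min_share
```

In practice the text would come from reading the telemetry file, for example `close_type_shares(open("ml_performance.csv").read())`, with the path adjusted to wherever the bot writes it.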

21.5 Risk model scope β€” architectural constraint

The risk model outputs a scalar multiplier [0.10, 1.25] applied to base_risk_pct. Its 7 features are equity-state scalars only. It must not control SL/TP distances.
| Concern | Correct mechanism |
| --- | --- |
| Lot size based on equity health | Risk model → effective_risk_pct |
| TP/SL based on signal confidence | ml_sltp (Phase 21.1) |
| Intra-trade exit timing | Position model (Phase 21.3/21.4) |
| Protective trailing SL | ml_trailing_stop (Phase 21.2) |

  • Add a Scope: section to ml/risk_predictor.py module docstring stating the model outputs lot-sizing multipliers only
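The scope constraint amounts to this: the risk model's output only ever scales the risk percentage, and only within [0.10, 1.25]. A minimal sketch, with `effective_risk_pct` as a function and the constant names being illustrative rather than the identifiers used in ml/risk_predictor.py:

```python
RISK_MULT_MIN = 0.10
RISK_MULT_MAX = 1.25


def effective_risk_pct(base_risk_pct: float, raw_multiplier: float) -> float:
    """Scale base risk by the model's multiplier, clamped to the allowed range.

    Nothing here touches SL or TP distances; those belong to other components.
    """
    multiplier = min(RISK_MULT_MAX, max(RISK_MULT_MIN, raw_multiplier))
    return base_risk_pct * multiplier
```

Clamping at the call site means an out-of-range prediction can at most raise a 2.0% base risk to 2.5% or cut it to 0.2%.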

Phase 17 β€” Feature engineering

Add new market signals before the next major retrain. Implement in ml/feature_engineering.py. Each feature requires a full retrain + OOS validation before going live.
Add features one group at a time. Retrain after each group. Check that OOS Score does not degrade. Adding too many features at once makes it impossible to identify what hurt or helped.

17.1 On-chain & derivatives features

These have direct theoretical backing for BTCUSD direction: funding rate and open interest (OI) are the primary sentiment signals used by professional crypto traders.
Dependency: pip install requests pandas
funding_rate (formula and normalization): Binance perpetual funding settles every 8 h. The rate represents the cost of holding long vs short (positive = longs pay shorts → bearish pressure; negative = shorts pay longs → bullish pressure).
# ml/data_sources/binance.py
import requests, pandas as pd

def fetch_funding_rate(symbol="BTCUSDT", limit=500) -> pd.DataFrame:
    url = "https://fapi.binance.com/fapi/v1/fundingRate"
    r = requests.get(url, params={"symbol": symbol, "limit": limit}, timeout=10)
    df = pd.DataFrame(r.json())
    df["timestamp"] = pd.to_datetime(df["fundingTime"], unit="ms", utc=True)
    df["funding_rate"] = df["fundingRate"].astype(float)
    # Normalize: clamp to [-0.003, 0.003] (99th percentile range), then scale to [-1, 1]
    CLIP = 0.003
    df["funding_rate_norm"] = df["funding_rate"].clip(-CLIP, CLIP) / CLIP
    return df[["timestamp", "funding_rate_norm"]].set_index("timestamp")

# Forward-fill to M15 grid:
def merge_funding_to_m15(df_m15: pd.DataFrame, df_funding: pd.DataFrame) -> pd.DataFrame:
    df_m15 = df_m15.join(df_funding, how="left")
    df_m15["funding_rate_norm"] = df_m15["funding_rate_norm"].ffill().fillna(0.0)
    return df_m15
oi_change β€” rolling delta formula:
def fetch_open_interest(symbol="BTCUSDT") -> pd.DataFrame:
    url = "https://fapi.binance.com/futures/data/openInterestHist"
    r = requests.get(url, params={"symbol": symbol, "period": "15m", "limit": 500}, timeout=10)
    df = pd.DataFrame(r.json())
    df["timestamp"] = pd.to_datetime(df["timestamp"], unit="ms", utc=True)
    df["oi"] = df["sumOpenInterest"].astype(float)
    # 4h rolling pct change = 16 × 15min bars
    df["oi_change"] = df["oi"].pct_change(16).clip(-1, 1).fillna(0.0)
    return df[["timestamp", "oi_change"]].set_index("timestamp")
fear_greed_index (low-information warning): this feature yields 96 identical M15 values per day → SHAP will be near zero unless daily pivots correlate with session opens. Include on a trial basis; drop if SHAP < 0.001 after retrain.
def fetch_fear_greed() -> pd.DataFrame:
    r = requests.get("https://api.alternative.me/fng/?limit=365", timeout=10)
    df = pd.DataFrame(r.json()["data"])
    df["timestamp"] = pd.to_datetime(df["timestamp"].astype(int), unit="s", utc=True)
    df["fear_greed"] = df["value"].astype(float) / 100.0  # normalize 0–1
    return df[["timestamp", "fear_greed"]].set_index("timestamp")
Integration tasks:
  • Create ml/data_sources/binance.py with fetch_funding_rate, fetch_open_interest, fetch_fear_greed; each returns a UTC-indexed DataFrame
  • Add to ml/train.py data loading block: after df_m15 is built, call all three fetchers and merge via left-join + ffill (same pattern as M1 features)
  • Add feature names to model_compat.json["features"] in this order (append at end): funding_rate_norm, oi_change, fear_greed
  • Run python scripts/retrain.py --ensemble --no-warmstart with the 3 new features; run ml/shap_analysis.py; remove fear_greed from model_compat.json if its mean absolute SHAP < 0.001
  • Cache fetched data to datasets/funding_rate.csv, datasets/oi_change.csv, datasets/fear_greed.csv (same pattern as training_data_btcusd_m1.csv)
  • OOS gate: ensemble with on-chain features must have Score ≥ current + 0.5; if not, revert. On-chain features add API call latency, so they must pay for themselves
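The trial-basis rule for fear_greed (drop it when its mean absolute SHAP falls below 0.001) can be expressed as a small pruning helper. This is a sketch: `prune_low_shap` is a hypothetical name, and `mean_abs_shap` is assumed to be a plain feature-name-to-value mapping produced by the SHAP analysis step.

```python
def prune_low_shap(feature_names, mean_abs_shap: dict,
                   candidates=("fear_greed",), threshold: float = 0.001):
    """Return feature_names without trial features whose mean |SHAP| is below threshold.

    Only names listed in candidates can be dropped; a candidate with no SHAP
    entry is treated as 0.0 and therefore removed.
    """
    drop = {name for name in candidates
            if mean_abs_shap.get(name, 0.0) < threshold}
    return [name for name in feature_names if name not in drop]
```

Restricting removal to an explicit candidate list keeps established features safe from an accidental drop when a single retrain happens to give them a low SHAP value.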

17.2 Market regime features

These are derived entirely from existing OHLCV data, with no external API dependencies.
  • volatility_regime: Compute ATR(14) percentile rank over a rolling 500-bar window; encode as continuous 0–1 (not bucketed) to avoid artificial boundaries
  • w1_ema_bias: Resample M15 to W1 (504 bars); compute (close − EMA(10)) / close; forward-fill
  • w1_rsi_norm: RSI(14) on W1 bars, normalized to 0–1; forward-fill
Weekly features follow the same pattern as existing H4/D1 resampling in calculate_mtf_features(). Add them to the same function.
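The volatility_regime feature described above is a trailing percentile rank, which can be sketched without any dependencies as follows. The function name `rolling_percentile_rank` is hypothetical, the rank is defined here as the share of window values less than or equal to the current one (an assumption about tie handling), and NaN inputs are not handled.

```python
from collections import deque


def rolling_percentile_rank(values, window: int = 500):
    """Percentile rank of each value within its trailing window, in (0, 1].

    Only the current bar and earlier bars enter each window, so the feature
    has no lookahead. Early bars are ranked against the shorter history available.
    """
    ranks = []
    buf = deque(maxlen=window)  # oldest values fall out automatically
    for v in values:
        buf.append(v)
        ranks.append(sum(1 for x in buf if x <= v) / len(buf))
    return ranks


print(rolling_percentile_rank([4.0, 3.0, 2.0, 1.0], window=4))
# [1.0, 0.5, 0.3333333333333333, 0.25]
```

Passing the ATR(14) series as `values` gives a continuous regime score: readings near 1.0 mean current volatility is high relative to the last 500 bars, readings near 0 mean it is unusually quiet.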

17.3 OHLCV data redundancy pipeline

Reverted. ml/data_sources.py and the multi-source consensus-averaging pipeline were removed in commit e8fd0da. Root cause: averaging OHLCV across Exness and RoboForex at the same timestamps produced artificial prices that didn't match the live Exness feed, causing a training-live distribution mismatch that degraded fold_3 WF performance (PF=1.34). Training now uses a single source (API_URL in .env) with symbol auto-detection. TRAINING_SOURCE_* and TRAINING_YFINANCE env vars are no longer read by any code. Multi-source redundancy may be revisited in a future phase with a primary-source-wins merge strategy (no averaging) rather than consensus blending.

17.4 Smart Money Concepts (SMC) features

Add institutional order-flow structure as ML features using the smartmoneyconcepts library. SMC theory models how large players move price toward liquidity, making it a natural complement to the existing momentum and volatility features for BTC/USD. Dependency: already installed in examples/python/. Add to requirements.txt for the main project.
pip install smartmoneyconcepts
Lookahead bias (hard rule). The library returns forward-looking columns: ob["MitigatedIndex"], fvg["MitigatedIndex"], bos_choch["BrokenIndex"]. These reference future candles and must never be used as ML features. Only use formation-time data (the candle where the structure was detected). Violating this will silently inflate OOS metrics and cause live failure.
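One way to enforce the rule mechanically is a guard over column names that runs before any SMC output reaches the feature matrix. This is a sketch: the set below contains only the forward-looking columns named above, and `formation_columns` and `assert_no_lookahead` are hypothetical helpers, not part of the smartmoneyconcepts library.

```python
# Forward-looking column names listed in the rule above.
FORWARD_LOOKING_COLUMNS = frozenset({"MitigatedIndex", "BrokenIndex"})


def formation_columns(columns):
    """Keep only formation-time columns, preserving their order."""
    return [c for c in columns if c not in FORWARD_LOOKING_COLUMNS]


def assert_no_lookahead(columns) -> None:
    """Raise if any forward-looking column is about to be used as a feature."""
    leaked = sorted(set(columns) & FORWARD_LOOKING_COLUMNS)
    if leaked:
        raise ValueError(f"forward-looking columns must not be features: {leaked}")
```

Calling `assert_no_lookahead(list(frame.columns))` on every intermediate frame makes a violation fail loudly at training time instead of showing up later as inflated OOS metrics.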

Features to add

All features are computed from M15 OHLCV data only, with no external API dependency.
Order Block (OB) features: zones where institutional orders are resting.

| Feature | Formula | Rationale |
| --- | --- | --- |
| ob_bullish_dist | (close − nearest_bullish_ob_top) / atr_14 | ATR-normalised distance to nearest demand zone below |
| ob_bearish_dist | (nearest_bearish_ob_bottom − close) / atr_14 | ATR-normalised distance to nearest supply zone above |
| ob_bullish_present | 1 if any unmitigated bullish OB within 3×ATR else 0 | Binary: price is near a demand zone |
| ob_bearish_present | 1 if any unmitigated bearish OB within 3×ATR else 0 | Binary: price is near a supply zone |
| ob_volume_ratio | ob_volume / rolling_mean_volume(50) | Strength of the most recent OB (high volume = stronger zone) |

Fair Value Gap (FVG) features: price imbalances that act as magnets.

| Feature | Formula | Rationale |
| --- | --- | --- |
| fvg_bull_above | 1 if unmitigated bullish FVG above current close else 0 | Unfilled imbalance pulling price up |
| fvg_bear_below | 1 if unmitigated bearish FVG below current close else 0 | Unfilled imbalance pulling price down |
| fvg_bull_dist | (nearest_bull_fvg_bottom − close) / atr_14 | Normalised distance to nearest bullish FVG |
| fvg_bear_dist | (close − nearest_bear_fvg_top) / atr_14 | Normalised distance to nearest bearish FVG |

Structural features (BOS / CHoCH): trend continuation vs reversal context.

| Feature | Formula | Rationale |
| --- | --- | --- |
| recent_bos | 1 if a BOS occurred in the last 8 bars else 0 | Trend continuation bias (momentum context) |
| recent_choch | 1 if a CHoCH occurred in the last 8 bars else 0 | Reversal bias (structural flip context) |
| structure_bias | +1 BOS, −1 CHoCH, 0 neither (last 16 bars) | Single signed feature combining both signals |

Liquidity features: stop-hunt zones where institutional orders trigger.

| Feature | Formula | Rationale |
| --- | --- | --- |
| liq_above_dist | (nearest_liq_above − close) / atr_14 | Distance to the nearest pool of buy-side stops |
| liq_below_dist | (close − nearest_liq_below) / atr_14 | Distance to the nearest pool of sell-side stops |

Implementation

Add add_smc_features(df) to ml/feature_engineering.py. The function must only use df[:i] at each row, with no forward lookahead. Use swing_length=10 as the default (matches existing indicators.py usage).
# ml/feature_engineering.py

def add_smc_features(df: pd.DataFrame, swing_length: int = 10) -> pd.DataFrame:
    """Add SMC-derived features. No forward-looking columns used."""
    from smartmoneyconcepts import smc

    swing_hl = smc.swing_highs_lows(df, swing_length=swing_length)

    # Order Blocks: formation data only, strip MitigatedIndex
    ob = smc.ob(df, swing_hl)[["OB", "Top", "Bottom", "OBVolume"]]

    # Fair Value Gaps: formation data only, strip MitigatedIndex
    fvg = smc.fvg(df)[["FVG", "Top", "Bottom"]]

    # BOS / CHoCH: formation data only, strip BrokenIndex
    bos_choch = smc.bos_choch(df, swing_hl)[["BOS", "CHOCH", "Level"]]

    # Liquidity levels: formation data only
    liq = smc.liquidity(df, swing_hl)[["Liquidity", "Level"]]

    # Prefer the precomputed ATR; otherwise fall back to the bar range (high - low) as a rough proxy
    atr = df["atr_14"] if "atr_14" in df.columns else df["high"] - df["low"]

    close = df["close"]

    # --- OB features ---
    bull_ob_mask = ob["OB"] == 1
    bear_ob_mask = ob["OB"] == -1

    df["ob_bullish_present"] = 0
    df["ob_bearish_present"] = 0
    df["ob_bullish_dist"] = float("nan")
    df["ob_bearish_dist"] = float("nan")
    df["ob_volume_ratio"] = float("nan")

    vol_mean = df["tick_volume"].rolling(50, min_periods=1).mean()

    for i in range(len(df)):
        c = close.iloc[i]
        a = atr.iloc[i]
        if pd.isna(a) or a == 0:
            continue

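        # Slice by bar position first, then filter, so OBs formed at or after bar i are excluded
        past_bull = ob.iloc[:i][bull_ob_mask.iloc[:i]]
        past_bear = ob.iloc[:i][bear_ob_mask.iloc[:i]]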

        if not past_bull.empty:
            dists = (c - past_bull["Top"]) / a
            near = dists[(dists >= -3) & (dists <= 3)]
            if not near.empty:
                idx_min = near.abs().idxmin()
                df.at[df.index[i], "ob_bullish_present"] = 1
                df.at[df.index[i], "ob_bullish_dist"] = float(near[idx_min])
                df.at[df.index[i], "ob_volume_ratio"] = (
                    ob.at[idx_min, "OBVolume"] / vol_mean.iloc[i]
                    if vol_mean.iloc[i] > 0 else 0.0
                )

        if not past_bear.empty:
            dists = (past_bear["Bottom"] - c) / a
            near = dists[(dists >= -3) & (dists <= 3)]
            if not near.empty:
                idx_min = near.abs().idxmin()
                df.at[df.index[i], "ob_bearish_present"] = 1
                df.at[df.index[i], "ob_bearish_dist"] = float(near[idx_min])

    # --- FVG features ---
    bull_fvg = fvg[fvg["FVG"] == 1]
    bear_fvg = fvg[fvg["FVG"] == -1]

    df["fvg_bull_above"] = 0
    df["fvg_bear_below"] = 0
    df["fvg_bull_dist"] = float("nan")
    df["fvg_bear_dist"] = float("nan")

    for i in range(len(df)):
        c = close.iloc[i]
        a = atr.iloc[i]
        if pd.isna(a) or a == 0:
            continue
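        # Slice by bar position before filtering so only gaps formed before bar i are visible
        past_fvg = fvg.iloc[:i]
        past_bull_fvg = past_fvg[past_fvg["FVG"] == 1]
        past_bear_fvg = past_fvg[past_fvg["FVG"] == -1]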
        if not past_bull_fvg.empty:
            above = past_bull_fvg[past_bull_fvg["Bottom"] > c]
            if not above.empty:
                nearest = (above["Bottom"] - c).idxmin()
                df.at[df.index[i], "fvg_bull_above"] = 1
                df.at[df.index[i], "fvg_bull_dist"] = (above.at[nearest, "Bottom"] - c) / a
        if not past_bear_fvg.empty:
            below = past_bear_fvg[past_bear_fvg["Top"] < c]
            if not below.empty:
                nearest = (c - below["Top"]).idxmin()
                df.at[df.index[i], "fvg_bear_below"] = 1
                df.at[df.index[i], "fvg_bear_dist"] = (c - below.at[nearest, "Top"]) / a

    # --- BOS / CHoCH features ---
    # Convert to 0/1 before rolling: rolling().apply() returns NaN for all-NaN windows,
    # which would make the astype(int) below raise.
    df["recent_bos"] = (
        bos_choch["BOS"].notna().astype(int).rolling(8, min_periods=1).max().astype(int)
    )
    df["recent_choch"] = (
        bos_choch["CHOCH"].notna().astype(int).rolling(8, min_periods=1).max().astype(int)
    )
    df["structure_bias"] = df["recent_bos"] - df["recent_choch"]

    # --- Liquidity features ---
    liq_levels = liq[liq["Liquidity"].notna()]["Level"]

    df["liq_above_dist"] = float("nan")
    df["liq_below_dist"] = float("nan")

    for i in range(len(df)):
        c = close.iloc[i]
        a = atr.iloc[i]
        if pd.isna(a) or a == 0:
            continue
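        # Slice by bar position before filtering so only levels formed before bar i are visible
        past = liq.iloc[:i]
        past_liq = past.loc[past["Liquidity"].notna(), "Level"]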
        if past_liq.empty:
            continue
        above = past_liq[past_liq > c]
        below = past_liq[past_liq < c]
        if not above.empty:
            df.at[df.index[i], "liq_above_dist"] = (above.min() - c) / a
        if not below.empty:
            df.at[df.index[i], "liq_below_dist"] = (c - below.max()) / a

    df[["ob_bullish_dist", "ob_bearish_dist", "ob_volume_ratio",
        "fvg_bull_dist", "fvg_bear_dist",
        "liq_above_dist", "liq_below_dist"]] = (
        df[["ob_bullish_dist", "ob_bearish_dist", "ob_volume_ratio",
            "fvg_bull_dist", "fvg_bear_dist",
            "liq_above_dist", "liq_below_dist"]]
        .clip(-10, 10)
        .fillna(0.0)
    )

    return df
The row-by-row loop is O(n²) and will be slow on the full training dataset (300k+ bars). Vectorise using pd.merge_asof or precompute a rolling lookup table once the feature set is validated. Optimise only after SHAP confirms the features are useful.
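One way to remove the quadratic rescan without introducing lookahead is to keep the levels seen so far in a sorted list and binary-search it once per bar. The sketch below is illustrative only: `nearest_past_levels` and its plain-list inputs are hypothetical names, not part of ml/feature_engineering.py, and it covers just the liquidity-style "nearest level above / below" lookup.

```python
from bisect import bisect_left, bisect_right, insort


def nearest_past_levels(closes, levels):
    """For each bar i, return (nearest level above close, nearest level below close),
    using only levels formed on bars strictly before i.

    levels[i] is the level formed on bar i, or None if nothing formed there.
    """
    seen = []   # sorted levels formed on earlier bars
    out = []
    for close, level in zip(closes, levels):
        hi = bisect_right(seen, close)   # index of first level strictly above close
        lo = bisect_left(seen, close)    # number of levels strictly below close
        above = seen[hi] if hi < len(seen) else None
        below = seen[lo - 1] if lo > 0 else None
        out.append((above, below))
        if level is not None:
            insort(seen, level)          # visible from the next bar onward
    return out


result = nearest_past_levels(
    closes=[100, 101, 99, 103],
    levels=[102, None, 98, None],
)
```

Each bar costs one binary search plus one sorted insertion instead of a rescan of all history. Dividing the distances by ATR and clipping would stay as in the loop version.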

Feature names to add to model_compat.json

Append in this order (after existing features, before any on-chain features from 17.1):
"ob_bullish_present", "ob_bearish_present",
"ob_bullish_dist", "ob_bearish_dist", "ob_volume_ratio",
"fvg_bull_above", "fvg_bear_below",
"fvg_bull_dist", "fvg_bear_dist",
"recent_bos", "recent_choch", "structure_bias",
"liq_above_dist", "liq_below_dist"
That is 14 new features, taking the ensemble from 62 → 76 features.

Integration tasks

  • Add smartmoneyconcepts to requirements.txt
  • Implement add_smc_features(df) in ml/feature_engineering.py; no MitigatedIndex / BrokenIndex columns may appear in the feature matrix
  • Add the 14 feature names to model_compat.json["features"] (append at end)
  • Run python scripts/retrain.py --ensemble --no-warmstart with all 14 features
  • Run ml/shap_analysis.py β€” inspect beeswarm plot; drop any feature with mean absolute SHAP < 0.001 after retrain
  • If more than 4 features are pruned by SHAP, split into two groups (OB+FVG first, BOS+liquidity second) and add one group at a time
  • OOS gate: SMC ensemble Score must be ≥ current Score + 0.5 to ship; if not, revert model_compat.json and remove the features
  • Performance note: validate that add_smc_features completes in < 60s on the full training dataset; if slower, vectorise before merging to main
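The SHAP pruning rule and the OOS gate from the task list can be written down as two small checks. This is a sketch under stated assumptions: `plan_smc_rollout` and `passes_oos_gate` are hypothetical helpers, and the SHAP values in the example are made up for illustration, not measured.

```python
def plan_smc_rollout(mean_abs_shap, threshold=0.001, max_pruned=4):
    """Split features into kept / pruned by mean |SHAP| and flag a staged rollout.

    mean_abs_shap maps feature name -> mean absolute SHAP value after retrain.
    """
    pruned = sorted(f for f, v in mean_abs_shap.items() if v < threshold)
    kept = sorted(f for f, v in mean_abs_shap.items() if v >= threshold)
    # More than max_pruned drops: add OB+FVG first, BOS+liquidity second
    staged = len(pruned) > max_pruned
    return {"kept": kept, "pruned": pruned, "staged_rollout": staged}


def passes_oos_gate(new_score, current_score, margin=0.5):
    """Ship only if the SMC ensemble beats the current Score by the margin."""
    return new_score >= current_score + margin


# Illustrative values only
plan = plan_smc_rollout(
    {"ob_bullish_dist": 0.004, "fvg_bull_above": 0.002, "liq_above_dist": 0.0004}
)
```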

Expected impact

  • OB / FVG features: highest potential. Price returning to an OB or being drawn toward an FVG is a well-documented BTC pattern. Should show non-zero SHAP, especially on ob_bullish_dist and fvg_bull_above.
  • BOS / CHoCH features: structural regime context. structure_bias directly encodes trend vs reversal, complementing existing h4_ema_bias and d1_trend.
  • Liquidity features: encode stop-hunt dynamics. Less certain; BTC liquidity grabs are real but noisier at M15 than on higher timeframes.

Phase 18 - Model architecture

18.1 Stacked meta-learner

Replace majority-vote with a learned combiner. The meta-learner trains on the base models' out-of-fold class probabilities and learns optimal weights per class.
Data leakage risk: Using StratifiedKFold on time-series data leaks future bars into past training folds. Always use TimeSeriesSplit, which enforces chronological ordering. This is a hard rule with no exceptions.
This is the highest overfit risk item in the entire roadmap. The meta-layer sees the validation set probability distributions and can memorize them. Strict OOS gate is mandatory before deploying.
OOF stacking algorithm (time-series safe):
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import TimeSeriesSplit  # NOT StratifiedKFold

def generate_oof_stacks(models: list, X: np.ndarray, y: np.ndarray,
                        n_splits: int = 5) -> tuple[np.ndarray, np.ndarray]:
    """
    Returns X_meta (N_oof, K*3), y_meta (N_oof,).
    Uses walk-forward folds: each fold trains on [0..t], predicts [t..t+step].
    First fold's training samples are discarded (no OOF predictions for them).
    """
    tscv = TimeSeriesSplit(n_splits=n_splits)
    K = len(models)         # number of base models (3 currently, 5+ in Phase 22)
    X_meta = np.zeros((len(X), K * 3))
    mask = np.zeros(len(X), dtype=bool)

    for train_idx, val_idx in tscv.split(X):
        for k, model in enumerate(models):
            m_clone = clone(model).fit(X[train_idx], y[train_idx])
            probs = m_clone.predict_proba(X[val_idx])  # (n_val, 3)
            X_meta[val_idx, k*3:(k+1)*3] = probs
        mask[val_idx] = True

    return X_meta[mask], y[mask]   # drop rows with no OOF prediction (first fold train set)
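To make the walk-forward property concrete, here is a hand-rolled expanding-window splitter in the spirit of TimeSeriesSplit. It is a dependency-free sketch, not the sklearn implementation, and `walk_forward_splits` is a hypothetical name. It shows the two facts the stacking code relies on: every training index precedes every validation index, and the earliest block never receives an out-of-fold prediction.

```python
def walk_forward_splits(n_samples, n_splits=5):
    """Yield (train_idx, val_idx) pairs with an expanding training window."""
    val_size = n_samples // (n_splits + 1)
    first_val = n_samples - n_splits * val_size
    for start in range(first_val, n_samples, val_size):
        yield list(range(0, start)), list(range(start, start + val_size))


folds = list(walk_forward_splits(n_samples=12, n_splits=5))

# Rows that receive an out-of-fold prediction (the mask in generate_oof_stacks)
covered = sorted(i for _, val in folds for i in val)
```

With 12 samples and 5 splits, rows 0 and 1 are only ever used for training, which is why the stacking function drops them from X_meta.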
Meta-learner choices (ordered by overfit risk, lowest first):
| Option | Model | Overfit risk | Use when |
|---|---|---|---|
| A | LogisticRegression(C=0.1, max_iter=1000) | Very low | ≤ 3 base models |
| B | LGBMClassifier(max_depth=3, n_estimators=100) | Low | 4-6 base models |
| C | MLPClassifier(hidden_layer_sizes=(32,), max_iter=500) | Medium | Avoid for now |

Start with Option A. Graduate to Option B only when Phase 22 adds 2+ neural models.

Calibration of meta-learner output:
import numpy as np
from sklearn.isotonic import IsotonicRegression

# After training meta-learner, calibrate on a held-out fold (never on test set)
meta_probs_cal = meta_model.predict_proba(X_cal_meta)  # cal = held-out fold
calibrators = []
for c in range(3):
    ir = IsotonicRegression(out_of_bounds="clip")
    ir.fit(meta_probs_cal[:, c], (y_cal == c).astype(float))
    calibrators.append(ir)

def calibrate(raw_probs: np.ndarray) -> np.ndarray:
    cal = np.column_stack([calibrators[c].transform(raw_probs[:, c]) for c in range(3)])
    # Renormalize rows; the clip guards against a row where every calibrator returned 0
    cal /= np.clip(cal.sum(axis=1, keepdims=True), 1e-12, None)
    return cal
Tasks:
  • Implement generate_oof_stacks in ml/meta_learner.py using TimeSeriesSplit(n_splits=5), not StratifiedKFold
  • Start with LogisticRegression(C=0.1) meta-learner (Option A); switch to LGB after Phase 22 adds neural models
  • Add --meta flag to scripts/retrain.py that triggers OOF generation + meta-learner training after base model training
  • Apply isotonic calibration on a chronological held-out fold (last 10% of training data); save calibrators to models/meta_calibrators.pkl
  • OOS gate, recalibrated targets (baseline when these targets were set, Phase 17.2: WR=67.6%, PF=2.83, Score=35.13; the active Phase 17.3 model scores 54.47):
    • Keep if OOS Score ≥ current Score + 1.0
    • Keep if OOS WR ≥ 60% (not 75%: that was unreachable; WR > 75% would indicate overfit, not improvement)
    • Keep if OOS MaxDD ≤ current MaxDD + 2%
    • If gate fails: revert to majority-vote, log result to logs/meta_learner_eval.log, do not re-attempt until base models are retrained
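The gate above can be expressed as a single predicate. The sketch assumes all three "Keep if" conditions must hold together, which the list does not state explicitly; `OosMetrics` and `meta_learner_passes` are hypothetical names. The baseline figures are the Phase 17.2 row of the OOS table, and the candidate figures are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class OosMetrics:
    score: float
    win_rate: float   # fraction, 0..1
    max_dd: float     # fraction, 0..1


def meta_learner_passes(new, current, score_margin=1.0, min_wr=0.60, dd_slack=0.02):
    """True only if Score, win rate and drawdown all clear the gate."""
    return (
        new.score >= current.score + score_margin
        and new.win_rate >= min_wr
        and new.max_dd <= current.max_dd + dd_slack
    )


baseline = OosMetrics(score=35.13, win_rate=0.676, max_dd=0.073)
candidate_ok = OosMetrics(score=36.5, win_rate=0.62, max_dd=0.09)     # illustrative
candidate_weak = OosMetrics(score=36.0, win_rate=0.70, max_dd=0.05)   # illustrative
```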

18.2 Calibrated probability outputs

The signal model probabilities are not inherently calibrated. In a calibrated model, a 70% confidence prediction is right ~70% of the time. Calibration improves confidence-based downstream decisions like ml_sltp and Kelly sizing.
  • calibrate_models() added to ml/ensemble_trainer.py: applies _IsotonicCalibratedClassifier (sklearn-version-agnostic wrapper) to each base model after training; saves *_calibrated.pkl alongside raw pkl
  • _load_model() in ml/ensemble_predictor.py updated: calibrated pkl takes priority over ONNX (calibration cannot be applied to ONNX sessions), ONNX is second, raw pkl is fallback
  • Verified calibration on val set via ml/calibration_check.py: all three models PASS (RF MAE=0.012, XGB MAE=0.018, LGB MAE=0.014, ensemble MAE=0.030, well within the 0.05 threshold)
  • Completed before enabling ml_sltp (Phase 21.1); calibrated probabilities are ready

18.3 Regime-switching model

Train two separate signal model variants: one on trending data (ADX > 0.25) and one on ranging data (ADX ≤ 0.25). At inference, the active ADX regime routes to the correct model.
This doubles the number of models to maintain. Only implement if walk-forward OOS shows that the single model performs significantly worse in one regime. Run the diagnostic below before building anything.
Prerequisite diagnostic (measure single-model regime performance):
# Run this before deciding to build regime-switching models
import pandas as pd

df_oos = pd.read_csv("datasets/oos_signals.csv")  # generated by backtest.py

# Split by regime
trending = df_oos[df_oos["adx_14"] > 0.25]
ranging  = df_oos[df_oos["adx_14"] <= 0.25]

# Check performance gap: if both regimes have WR > 55%, single model is fine
for name, subset in [("trending", trending), ("ranging", ranging)]:
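    wr = (subset["pnl"] > 0).mean()
    gross_loss = abs(subset.loc[subset["pnl"] < 0, "pnl"].sum())
    # A segment with no losing trades would otherwise divide by zero
    pf = subset.loc[subset["pnl"] > 0, "pnl"].sum() / gross_loss if gross_loss else float("inf")
    print(f"{name}: n={len(subset)}, WR={wr:.1%}, PF={pf:.2f}")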
Only proceed if (trending WR < 55% or ranging WR < 55%) and each segment has ≥ 20,000 training samples.

Hysteresis rule for regime transitions (prevents model-flip thrashing): ADX fluctuates around the threshold. Without hysteresis, the model can switch dozens of times in a session. Apply a buffer zone:
from collections import deque

# Requires 3 consecutive bars in the new regime before switching model
TRENDING_THRESHOLD = 0.28   # Enter trending above this
RANGING_THRESHOLD  = 0.22   # Enter ranging below this (gap = hysteresis band)

def get_active_model(adx_history: deque, current_model: str) -> str:
    recent = list(adx_history)[-3:]   # last 3 bars
    if len(recent) < 3:               # not enough history yet: do not switch
        return current_model
    if all(a > TRENDING_THRESHOLD for a in recent):  return "trending"
    if all(a < RANGING_THRESHOLD  for a in recent):  return "ranging"
    return current_model   # stay in current regime (hysteresis)
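A short simulation shows what the band buys. The sketch repeats the routing function so that it runs on its own, and compares it with a naive single-threshold switch on a made-up ADX series that hovers around 0.25 before trending. `count_switches` and `count_naive_switches` are hypothetical helpers and the series is illustrative, not market data.

```python
from collections import deque

TRENDING_THRESHOLD = 0.28
RANGING_THRESHOLD = 0.22


def get_active_model(adx_history, current_model):
    recent = list(adx_history)[-3:]
    if len(recent) < 3:
        return current_model
    if all(a > TRENDING_THRESHOLD for a in recent):
        return "trending"
    if all(a < RANGING_THRESHOLD for a in recent):
        return "ranging"
    return current_model


def count_switches(adx_series, start_model="ranging"):
    """Route each bar through the hysteresis rule and count model changes."""
    history = deque(maxlen=3)
    model, switches = start_model, 0
    for adx in adx_series:
        history.append(adx)
        new_model = get_active_model(history, model)
        if new_model != model:
            model, switches = new_model, switches + 1
    return model, switches


def count_naive_switches(adx_series, threshold=0.25, start_model="ranging"):
    """Single-threshold routing with no confirmation window, for comparison."""
    model, switches = start_model, 0
    for adx in adx_series:
        new_model = "trending" if adx > threshold else "ranging"
        if new_model != model:
            model, switches = new_model, switches + 1
    return model, switches


adx = [0.24, 0.26, 0.24, 0.26, 0.24, 0.26, 0.29, 0.30, 0.31]
with_band = count_switches(adx)
without_band = count_naive_switches(adx)
```

Both routes end in the trending model, but the naive rule flips five times on the way while the hysteresis rule switches once.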
Gate, recalibrated (Score context: single-model baseline = 35.13):
  • Trending-regime Score ≥ 17.0 (35.13 / 2 ≈ 17.6 is the theoretical minimum for a model taking half the trades; 17.0 is a conservative floor given that trending is the easier regime)
  • Ranging-regime Score ≥ 10.0 (ranging is harder; acceptable if at least net-positive)
  • Combined (blended) Score ≥ current 35.13 + 2.0 (must beat single model or not worth the complexity)
Tasks:
  • Run regime diagnostic above on the latest OOS dataset before doing anything else; log results to logs/regime_diagnostic.txt
  • If diagnostic shows no regime gap (both WR > 55%): skip this phase, mark as deferred
  • If gap found: segment datasets/training_data_btcusd.csv by adx_14; verify ≥ 20k rows per segment
  • Train two ensembles: python scripts/retrain.py --ensemble --regime trending and --regime ranging; each saves to models/signal_trending/ and models/signal_ranging/
  • Add RegimeSwitchPredictor to ml/ensemble_predictor.py: maintains adx_history = deque(maxlen=3); calls get_active_model() before each inference; loads both model sets into memory at startup
  • OOS backtest: run backtest.py --regime-switch which routes each bar to the correct model; compare blended Score to single-model Score
  • Gate as above; if blended Score < current + 2.0: abandon regime switching and document

18.4 Multi-instrument expansion

Each instrument gets its own dedicated model stack, never shared with BTCUSD. Architecture per instrument:
models/
  BTCUSD/              # existing
    signal_rf.onnx
    signal_xgb.onnx
    signal_lgb.onnx
    position_rf.onnx
    ...
    ensemble_scaler.pkl
    model_compat.json
  XAUUSD/              # new
    signal_rf.onnx
    ...
    ensemble_scaler.pkl   # SEPARATE scaler: gold ATR is 100× smaller than BTC
    model_compat.json     # may differ in feature list (e.g. no dist_to_round_number for gold)

datasets/
  training_data_btcusd.csv
  training_data_xauusd.csv   # new, fetched from MT5 XAUUSD M15
  training_data_eurusd.csv   # new

ml_config.json       # BTCUSD defaults
ml_config_xauusd.json  # gold-specific overrides
ml_config_eurusd.json  # forex-specific overrides
XAUUSD labeling differences: ATR-aware label params must be rescaled. Gold ATR is ~$10-30 vs BTC ATR ~$500-2000. The existing SL=0.8×ATR, TP=1.2×ATR proportions are valid, but the min_atr filter (currently 15 for BTC) needs a per-symbol value:
// ml_config_xauusd.json
{
  "labeling": {
    "atr_sl_multiplier": 0.8,
    "atr_tp_multiplier": 1.2,
    "min_atr": 0.5,         // gold: $0.50 minimum ATR (not $15 like BTC)
    "lookahead_candles": 48  // same
  },
  "training": {
    "min_samples": 50000     // gold has fewer liquid M15 bars; lower threshold
  }
}
EURUSD labeling differences: Pip value for EURUSD standard lot = $10/pip. Convert P&L to USD in backtest.py:
# backtest.py β€” pip_value must be symbol-aware
PIP_VALUE = {
    "BTCUSD": 100.0,   # USC/lot/point for cent account
    "XAUUSD": 100.0,   # $1/lot/point × 100 for cent
    "EURUSD": 10.0,    # $10/pip/lot standard; 0.1/pip/lot for micro
}
Tasks:
  • Add --symbol flag to ml/train.py; when set, load ml_config_{symbol.lower()}.json instead of default; save models to models/{SYMBOL}/
  • Add --symbol flag to backtest.py; load correct scaler and model directory; use symbol-specific pip_value in P&L calculation
  • Create ml_config_xauusd.json with XAUUSD-specific min_atr, labeling params; keep all other params as BTC defaults initially
  • Fetch 2 years of XAUUSD M15 from MT5 API: python ml/train.py --symbol XAUUSD --refresh (saves to datasets/training_data_xauusd.csv)
  • Train XAUUSD model stack: python scripts/retrain.py --symbol XAUUSD --ensemble --position --no-warmstart
  • OOS gate for XAUUSD: PF > 2.0 on 60-day OOS window; MaxDD < 15%
  • EURUSD: defer until XAUUSD is live-validated; same pipeline applies
Never share model files or scalers across instruments. ensemble_scaler.pkl is fit on each symbol's feature distribution independently. Using the BTC scaler on gold data will produce garbage predictions.
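The per-instrument layout above implies one path-resolution step behind the --symbol flag. The helper below is a sketch of that step: `resolve_symbol_paths` is a hypothetical name, and mapping BTCUSD to the default ml_config.json is an assumption drawn from the directory listing rather than confirmed behaviour of ml/train.py.

```python
from pathlib import Path


def resolve_symbol_paths(symbol, root="."):
    """Return the config, model and dataset paths for one instrument.

    Every instrument gets its own model directory, so a scaler or
    model_compat.json can never be picked up from another symbol.
    """
    sym = symbol.upper()
    root = Path(root)
    # Assumption: BTCUSD uses the default config, other symbols use an override file
    config_name = "ml_config.json" if sym == "BTCUSD" else f"ml_config_{sym.lower()}.json"
    model_dir = root / "models" / sym
    return {
        "config": root / config_name,
        "model_dir": model_dir,
        "scaler": model_dir / "ensemble_scaler.pkl",
        "compat": model_dir / "model_compat.json",
        "dataset": root / "datasets" / f"training_data_{sym.lower()}.csv",
    }


gold = resolve_symbol_paths("xauusd")
btc = resolve_symbol_paths("BTCUSD")
```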

Phase 19 - Infrastructure & reliability

19.1 Live trade dashboard

Superseded by Phase 23 (JARVIS Dashboard): the Next.js WebSocket dashboard covers all use cases planned here with better UX. This Streamlit version is now a lightweight fallback for quick server-side monitoring without the full frontend stack.
Dependency: pip install "streamlit>=1.35" "supabase>=2.5" (quote the specifiers so the shell does not treat > as a redirect)
The Streamlit fallback is useful when SSH-ed into the trading VM and needing a quick equity snapshot without opening the browser dashboard.
  • Build scripts/dashboard_live.py:
    import os

    import pandas as pd
    import streamlit as st
    from supabase import create_client

    st.set_page_config(page_title="NOVOSKY Live", layout="wide", page_icon="📈")
    sb = create_client(os.getenv("SUPABASE_URL"), os.getenv("SUPABASE_KEY"))
    
    @st.cache_data(ttl=30)
    def get_equity():
        r = sb.table("account_snapshots").select("equity,created_at").order("created_at", desc=True).limit(500).execute()
        return pd.DataFrame(r.data)
    
    col1, col2 = st.columns([2, 1])
    with col1:
        st.line_chart(get_equity().set_index("created_at")["equity"])
    with col2:
        st.metric("Latest Equity", f"${get_equity().iloc[0]['equity']:.0f}")
    
  • Views: equity area chart (500 snapshots), open position P&L, last 20 signals table with confidence color-coding
  • Deploy via PM2 port 8501: streamlit run scripts/dashboard_live.py --server.port 8501 --server.headless true; expose under terminal-rf1.novosky.app/monitor via Caddy

19.2 API failover

Complete - 2026-04-25
  • _api_fail_count and _api_paused globals track consecutive MT5 API failures in trading/bot.py
  • After 3 consecutive failures: Telegram alert [API UNREACHABLE] fired, _api_paused = True blocks new entries
  • Resumes automatically when API responds; _api_fail_count cleared, _api_paused = False
  • Guard wraps the _get_rates() call in the main loop
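The failure-counting behaviour described above can be sketched as a small state machine. The version below uses a class instead of the module-level globals in trading/bot.py, purely so that it is self-contained; `ApiFailoverGuard` and its `notify` callback (a stand-in for the Telegram alert) are hypothetical.

```python
class ApiFailoverGuard:
    """Tracks consecutive API failures and pauses new entries after a threshold."""

    def __init__(self, max_failures=3, notify=print):
        self.max_failures = max_failures
        self.notify = notify        # stand-in for the Telegram alert call
        self.fail_count = 0
        self.paused = False

    def record(self, ok):
        """Call once per API attempt; returns True when new entries are allowed."""
        if ok:
            # API responded: clear the counter and resume automatically
            self.fail_count = 0
            self.paused = False
            return True
        self.fail_count += 1
        if self.fail_count >= self.max_failures and not self.paused:
            self.paused = True
            self.notify("[API UNREACHABLE]")   # fire the alert once per outage
        return not self.paused


alerts = []
guard = ApiFailoverGuard(notify=alerts.append)
allowed = [guard.record(ok) for ok in [False, False, False, False, True]]
```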

19.3 Graceful shutdown improvements

Complete - 2026-04-25
  • _shutdown_requested flag replaces sys.exit(0) in the SIGTERM handler
  • Main loop top: if flag is set, polls open positions and exits cleanly once count reaches 0
  • Entry gate blocks new trades while shutdown is pending
  • Logs [SHUTDOWN] with position count on clean exit
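A minimal sketch of the flag-based shutdown pattern follows. The flag name matches the one in the text, while `install_handler`, `run_loop` and the callbacks passed to it are hypothetical and stand in for the real main loop.

```python
import signal

_shutdown_requested = False


def _handle_sigterm(signum, frame):
    """Only set a flag; the main loop decides when it is safe to exit."""
    global _shutdown_requested
    _shutdown_requested = True


def install_handler():
    # Must be called from the main thread
    signal.signal(signal.SIGTERM, _handle_sigterm)


def run_loop(get_open_positions, step, max_cycles=1000):
    """Run trading cycles until shutdown is requested and no positions remain."""
    for _ in range(max_cycles):
        if _shutdown_requested:
            open_count = get_open_positions()
            if open_count == 0:
                return f"[SHUTDOWN] open positions: {open_count}"
        # The entry gate is closed while a shutdown is pending
        step(allow_entries=not _shutdown_requested)
    return "max_cycles reached"


# Simulate SIGTERM arriving while two positions are still open
_handle_sigterm(signal.SIGTERM, None)
remaining = iter([2, 1, 0])
entry_flags = []
message = run_loop(lambda: next(remaining), lambda allow_entries: entry_flags.append(allow_entries))
```

The loop keeps managing the open positions for two more cycles with entries blocked, then exits once the count reaches zero.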

19.4 Config hot-reload

Safe vs unsafe keys: not all config changes can be applied without restart:
| Key | Hot-reloadable? | Reason |
|---|---|---|
| risk_percent | ✅ Yes | Only affects next lot sizing call |
| max_daily_loss_pct | ✅ Yes | Guard checked every cycle |
| min_confidence | ✅ Yes | Filter applied at signal gate |
| adx_regime_filter.* | ✅ Yes | Checked at signal gate |
| circuit_breaker.* | ✅ Yes | State counter reset is safe |
| model_paths.* | ❌ No | Requires model reload → restart |
| symbol | ❌ No | Would orphan tracked positions |
| api_base_url | ❌ No | Active connections would break |
| kelly_lot_sizing.* | ⚠️ Careful | Only safe if no open position |
Implementation β€” mtime polling:
# trading/bot.py - add to main loop top
import json
import os

_config_mtime: float = 0.0
_SAFE_HOT_KEYS = {"risk_percent","max_daily_loss_pct","min_confidence","adx_regime_filter","circuit_breaker"}

def _maybe_reload_config() -> None:
    global _config_mtime, config
    current_mtime = os.path.getmtime("config.json")
    if current_mtime <= _config_mtime:
        return
    _config_mtime = current_mtime
    try:
        with open("config.json") as fh:
            new_cfg = json.load(fh)
    except json.JSONDecodeError:
        # File caught mid-write or malformed: keep the running config
        logger.warning("[CONFIG RELOAD] config.json is not valid JSON; keeping current values")
        return
    changed = {}
    for key in _SAFE_HOT_KEYS:
        # Skip keys absent from the new file instead of raising KeyError
        if key in new_cfg and new_cfg[key] != config.get(key):
            changed[key] = {"old": config.get(key), "new": new_cfg[key]}
            config[key] = new_cfg[key]
    if changed:
        logger.info(f"[CONFIG RELOAD] {changed}")
        _notify_telegram(f"⚙️ Config reloaded: {list(changed.keys())}")
Tasks:
  • Add _maybe_reload_config() call at top of main loop (before signal gate) and call it every cycle (M15 cadence means 15-min max lag is acceptable; no need for a background thread)
  • Define _SAFE_HOT_KEYS set as shown above; never iterate all config keys (would silently apply unsafe changes)
  • Log changed keys with old/new values (not just key names); this makes the audit trail useful
  • Telegram notification on reload: send list of changed keys so operator knows the change took effect
  • Unit test: write a temp config.json with modified risk_percent, call _maybe_reload_config(), assert config["risk_percent"] updated and _config_mtime advanced
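The unit test in the last task could look roughly like this. To stay self-contained, the sketch tests a hypothetical pure function, `reload_safe_keys`, that takes the path, the config dict and the last mtime as arguments instead of the module-level globals used in trading/bot.py; the test body carries over once it is adapted to `_maybe_reload_config()`.

```python
import json
import os
import tempfile

SAFE_HOT_KEYS = {"risk_percent", "max_daily_loss_pct", "min_confidence",
                 "adx_regime_filter", "circuit_breaker"}


def reload_safe_keys(path, config, last_mtime):
    """Apply changed safe keys from the file at path; return (changed, new_mtime)."""
    mtime = os.path.getmtime(path)
    if mtime <= last_mtime:
        return {}, last_mtime
    with open(path) as fh:
        new_cfg = json.load(fh)
    changed = {}
    for key in SAFE_HOT_KEYS:
        if key in new_cfg and new_cfg[key] != config.get(key):
            changed[key] = {"old": config.get(key), "new": new_cfg[key]}
            config[key] = new_cfg[key]
    return changed, mtime


def test_reload_applies_only_safe_keys():
    config = {"risk_percent": 2.0, "symbol": "BTCUSD"}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "config.json")
        with open(path, "w") as fh:
            json.dump({"risk_percent": 1.5, "symbol": "XAUUSD"}, fh)
        changed, mtime = reload_safe_keys(path, config, last_mtime=0.0)
    assert config["risk_percent"] == 1.5        # safe key updated
    assert config["symbol"] == "BTCUSD"         # unsafe key ignored
    assert mtime > 0.0                          # mtime advanced
    assert changed == {"risk_percent": {"old": 2.0, "new": 1.5}}
    return True
```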

Phase 22 - Advanced Ensemble Architecture (XGBoost · RF · FT-Transformer · TFT · LSTM)

The current RF + XGB + LGB majority-vote ensemble leaves accuracy on the table because all three base learners are tree ensembles: they share the same inductive bias and make correlated errors. Adding one neural-attention model and one sequence model provides genuine ensemble diversity (target pairwise disagreement rate 0.35-0.50), which lowers ensemble error independently of individual model quality. Ensemble error decomposition for an average of M models:
Ensemble Error = avg Bias² + (1/M) × avg Variance + (1 − 1/M) × avg Covariance(model_i, model_j)
Adding a model that errs on different samples (low covariance) reduces total error even if the new model is weaker individually.
Build one model at a time. Retrain after each addition. Gate on OOS Score β‰₯ current before keeping. Adding all five models at once makes root-cause analysis impossible.
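The effect of error correlation can be checked with a few lines of arithmetic. The sketch below assumes the simplest case: M models with equal error variance and equal pairwise correlation whose outputs are averaged. That is an idealisation of a majority vote, not a description of it. Both function names are hypothetical.

```python
def ensemble_variance(sigma2, rho, m):
    """Variance of the mean of m errors, each with variance sigma2 and pairwise correlation rho."""
    return sigma2 * (1.0 / m + (1.0 - 1.0 / m) * rho)


def disagreement_rate(preds_a, preds_b):
    """Fraction of samples on which two models predict different classes."""
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)


# Three highly correlated models barely improve on a single model...
correlated = ensemble_variance(sigma2=1.0, rho=0.9, m=3)
# ...while the same three models with weaker correlation cut variance almost in half
diverse = ensemble_variance(sigma2=1.0, rho=0.3, m=3)

rate = disagreement_rate([0, 1, 2, 1], [0, 2, 2, 0])
```

The pairwise disagreement rate is the quantity the 0.35-0.50 target refers to. It can be computed on OOS predictions for each pair of base models before a new model is admitted.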
Python dependencies to install before starting this phase:
pip install "torch>=2.1" torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
pip install "skl2onnx>=1.16" "onnxconverter-common>=1.13"
pip install "pytorch-forecasting>=1.0"  # TFT reference implementation
pip install "mapie>=0.8"               # Conformal prediction / quantile intervals
pip install "xgboost>=2.0"             # DART mode requires >=1.7; 2.x preferred

22.1 XGBoost - DART mode + monotonic constraints + Optuna search space

Algorithm (DART tree dropping): At each boosting round, instead of using all t trees built so far, DART randomly drops a subset D ⊆ {1, …, t} and trains the new tree to compensate for the removed ones:
ŷ_i = Σ_{k ∉ D} f_k(x_i)           # prediction without dropped trees
new tree f_{t+1} fits residuals of ŷ_i
after the round: f_{t+1} is scaled by 1/(|D|+1) and each dropped tree by |D|/(|D|+1), then all trees are summed
rate_drop = probability each tree is included in D. skip_drop = probability the entire drop is skipped for that round (pure GBM step).
Monotonic constraint math: For feature j with constraint c_j ∈ {-1, 0, +1}, XGBoost enforces:
if c_j = +1:  for all splits on feature j, left_child_value ≤ right_child_value
if c_j = -1:  left_child_value ≥ right_child_value
if c_j =  0:  unconstrained
Enforced during tree construction: candidate splits that would violate the ordering are rejected, and each child inherits a bound on its leaf values (the midpoint of the two child values) so that deeper splits cannot break monotonicity.
Interaction constraints: Define which feature groups may share a split path. XGBoost never places feature i and feature j on the same root-to-leaf path if they are in different groups:
"interaction_constraints": [
  [0, 1, 2, 3, 4],          // Group 0: momentum (RSI, MACD, rsi_slope, etc.)
  [5, 6, 7, 8, 9],          // Group 1: volatility (ATR, BB, ADX)
  [10, 11, 12, 13, 14, 15], // Group 2: structure (EMA, price_vs_ema, d1_trend)
  [16, 17, 18, 19, 20]      // Group 3: session/time (flags, sin/cos encodings)
]
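Because the constraint groups are lists of column indices, they have to be rebuilt whenever the feature order in model_compat.json changes. A small helper can derive them from feature names instead of hard-coding positions. This is a sketch: `build_interaction_constraints` is a hypothetical name, and the five-feature list and group membership in the example are illustrative, not the real 62-feature layout.

```python
def build_interaction_constraints(feature_names, groups):
    """Translate named feature groups into the index lists XGBoost expects.

    feature_names: features in model_compat.json order.
    groups: mapping of group name -> list of feature names in that group.
    """
    index_of = {name: i for i, name in enumerate(feature_names)}
    constraints = []
    for group_name, members in groups.items():
        missing = [m for m in members if m not in index_of]
        if missing:
            # Fail loudly rather than silently dropping a renamed feature
            raise KeyError(f"{group_name}: unknown features {missing}")
        constraints.append(sorted(index_of[m] for m in members))
    return constraints


# Illustrative subset only
features = ["rsi_slope", "atr_14", "d1_trend", "adx_14", "bb_width"]
groups = {
    "momentum": ["rsi_slope"],
    "volatility": ["atr_14", "bb_width", "adx_14"],
    "structure": ["d1_trend"],
}
constraints = build_interaction_constraints(features, groups)
```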
Tasks:
  • Switch booster in ml_config.json → xgb_params:
    "booster": "dart",
    "rate_drop": 0.10,
    "skip_drop": 0.50,
    "normalize_type": "tree",
    "learning_rate": 0.05,
    "n_estimators": 400,
    "max_delta_step": 1
    
  • In ml/train.py around the XGBClassifier constructor, build monotone_constraints tuple from model_compat.json["features"] order, mapping feature names to constraint values:
    MONOTONE_MAP = {
        "atr_14": 1, "adx_14": 1, "bb_width": 1,
        "volume_ratio": 1, "atr_percentile": 1,
    }  # all others default to 0
    constraints = tuple(MONOTONE_MAP.get(f, 0) for f in feature_names)
    
  • Build interaction_constraints list from 4 feature cluster groups; attach to XGB params before fit
  • Add DART-specific Optuna search space in ml/tune.py (new branch under if booster == "dart"):
    "rate_drop":  trial.suggest_float("rate_drop", 0.05, 0.30),
    "skip_drop":  trial.suggest_float("skip_drop", 0.30, 0.70),
    "normalize_type": trial.suggest_categorical("normalize_type", ["tree", "forest"]),
    "max_depth":  trial.suggest_int("max_depth", 4, 8),   # shallower than gbtree
    "subsample":  trial.suggest_float("subsample", 0.6, 0.9),
    "colsample_bytree": trial.suggest_float("colsample_bytree", 0.5, 0.8),
    
  • Early stopping is unreliable with DART (random dropout makes the validation metric noisy): use fixed n_estimators=400 and remove any early_stopping_rounds from the DART fit call in ml/train.py
  • Run python scripts/retrain.py --ensemble --no-warmstart to force a clean DART retrain
  • OOS gate: keep if Score ≥ current GBTREE baseline − 0.5 (accept slight score trade-off for lower variance)

22.2 Random Forest - ExtraTrees + Quantile intervals for Kelly

Algorithm (ExtraTrees split selection): Standard RF: at each node, evaluate max_features candidate features and pick the split minimizing Gini. ExtraTrees: pick a random threshold from the feature's observed range, with no exhaustive search:
For each candidate feature j:
  threshold_j ~ Uniform(min(X[:,j]), max(X[:,j]))  # random, not optimal
Pick j* = argmin Gini over random (j, threshold_j) pairs
Result: higher bias per tree, lower variance across trees, which is beneficial on noisy M15 features.
Algorithm (quantile intervals via leaf distributions): Standard RF predicts ŷ = (1/T) Σ_t leaf_mean_t(x). Quantile RF instead collects all training labels in each matched leaf across all T trees, forming an empirical distribution, and returns its percentiles:
S(x) = ∪_{t=1}^{T} { y_i : x_i ∈ leaf_t(x) }   # all leaf-matched labels
P_q(x) = q-th percentile of S(x)
Kelly fraction adjustment using interval width:
raw_kelly = p_win − p_loss / win_loss_ratio        # standard Kelly: f = p − q/b
interval_width = P_95(x) − P_05(x)                 # normalized 0-1
adjusted_kelly = raw_kelly × (1 − interval_width)  # tighter interval → larger position
effective_risk  = base_risk_pct × min(adjusted_kelly, max_kelly_fraction)
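A worked example makes the sizing arithmetic easier to check. The sketch uses the textbook Kelly form f = p − q/b, where b is the win/loss ratio. The floor at zero for negative-edge trades and the max_kelly_fraction default of 0.25 are assumptions added for illustration, not values taken from config.json.

```python
def kelly_fraction(p_win, win_loss_ratio):
    """Textbook Kelly: f = p - q / b, with q = 1 - p and b = average win / average loss."""
    return p_win - (1.0 - p_win) / win_loss_ratio


def effective_risk(p_win, win_loss_ratio, interval_width, base_risk_pct,
                   max_kelly_fraction=0.25):
    """Shrink the Kelly fraction by the quantile interval width, then cap it."""
    raw = kelly_fraction(p_win, win_loss_ratio)
    # Assumption: a negative edge means no position rather than a negative size
    adjusted = max(raw, 0.0) * (1.0 - interval_width)
    return base_risk_pct * min(adjusted, max_kelly_fraction)


# p_win = 0.60 with 1:1 TP/SL gives raw Kelly 0.20
raw = kelly_fraction(0.60, 1.0)
# A wide interval (0.5) halves it to 0.10, so a 2.0% base risk becomes 0.2%
wide = effective_risk(0.60, 1.0, interval_width=0.5, base_risk_pct=2.0)
# A tight interval (0.1) keeps most of it: 0.18 x 2.0% = 0.36%
tight = effective_risk(0.60, 1.0, interval_width=0.1, base_risk_pct=2.0)
```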
Tasks:
  • Add ExtraTreesClassifier to ml/ensemble_trainer.py, inserted after the RF definition:
    from sklearn.ensemble import ExtraTreesClassifier
    extra_trees = ExtraTreesClassifier(
        n_estimators=350,
        max_depth=None,          # full depth; randomness controls variance
        min_samples_split=8,
        max_features='sqrt',
        bootstrap=False,         # ExtraTrees convention
        class_weight='balanced',
        random_state=42,
        n_jobs=-1,
    )
    
  • Export ExtraTrees to ONNX using the same skl2onnx pipeline in ml/onnx_export.py; output shape [1, 3] float32 probabilities
  • Add "extra_trees" key to model_compat.json["models"] list; update ensemble_predictor.py load path
  • Create ml/quantile_predictor.py:
    • Class QuantileRFPredictor wraps a trained RandomForestClassifier
    • Method predict_interval(X_row) → iterates all tree leaves matched by X_row, pools their training labels, returns (p05, p50, p95) for each class
    • Inference is O(T × leaf_size); keep T ≤ 200 for <50 ms latency at M15 frequency
  • Wire into trading/bot.py Kelly sizing block (currently around line 1220): fetch interval_width from QuantileRFPredictor; apply adjusted Kelly formula above; log both raw and adjusted Kelly to ml_performance.csv
  • Add "quantile_rf": {"enabled": false, "p_low": 0.05, "p_high": 0.95} config block to config.json
  • Optuna search space additions in ml/tune.py for ExtraTrees:
    "et_n_estimators": trial.suggest_int("et_n_estimators", 200, 500),
    "et_min_samples_split": trial.suggest_int("et_min_samples_split", 4, 16),
    "et_max_features": trial.suggest_categorical("et_max_features", ["sqrt", "log2", 0.5]),
    
  • OOS sweep: compare 3-model majority-vote vs 4-model (RF+XGB+LGB+ExtraTrees) majority-vote; require Score ≥ current + 0.3

22.3 FT-Transformer (Feature Tokenizer + Transformer)

Architecture (Feature Tokenizer): Each of the 62 features is independently projected from scalar (batch, 1) → embedding vector (batch, d):
token_j = W_j × x_j + b_j + e_j      # W_j ∈ ℝ^d, b_j ∈ ℝ^d, e_j = feature-index embedding
A learnable [CLS] token is prepended, giving a 63-token sequence. Multi-head self-attention then computes pairwise interaction scores between every pair of feature tokens:
Attention(Q, K, V) = softmax( QK^T / √d_k ) V
Q = X × W_q,  K = X × W_k,  V = X × W_v      # X = token sequence; three separate learned projections
The [CLS] token aggregates cross-feature information; its output is fed to the classification head.
Why FT-Transformer > TabTransformer for NOVOSKY: TabTransformer applies attention only to the categorical features (9 of them). FT-Transformer applies attention to all 62, making it better suited since 50+ features are numerical time-series derivatives.
Data requirements:
Input dtype:  float32, shape (batch, 62), same StandardScaler as RF/XGB
Output dtype: float32, shape (batch, 3), raw logits (apply softmax at inference)
Training set: same X_train, y_train from ml/train.py
Min samples for stable attention: ~50k (you have ~135k ✓)
Tasks:
  • Create ml/models/ft_transformer.py as a pure torch.nn.Module:
    import torch
    import torch.nn as nn

    class FTTransformer(nn.Module):
        def __init__(self, n_features=62, d_model=64, n_heads=8, n_layers=6,
                     ffn_dim=256, dropout=0.1):
            super().__init__()  # must run before any submodule is assigned
            # Feature Tokenizer: one Linear(1 → d_model) per feature
            self.tokenizers = nn.ModuleList([nn.Linear(1, d_model) for _ in range(n_features)])
            # Feature-index positional embedding (not time-positional)
            self.feat_index_emb = nn.Embedding(n_features, d_model)
            # CLS token
            self.cls_token = nn.Parameter(torch.randn(1, 1, d_model))
            # Transformer encoder
            encoder_layer = nn.TransformerEncoderLayer(
                d_model, n_heads, ffn_dim, dropout, batch_first=True, norm_first=True
            )
            self.transformer = nn.TransformerEncoder(encoder_layer, n_layers)
            # Head: LayerNorm → Linear → 3 classes
            self.head = nn.Sequential(nn.LayerNorm(d_model), nn.Linear(d_model, 3))

        def forward(self, x):                                  # x: (batch, n_features)
            tokens = torch.stack(
                [tok(x[:, j:j + 1]) for j, tok in enumerate(self.tokenizers)], dim=1
            )                                                  # (batch, n_features, d_model)
            tokens = tokens + self.feat_index_emb.weight       # broadcast over the batch
            cls = self.cls_token.expand(x.size(0), -1, -1)
            h = self.transformer(torch.cat([cls, tokens], dim=1))
            return self.head(h[:, 0])                          # CLS output → logits
    
    Forward pass: tokenize each feature → add index embeddings → prepend CLS → transformer → CLS output → head
  • Create ml/trainers/ft_transformer_trainer.py:
    • Convert X_train (numpy float64) to torch.float32 tensor
    • class_weights = compute_class_weight('balanced', classes=np.unique(y_train), y=y_train) → torch.FloatTensor of shape (3,)
    • loss = F.cross_entropy(logits, y_batch, weight=class_weights) (the weight argument is per class, not per sample)
    • Optimizer: AdamW(lr=5e-4, weight_decay=1e-5, betas=(0.9, 0.999))
    • Scheduler: CosineAnnealingLR(T_max=150, eta_min=1e-6), decaying the learning rate smoothly over the 150 epochs
    • Batch size: 256; max epochs: 150; early stop patience: 15 on val cross-entropy
    • Save best checkpoint to models/ft_transformer.pt (state_dict only, not full model)
  • Add Optuna hyperparameter search for FT-Transformer in ml/tune.py:
    "ft_d_model":  trial.suggest_categorical("ft_d_model", [32, 64, 128]),
    "ft_n_heads":  trial.suggest_categorical("ft_n_heads", [4, 8]),
    "ft_n_layers": trial.suggest_int("ft_n_layers", 3, 8),
    "ft_ffn_dim":  trial.suggest_categorical("ft_ffn_dim", [128, 256, 512]),
    "ft_dropout":  trial.suggest_float("ft_dropout", 0.05, 0.30),
    "ft_lr":       trial.suggest_float("ft_lr", 1e-4, 1e-3, log=True),
    "ft_wd":       trial.suggest_float("ft_wd", 1e-6, 1e-3, log=True),
    
  • ONNX export (add to ml/onnx_export.py):
    dummy = torch.randn(1, 62)  # must match the model's n_features
    torch.onnx.export(
        model, dummy, "models/ft_transformer.onnx",
        input_names=["features"], output_names=["logits"],
        dynamic_axes={"features": {0: "batch"}, "logits": {0: "batch"}},
        opset_version=17,
    )
    
    Edge case: d_model must be divisible by n_heads; otherwise PyTorch's multi-head attention raises when the model is constructed, before export is reached. Validate d_model % n_heads == 0 when sampling hyperparameters.
  • Apply isotonic calibration via existing calibrate_models() in ml/ensemble_trainer.py by wrapping FT-Transformer inference in a sklearn-compatible predict_proba(X) adapter class
  • Add to ml/ensemble_predictor.py model loading, following the priority from 18.2 (calibration cannot be applied to ONNX sessions): check models/ft_transformer_calibrated.pkl → fall back to models/ft_transformer.onnx → fall back to models/ft_transformer.pt
  • OOS gate: FT-Transformer solo OOS WR ≥ 50%, solo PF ≥ 1.5; ensemble with FT-Transformer Score ≥ baseline + 0.5

22.4 Temporal Fusion Transformer (TFT) - sequence model

Architecture overview: TFT processes a sequence of T=48 M15 bars. Each bar carries n_dyn=54 time-varying features. Additionally, 5 static features (session flags, day-of-week sin/cos) are processed separately.
Static features (5) ──→ Static Covariate Encoder (GRN)
                              │
                              ├──→ context_h (init LSTM hidden state)
                              └──→ context_e (enrichment context)

Dynamic features (48 × 54) ──→ Variable Selection Network (VSN)
    │  VSN uses a GRN per feature + softmax over features → weighted sum
    │  Output: (batch, 48, d_model); only informative features survive
    │
    ├──→ LSTM Encoder (2-layer, hidden=128)
    │        Outputs: (batch, 48, 128) encoder states
    │
    └──→ LSTM Decoder (2-layer, hidden=128), initialized from static context_h
             Outputs: (batch, 48, 128) decoder states
             │
             ├──→ Gated Residual Network (GRN) with static context_e
             │
             └──→ Multi-Head Attention (num_heads=4, causal mask off for classification)
                      │
                      └──→ GRN → Layer Norm → (batch, 128)
                                       │
                                       └──→ Linear(128, 3) → BUY/SELL/HOLD logits
Gated Residual Network (GRN), the core building block:
GRN(x, c=None):
  η₂ = ELU( W₂ × [x; c] + b₂ )      # c = optional context vector
  η₁ = W₁ × η₂ + b₁
  gate = sigmoid( W_gate × [x; c] + b_gate )
  output = LayerNorm( gate ⊙ η₁ + (1−gate) ⊙ x )  # gated skip connection
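
The GRN equations above translate almost line for line into NumPy. The sketch below is for shape checking only and rests on assumptions: weights are passed in explicitly, LayerNorm has no learnable scale or shift, and the function names are illustrative rather than the planned ml/models/tft.py API.

```python
import numpy as np


def elu(z):
    # expm1 is only evaluated on the non-positive part to avoid overflow
    return np.where(z > 0, z, np.expm1(np.minimum(z, 0.0)))


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def layer_norm(z, eps=1e-5):
    mu = z.mean(axis=-1, keepdims=True)
    var = z.var(axis=-1, keepdims=True)
    return (z - mu) / np.sqrt(var + eps)


def grn(x, c, W2, b2, W1, b1, W_gate, b_gate):
    """x: (batch, d); c: (batch, d_c) or None. Returns (batch, d)."""
    xc = x if c is None else np.concatenate([x, c], axis=-1)   # [x; c]
    eta2 = elu(xc @ W2 + b2)
    eta1 = eta2 @ W1 + b1
    gate = sigmoid(xc @ W_gate + b_gate)
    # gated skip connection: gate -> 0 passes x through, gate -> 1 uses eta1
    return layer_norm(gate * eta1 + (1.0 - gate) * x)
```

With the gate driven toward 0 the block reduces to LayerNorm(x), which is a convenient sanity check for any PyTorch port.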
Variable Selection Network math:
VSN for timestep t with features x^(j)_t, j=1..54:
  ξ^(j)_t = GRN_j( x^(j)_t, static_context )   # per-feature processing
  v_t = softmax( W_vs × [ξ^(1)_t; ...; ξ^(54)_t] + b_vs )  # feature weights
  x̃_t = Σ_j v^(j)_t × ξ^(j)_t                              # weighted combination
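
A small NumPy sketch of the selection step, assuming the per-feature embeddings ξ^(j)_t have already been produced by the per-feature GRNs. The function name and the weight shapes are assumptions for illustration.

```python
import numpy as np


def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # stabilise before exponentiating
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def variable_selection(xi, W_vs, b_vs):
    """
    xi:   (batch, n_feat, d) per-feature embeddings for one timestep
    W_vs: (n_feat * d, n_feat), b_vs: (n_feat,)
    Returns (x_tilde, v) with shapes (batch, d) and (batch, n_feat).
    """
    batch, n_feat, d = xi.shape
    flat = xi.reshape(batch, n_feat * d)          # [xi^(1); ...; xi^(n_feat)]
    v = softmax(flat @ W_vs + b_vs)               # feature weights, rows sum to 1
    x_tilde = (v[:, :, None] * xi).sum(axis=1)    # sum_j v^(j) * xi^(j)
    return x_tilde, v
```

Because each row of v sums to 1, the values can be charted directly as relative feature weights without further scaling.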
The weights v_t are what we expose as "feature attention" in the dashboard.
Data pipeline (SequenceDataset):
# ml/data/sequence_dataset.py
import numpy as np
import torch
from torch.utils.data import Dataset


class SequenceDataset(Dataset):
    def __init__(self, X: np.ndarray, y: np.ndarray, seq_len: int = 48,
                 static_indices: list[int] | None = None):
        # X: (N, 59) float32, y: (N,) int64
        # static_indices: positions of the 5 static features in X
        self.X = torch.from_numpy(X.astype(np.float32))
        self.y = torch.from_numpy(y.astype(np.int64))
        self.seq_len = seq_len
        self.static_idx = static_indices or [16, 17, 18, 19, 20]  # session/time features
        self.dyn_idx = [i for i in range(X.shape[1]) if i not in self.static_idx]

    def __len__(self):
        # N rows give N - seq_len + 1 complete windows (0 if the series is too short)
        return max(0, len(self.X) - self.seq_len + 1)

    def __getitem__(self, idx):
        seq = self.X[idx : idx + self.seq_len]           # (48, 59)
        x_dyn = seq[:, self.dyn_idx]                     # (48, 54)
        x_static = self.X[idx + self.seq_len - 1, self.static_idx]  # (5,), taken from the current bar
        label = self.y[idx + self.seq_len - 1]           # label at bar t
        return x_dyn, x_static, label
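
The window-to-label alignment can be checked with plain NumPy before any PyTorch code is involved. The sketch below mirrors the slicing in `__getitem__` on synthetic data where the bar index is stored in a feature column; `window_at` and `n_windows` are hypothetical helper names.

```python
import numpy as np


def window_at(X, y, idx, seq_len):
    """Rows idx..idx+seq_len-1 and the label of the last row, as in __getitem__."""
    last = idx + seq_len - 1
    return X[idx : idx + seq_len], y[last]


def n_windows(n_rows, seq_len):
    """Number of complete windows that fit in a series of n_rows bars."""
    return max(0, n_rows - seq_len + 1)


# Known data: feature column 0 stores the bar index, and the label equals the bar index.
N, seq_len = 100, 48
X = np.zeros((N, 59), dtype=np.float32)
X[:, 0] = np.arange(N)
y = np.arange(N)

for idx in range(n_windows(N, seq_len)):
    window, label = window_at(X, y, idx, seq_len)
    assert window.shape == (seq_len, 59)
    assert window[:, 0].max() == label   # newest bar in the window is the labelled bar
```

The same pattern works as a pytest case against the real dataset class by comparing the largest bar index in `x_dyn` with the returned label.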
No future lookahead in the inputs: the window [idx … idx+seq_len-1] is paired with the label at idx+seq_len-1, the last bar of the window.
Training config:
optimizer = AdamW(model.parameters(), lr=1e-3, weight_decay=1e-4)
scheduler = ReduceLROnPlateau(optimizer, patience=10, factor=0.5, min_lr=1e-5)
# Gradient clipping is mandatory for the LSTM in TFT; call it after loss.backward() and before optimizer.step()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
# Batch: 64 (sequence data needs smaller batches for gradient stability)
# Loss: CrossEntropyLoss with class weights
Tasks:
  • Create ml/models/tft.py: implement GRN, VSN, TFT classes as described above; keep d_model=128, n_heads=4, seq_len=48, n_static=5, n_dynamic=54
  • Create ml/data/sequence_dataset.py with SequenceDataset as specified; unit test: assert no label from future bars is in the window (check __getitem__ with known data)
  • Create ml/trainers/tft_trainer.py:
    • Build DataLoader(SequenceDataset(...), batch_size=64, shuffle=False); do NOT shuffle (time-series ordering)
    • Training loop: forward → loss → backward → clip_grad → step; call scheduler.step(val_loss) once per epoch after validation
    • Save best model (val loss) to models/tft.pt; also save final-epoch attention weights tensor to models/tft_attention_cache.npy (shape (n_val, 48, 54)) for dashboard initialization
  • Add TFT Optuna search space in ml/tune.py:
    "tft_d_model":    trial.suggest_categorical("tft_d_model", [64, 128]),
    "tft_n_heads":    trial.suggest_categorical("tft_n_heads", [4, 8]),
    "tft_n_grn":      trial.suggest_int("tft_n_grn", 1, 3),   # GRN depth
    "tft_dropout":    trial.suggest_float("tft_dropout", 0.05, 0.25),
    "tft_seq_len":    trial.suggest_categorical("tft_seq_len", [24, 48, 96]),
    "tft_lr":         trial.suggest_float("tft_lr", 5e-4, 5e-3, log=True),
    
  • ONNX export: TFT has two inputs; export with input_names=["x_dynamic", "x_static"], shapes [batch, 48, 54] and [batch, 5]; opset 17; verify with onnxruntime.InferenceSession. If Optuna selects a tft_seq_len other than 48, the export shape and the inference buffer length must match it
  • In ml/ensemble_predictor.py: maintain a rolling buffer _seq_buffer: deque(maxlen=48) populated after each build_features() call; pass last 48 rows to TFT at inference
  • Edge case: if _seq_buffer has < 48 entries (bot just started), zero-pad the head and still run inference; treat the output as less reliable until the buffer fills
  • OOS gate: TFT solo WR ≥ 50%; ensemble Score ≥ current + 0.5
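
The rolling buffer and the zero-padded cold start described in the tasks above could look roughly like this. It is a sketch under assumptions: `padded_window` is a hypothetical helper, and each buffer entry is taken to be one 59-value feature row.

```python
from collections import deque

import numpy as np

SEQ_LEN, N_FEATURES = 48, 59


def padded_window(buffer, seq_len=SEQ_LEN, n_features=N_FEATURES):
    """Return a (seq_len, n_features) float32 array, zero-padded at the head if the buffer is short."""
    out = np.zeros((seq_len, n_features), dtype=np.float32)
    rows = list(buffer)[-seq_len:]
    if rows:
        # newest rows go at the tail so the last row is always the current bar
        out[seq_len - len(rows):] = np.asarray(rows, dtype=np.float32)
    return out


seq_buffer = deque(maxlen=SEQ_LEN)
for bar in range(3):                      # bot just started: only 3 bars seen
    seq_buffer.append(np.full(N_FEATURES, bar + 1.0))
window = padded_window(seq_buffer)        # rows 0..44 are zeros, rows 45..47 hold bars 1..3
```

Padding at the head keeps the most recent bar in the final position, which is where the label alignment in SequenceDataset expects it.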

22.5 LSTM with Bahdanau attention + TCN alternative

Bahdanau (additive) attention, full formula: given LSTM output sequence H = [h_1, …, h_T] and final hidden state h_T:
e_t = v^T × tanh( W_a × h_t + U_a × h_T )    # alignment score at step t
α_t = exp(e_t) / Σ_{t'} exp(e_{t'})           # softmax over T steps
c   = Σ_t α_t × h_t                            # context vector (weighted sum)
ŷ   = W_out × [c ; h_T] + b_out               # concat context + final hidden
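
The first three equations can be exercised in NumPy for a single sequence. This is an illustrative sketch, not the planned lstm_attention.py module: the function name is hypothetical and the matrices are passed in rather than learned.

```python
import numpy as np


def additive_attention(H, W_a, U_a, v):
    """
    H: (T, d) LSTM outputs; W_a, U_a: (d_a, d); v: (d_a,).
    Returns the context vector c of shape (d,) and the weights alpha of shape (T,).
    """
    h_T = H[-1]                                   # final hidden state acts as the query
    scores = np.tanh(H @ W_a.T + U_a @ h_T) @ v   # e_t for every step, shape (T,)
    scores = scores - scores.max()                # stabilise the softmax
    alpha = np.exp(scores) / np.exp(scores).sum() # weights over the T steps
    c = alpha @ H                                 # weighted sum of the h_t
    return c, alpha
```

The returned alpha is the per-bar weight vector of length T that a dashboard heatmap would consume.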
W_a ∈ ℝ^{d_a×d}, U_a ∈ ℝ^{d_a×d}, v ∈ ℝ^{d_a} are learned; d_a = 64 is the attention dimension.
TCN receptive field formula (use to choose dilation schedule): with n = n_layers dilated causal Conv1d layers, each with kernel_size=k and dilation d_i = 2^i:
Receptive field = 1 + Σ_{i=0}^{n-1} (k−1) × 2^i = 1 + (k−1) × (2^n − 1)
For k=5, n=4: RF = 1 + 4 × 15 = 61 bars  (covers 61 M15 bars = ~15h)
For k=5, n=5: RF = 1 + 4 × 31 = 125 bars (covers 125 M15 bars = ~31h)
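
The receptive-field arithmetic is easy to script, which helps when the kernel size and depth are tuned. A minimal sketch with hypothetical function names:

```python
def tcn_receptive_field(kernel_size: int, n_layers: int) -> int:
    """Receptive field of stacked causal convs with dilations 1, 2, 4, ..., 2^(n_layers-1)."""
    return 1 + sum((kernel_size - 1) * 2**i for i in range(n_layers))


def closed_form(kernel_size: int, n_layers: int) -> int:
    """Same quantity via the geometric-series closed form."""
    return 1 + (kernel_size - 1) * (2**n_layers - 1)


def min_layers_to_cover(kernel_size: int, seq_len: int) -> int:
    """Smallest n_layers whose receptive field is at least seq_len bars."""
    n = 1
    while tcn_receptive_field(kernel_size, n) < seq_len:
        n += 1
    return n
```

For a 48-bar input, `min_layers_to_cover` gives 5 layers for k=3 and 4 layers for k=5 or k=7, which could be used to reject tuning trials whose receptive field falls short of the window.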
Choose n=4 (RF=61) to cover a 12–15 h window at lower compute cost; this already spans the full 48-bar (12 h) input.
TCN architecture:
Input: (batch, seq_len=48, n_features=59)
Transpose: (batch, 59, 48)  (Conv1d expects (batch, channels, length))

Layer 0: Conv1d(59→128, k=5, dilation=1, padding=4, causal)   → (batch, 128, 48)
Layer 1: Conv1d(128→128, k=5, dilation=2, padding=8, causal)  → (batch, 128, 48)
Layer 2: Conv1d(128→128, k=5, dilation=4, padding=16, causal) → (batch, 128, 48)
Layer 3: Conv1d(128→128, k=5, dilation=8, padding=32, causal) → (batch, 128, 48)

Each layer: weight-normalized Conv1d → ReLU → Dropout(0.1) + residual projection if channels change

Take last time step: (batch, 128)
Linear(128, 64) → ReLU → Dropout(0.2) → Linear(64, 3)
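Causal padding formula: to ensure no future leakage, left-pad the input by (k−1) × dilation and build the Conv1d with padding=0; the output then has the same length as the input and nothing needs slicing. (If the Conv1d is instead built with symmetric padding of (k−1) × dilation, slice the last (k−1) × dilation outputs off the right.)
import torch.nn.functional as F

def causal_conv1d(x, conv, dilation):
    # conv must be created with padding=0 and the same dilation value
    padding = (conv.kernel_size[0] - 1) * dilation
    x = F.pad(x, (padding, 0))   # left-pad only
    return conv(x)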
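
The no-leakage property of left-only padding can be demonstrated without PyTorch. The sketch below implements a single-channel dilated causal convolution in NumPy (the function name and the 1-D setup are assumptions); perturbing future inputs should leave earlier outputs unchanged.

```python
import numpy as np


def causal_dilated_conv(x, w, dilation):
    """
    x: (length,) input series; w: (k,) kernel where w[-1] multiplies the current step.
    Output y[t] depends only on x[t], x[t-d], ..., x[t-(k-1)d].
    """
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])      # left-pad only
    y = np.zeros(len(x), dtype=float)
    for j in range(k):
        # tap j reads the input (k-1-j)*dilation steps in the past
        start = j * dilation
        y += w[j] * xp[start : start + len(x)]
    return y


rng = np.random.default_rng(0)
x = rng.normal(size=48)
w = rng.normal(size=5)
y_before = causal_dilated_conv(x, w, dilation=8)

x_future_changed = x.copy()
x_future_changed[30:] += 10.0                    # alter bars 30..47 only
y_after = causal_dilated_conv(x_future_changed, w, dilation=8)
```

Here `y_before[:30]` and `y_after[:30]` are identical while the later outputs differ, which is the behaviour a unit test for ml/models/tcn.py could assert.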
Tasks:
  • Create ml/models/lstm_attention.py:
    • Bidirectional LSTM: nn.LSTM(input_size=59, hidden_size=128, num_layers=2, bidirectional=True, dropout=0.2, batch_first=True); output dim = 256
    • Attention: implement Bahdanau equations above with da=64; W_a=Linear(256, 64), U_a=Linear(256, 64), v=Linear(64, 1, bias=False)
    • Context vector: c ∈ ℝ^{256}; concat with h_T[-1] (last step, bidirectional) → Linear(512, 3)
    • Inference mode switch: set self.training_mode flag; when False, run only forward direction of LSTM (unidirectional causal)
  • Create ml/models/tcn.py:
    • 4 CausalConv1d layers with dilations [1, 2, 4, 8], kernel_size=5, out_channels=128
    • WeightNorm on each Conv1d (improves training stability vs BatchNorm on small batches)
    • Residual connections: if in_channels ≠ out_channels, add Conv1d(in, out, 1) projection
    • Final layer: last timestep output → Linear(128, 3)
  • Train both on SequenceDataset(seq_len=48) using same tft_trainer.py loop (swap model); record OOS Score for each; keep whichever is higher (expectation to verify: TCN within 0.5% of the LSTM but about 4× faster)
  • Optuna search space for LSTM:
    "lstm_hidden":    trial.suggest_categorical("lstm_hidden", [64, 128, 256]),
    "lstm_layers":    trial.suggest_int("lstm_layers", 1, 3),
    "lstm_dropout":   trial.suggest_float("lstm_dropout", 0.1, 0.4),
    "attn_dim":       trial.suggest_categorical("attn_dim", [32, 64, 128]),
    "lstm_lr":        trial.suggest_float("lstm_lr", 5e-4, 5e-3, log=True),
    
  • Optuna search space for TCN:
    "tcn_channels":   trial.suggest_categorical("tcn_channels", [64, 128, 256]),
    "tcn_n_layers":   trial.suggest_int("tcn_n_layers", 3, 6),
    "tcn_kernel":     trial.suggest_categorical("tcn_kernel", [3, 5, 7]),
    "tcn_dropout":    trial.suggest_float("tcn_dropout", 0.05, 0.30),
    
  • ONNX export (LSTM): use torch.onnx.export with opset 17; set do_constant_folding=True; verify hidden state output is not exported (classification output only); test inference latency with onnxruntime on a single row; it must be < 20 ms on CPU
  • At inference in ensemble_predictor.py: feed same _seq_buffer used by TFT; if buffer < 48, pad with zeros (same strategy as TFT)
  • Attach alpha_weights (shape (48,)) to the return value of get_signal() for dashboard streaming

22.6 Stacked meta-learner + regime-adaptive weighting

Algorithm (walk-forward OOF stacking): to prevent data leakage (the meta-learner training on predictions that base models made for samples they were themselves fitted on), use time-series walk-forward folds:
Fold 1:  Train base models on months 1-6   → predict months 7-8   → collect OOF_1
Fold 2:  Train base models on months 1-8   → predict months 9-10  → collect OOF_2
Fold 3:  Train base models on months 1-10  → predict months 11-12 → collect OOF_3
...
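
The fold schedule above follows a simple expanding-window rule that can be generated programmatically. A minimal sketch, with a hypothetical helper name and 1-based period numbers to match the month numbering in the listing:

```python
def walk_forward_folds(n_periods, initial_train, val_size):
    """
    Yield (train_periods, val_periods) with an expanding training window.
    Every validation period lies strictly after every training period.
    """
    train_end = initial_train
    while train_end + val_size <= n_periods:
        train = list(range(1, train_end + 1))
        val = list(range(train_end + 1, train_end + val_size + 1))
        yield train, val
        train_end += val_size


folds = list(walk_forward_folds(n_periods=12, initial_train=6, val_size=2))
# folds[0] trains on months 1..6 and validates on 7..8, and so on.
```

Only the validation periods ever receive out-of-fold predictions, so the initial training block contributes no rows to the meta-feature matrix.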
The meta-feature matrix X_meta has shape (N_train, K × C) where:
  • K = number of base models (5: RF, XGB, LGB, ExtraTrees, FT-Transformer/LSTM/TFT)
  • C = number of classes (3: BUY, SELL, HOLD)
  • N_train = training samples with OOF predictions (the first fold's training samples never receive OOF predictions and are dropped)
Meta-learner loss:
L_meta = CrossEntropy( LGB_meta(X_meta), y_true )
Meta-LGB is kept shallow (max_depth=3, n_estimators=100) to prevent overfitting on K×C=15 features.
Isotonic calibration of meta output: after training, apply per-class isotonic regression on a held-out calibration fold:
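import numpy as np
from sklearn.isotonic import IsotonicRegression

calibrated_probs = np.zeros_like(meta_probs_test, dtype=float)  # must exist before column assignment
for c in range(3):
    ir_c = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds='clip')
    ir_c.fit(meta_probs_val[:, c], (y_val == c).astype(float))
    calibrated_probs[:, c] = ir_c.transform(meta_probs_test[:, c])
# Renormalize rows to sum to 1 (guard against rows where every class calibrates to 0)
row_sums = calibrated_probs.sum(axis=1, keepdims=True)
calibrated_probs /= np.where(row_sums > 0, row_sums, 1.0)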
Regime router (adaptive weighting formula): when meta-learner confidence max(meta_probs) is < 0.55, fall back to regime-weighted averaging:
conf_threshold = 0.55
if max(meta_probs) >= conf_threshold:
    final_probs = meta_probs              # trust the meta-learner
else:
    w = regime_weights[current_regime]   # per-regime weight dict
    final_probs = Σ_k w_k × base_probs_k
    final_probs /= final_probs.sum()      # renormalize
Regime detection (computed from current _latest_features_cache):
def detect_regime(features: dict) -> str:
    adx = features["adx_14"]
    atr_pct = features["atr_percentile"]
    if adx > 0.25 and atr_pct < 0.75:  return "STRONG_TREND"
    if adx < 0.20 and atr_pct < 0.40:  return "RANGING"
    if atr_pct >= 0.75:                 return "VOLATILE"
    return "CHOPPY"
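
Putting the confidence fallback and the regime label together, a router could be sketched as below. The weights are placeholder values chosen for the example (the real ones are left to a sweep), and this `route` takes the regime string directly rather than the feature dict, which is a simplification of the planned signature.

```python
import numpy as np

# Hypothetical starting weights per regime; not tuned values.
REGIME_WEIGHTS = {
    "STRONG_TREND": {"rf": 0.20, "xgb": 0.40, "lgb": 0.40},
    "RANGING":      {"rf": 0.50, "xgb": 0.25, "lgb": 0.25},
    "VOLATILE":     {"rf": 0.40, "xgb": 0.30, "lgb": 0.30},
    "CHOPPY":       {"rf": 1 / 3, "xgb": 1 / 3, "lgb": 1 / 3},
}


def route(meta_probs, base_probs, regime, conf_threshold=0.55, weights=REGIME_WEIGHTS):
    """Return meta_probs when confident, else the regime-weighted average of base models."""
    meta_probs = np.asarray(meta_probs, dtype=float)
    if meta_probs.max() >= conf_threshold:
        return meta_probs                                   # trust the meta-learner
    w = weights[regime]
    blended = sum(w[name] * np.asarray(p, dtype=float)
                  for name, p in base_probs.items() if name in w)
    return blended / blended.sum()                          # renormalize


base = {"rf": [0.5, 0.3, 0.2], "xgb": [0.4, 0.4, 0.2], "lgb": [0.6, 0.2, 0.2]}
confident = route([0.70, 0.20, 0.10], base, "RANGING")      # meta-learner output passes through
fallback = route([0.40, 0.35, 0.25], base, "RANGING")       # blended from base models
```

Models missing from a regime's weight dict are skipped, so the final renormalization matters whenever the applied weights do not sum to 1.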
Tasks:
  • Create ml/meta_learner.py with:
    • generate_oof_stacks(base_models, X, y, n_splits=5) → returns X_meta (N, 15), y_meta (N,) using TimeSeriesSplit; N counts only rows that fell in a validation split
    • Algorithm: for each fold, retrain all 5 base models on train split, predict predict_proba on val split, store in X_meta[val_idx]
    • Save to models/oof_stacks.npy
    • train_meta_learner(X_meta, y_meta) → fits LGBMClassifier(n_estimators=100, max_depth=3, learning_rate=0.1, num_leaves=15, min_child_samples=20), saves to models/meta_learner.pkl
    • calibrate_meta(meta_model, X_cal, y_cal) → fits 3 IsotonicRegression objects, saves to models/meta_calibrators.pkl
  • Create ml/regime_router.py:
    • detect_regime(features: dict) → str: 4-state classification using ADX + ATR percentile as above
    • REGIME_WEIGHTS dict with per-regime model weights (initial values: tune via scripts/sweep.py --target regime)
    • route(meta_probs, base_probs_dict, features) → np.ndarray: implements the fallback formula above
  • Update ml/ensemble_predictor.py:
    • Load meta_learner.pkl and meta_calibrators.pkl in __init__ (alongside existing model loading)
    • In get_signal(): collect all base model predict_proba outputs → stack into X_meta_row (1, 15) → run meta_learner.predict_proba → calibrate → pass to RegimeRouter.route()
  • Update scripts/weekly_optimize.py: add Phase 13b, which regenerates OOF stacks and retrains the meta-learner after base model retraining (takes ~5 min extra; acceptable in weekly job)
  • Optuna search space for meta-learner itself:
    "meta_n_est":       trial.suggest_int("meta_n_est", 50, 200),
    "meta_max_depth":   trial.suggest_int("meta_max_depth", 2, 5),
    "meta_lr":          trial.suggest_float("meta_lr", 0.05, 0.30),
    "meta_num_leaves":  trial.suggest_int("meta_num_leaves", 7, 31),
    "meta_conf_thresh": trial.suggest_float("meta_conf_thresh", 0.50, 0.65),
    
  • OOS gate: 5-model meta-learner must beat current 3-model majority-vote by ≥ 1.0 Score AND ≥ 2% WR; if not, revert to majority-vote and document result in logs/meta_learner_eval.log

Phase 23 – JARVIS Live Trading Dashboard

A real-time visualization system inspired by quant firm internal dashboards (Bloomberg DASH, QuantConnect live monitor, Two Sigma's internal regime displays) and Jarvis-style AI interface aesthetics. Stack: Next.js + WebSocket + TradingView Lightweight Charts + Framer Motion + Apache ECharts + optional Three.js.
This is read-only. The dashboard connects to a new WebSocket endpoint on the bot server and never writes to config.json, triggers orders, or modifies bot state.
Frontend dependencies (add to package.json):
npm install framer-motion@11        # animation engine
npm install lightweight-charts@4    # professional candlestick chart
npm install recharts@2              # bar/area charts
npm install echarts@5 echarts-for-react@3   # heatmap
npm install @react-three/fiber@8 @react-three/drei@9 three@0.165  # 3D surface (stretch)
npm install @supabase/supabase-js@2  # already installed
npm install zustand@4               # lightweight global state (signal stream store)
Backend dependencies (add to requirements.txt):
fastapi>=0.111
uvicorn[standard]>=0.29
websockets>=12.0
shap>=0.45          # live SHAP values per signal

23.1 WebSocket signal stream (backend)

Message contract, SignalEvent (full schema):
interface SignalEvent {
  ts: string;              // ISO 8601 UTC timestamp
  prediction: "BUY" | "SELL" | "HOLD";
  confidence: number;      // max(probs) after calibration
  prob_diff: number;       // probs[0] - probs[1], i.e. P(BUY) - P(SELL); signed
  probs: [number, number, number];  // [BUY, SELL, HOLD]
  model_votes: {
    rf: "BUY" | "SELL" | "HOLD";
    xgb: "BUY" | "SELL" | "HOLD";
    lgb: "BUY" | "SELL" | "HOLD";
    ft_transformer?: "BUY" | "SELL" | "HOLD";  // optional until Phase 22.3 ships
    lstm?: "BUY" | "SELL" | "HOLD";
  };
  model_confidences: Record<string, number>;  // per-model max prob
  top_shap: Array<{ name: string; value: number }>;  // top 10, signed
  attention_weights?: number[] | null;   // 48 floats from TFT/LSTM; null or absent until a sequence model ships
  regime: "STRONG_TREND" | "RANGING" | "VOLATILE" | "CHOPPY";
  adx: number;
  atr_percentile: number;
  equity: number;            // current account equity (raw USC)
  open_position: {
    ticket: number;
    direction: "BUY" | "SELL";
    lot: number;           // position size in lots; read by the live P&L estimate in 23.10
    entry: number;
    sl: number;
    tp: number;
    unrealized_pnl: number;
    bars_held: number;
  } | null;
  ohlcv: {  // current completed M15 bar
    time: number;  // Unix timestamp
    open: number; high: number; low: number; close: number; volume: number;
  };
}
Tasks:
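  • Create trading/ws_server.py, a FastAPI app on port 8765:
    import asyncio, dataclasses, hmac, os
    from fastapi import FastAPI, Query, WebSocket, WebSocketDisconnect

    app = FastAPI()
    # Note: with one shared queue, each event is delivered to exactly one connected client
    _signal_queue: asyncio.Queue[SignalEvent] = asyncio.Queue(maxsize=10)

    @app.websocket("/ws/signal")
    async def stream(ws: WebSocket, token: str = Query(...)):
        secret = os.getenv("DASHBOARD_WS_SECRET", "")
        # Constant-time comparison; reject everything if the secret is unset
        if not secret or not hmac.compare_digest(token.encode(), secret.encode()):
            await ws.close(code=4001)
            return
        await ws.accept()
        try:
            while True:
                event = await _signal_queue.get()
                await ws.send_json(dataclasses.asdict(event))
        except WebSocketDisconnect:
            pass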
    
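  • In trading/bot.py after _get_signal() returns (around line 3200): push SignalEvent to _signal_queue via loop.call_soon_threadsafe(_signal_queue.put_nowait, event), where loop is the ws_server event loop captured at server startup. The bot runs in a thread and ws_server in an asyncio event loop, so asyncio.get_event_loop() called from the bot thread does not return the server's loop. put_nowait raises asyncio.QueueFull once 10 events are queued with no consumer, so catch it or drop the oldest event first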
  • Run ws_server.py via uvicorn in a background thread started at bot startup; add DASHBOARD_WS_SECRET to .env
  • Add to ecosystem.config.js: second PM2 process ws_server using uvicorn trading.ws_server:app --port 8765
  • Caddy config: add reverse_proxy /ws/signal localhost:8765 under terminal-rf1.novosky.app block
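  • SHAP computation: build shap.TreeExplainer(lgb_model) once at startup; after each build_features() call, compute shap_values(X_current) and select the values for signal_class (takes ~20 ms on CPU; acceptable at M15 frequency); include top 10 by abs value in top_shap. Check the return layout for the installed shap version: older releases return a per-class list, so [signal_class] works, while newer releases return a single array with the class on the last axis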
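
If more than one dashboard client may connect, one possible alternative to the single shared queue is a small fan-out hub that keeps one bounded queue per client and accepts events from the bot thread. The sketch below uses only the standard library; `SignalHub` and its methods are hypothetical names, not part of the planned ws_server.py.

```python
import asyncio
import threading


class SignalHub:
    """Fan-out: one bounded queue per connected client, fed from any thread."""

    def __init__(self, loop: asyncio.AbstractEventLoop, maxsize: int = 10):
        self._loop = loop              # the server's event loop, captured at startup
        self._maxsize = maxsize
        self._clients = set()

    def subscribe(self) -> asyncio.Queue:
        q = asyncio.Queue(maxsize=self._maxsize)
        self._clients.add(q)
        return q

    def unsubscribe(self, q: asyncio.Queue) -> None:
        self._clients.discard(q)

    def _broadcast(self, event) -> None:
        # Runs on the event loop thread, so touching the queues is safe here.
        for q in self._clients:
            if q.full():
                q.get_nowait()         # drop the oldest event for a slow client
            q.put_nowait(event)

    def publish_threadsafe(self, event) -> None:
        """Call from the bot thread; hands the event over to the server's loop."""
        self._loop.call_soon_threadsafe(self._broadcast, event)


async def _demo() -> list:
    hub = SignalHub(asyncio.get_running_loop(), maxsize=2)
    a, b = hub.subscribe(), hub.subscribe()
    # Simulate the bot thread pushing three events into queues of size 2.
    t = threading.Thread(target=lambda: [hub.publish_threadsafe(i) for i in (1, 2, 3)])
    t.start()
    t.join()
    await asyncio.sleep(0.05)          # let the loop run the scheduled broadcasts
    return [[q.get_nowait() for _ in range(q.qsize())] for q in (a, b)]


def run_demo() -> list:
    return asyncio.run(_demo())
```

In the WebSocket handler, each connection would call `subscribe()` after `accept()` and `unsubscribe()` in a `finally` block, so a disconnected client stops accumulating events.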

23.2 Core dashboard layout (Next.js)

File structure:
app/dashboard/
  page.tsx             # root page β€” imports all panels
  layout.tsx           # standalone dark layout (no site nav/footer)
  loading.tsx          # skeleton loader while WS connects
components/dashboard/
  ConfidenceMeter.tsx
  ModelVotingPanel.tsx
  ModelConfidenceBars.tsx
  RegimeIndicator.tsx
  FeatureImportance.tsx
  TradeFlowPipeline.tsx
  CandlestickChart.tsx
  AttentionHeatmap.tsx
  EquityPanel.tsx
  EquitySurface3D.tsx  # stretch
hooks/
  useSignalStream.ts   # WebSocket client + store
  useEquityHistory.ts  # Supabase historical equity query
stores/
  signalStore.ts       # Zustand store for latest signal state
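useSignalStream.ts (reconnect logic):
import { useCallback, useEffect, useRef } from "react";
import { useSignalStore } from "@/stores/signalStore";  // "@/" path alias assumed

export function useSignalStream(url: string) {
  const setSignal = useSignalStore(s => s.setSignal);
  const wsRef = useRef<WebSocket | null>(null);
  const timerRef = useRef<ReturnType<typeof setTimeout> | null>(null);
  const unmounted = useRef(false);
  const reconnectDelay = useRef(1000);

  const connect = useCallback(() => {
    const ws = new WebSocket(`${url}?token=${process.env.NEXT_PUBLIC_WS_SECRET}`);
    ws.onmessage = (e) => {
      setSignal(JSON.parse(e.data));
      reconnectDelay.current = 1000;  // reset backoff on success
    };
    ws.onclose = () => {
      if (unmounted.current) return;  // do not reconnect after cleanup
      timerRef.current = setTimeout(connect, reconnectDelay.current);
      reconnectDelay.current = Math.min(reconnectDelay.current * 2, 30_000);
    };
    wsRef.current = ws;
  }, [url, setSignal]);

  useEffect(() => {
    unmounted.current = false;
    connect();
    return () => {
      unmounted.current = true;       // close() fires onclose; skip the retry
      if (timerRef.current) clearTimeout(timerRef.current);
      wsRef.current?.close();
    };
  }, [connect]);
}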
Tasks:
  • Create app/dashboard/layout.tsx with className="min-h-screen bg-slate-950 text-slate-100 font-mono"; separate from main site layout; no nav bar
  • app/dashboard/page.tsx: CSS Grid layout with grid-cols-[40%_60%] on desktop, single column on mobile; gap-4; all panels inside <Suspense> boundaries
  • stores/signalStore.ts: Zustand store with fields signal: SignalEvent | null, history: SignalEvent[] (last 200), connected: boolean; setSignal appends to history and updates latest
  • Throttle store updates: wrap setSignal with a 250 ms throttle (4 Hz max re-render rate); a debounce would instead hold every update until the stream goes quiet
  • Connection status pill: top-right corner, 8px dot; animate-pulse green when connected; amber when reconnecting; static red when disconnected for > 10 s

23.3 Animated confidence meter

Radial arc implementation using SVG + Framer Motion: The arc is drawn as an SVG <path> using polar-to-Cartesian conversion:
function polarToCartesian(cx, cy, r, angleDeg) {
  const rad = (angleDeg - 90) * Math.PI / 180;
  return { x: cx + r * Math.cos(rad), y: cy + r * Math.sin(rad) };
}

function arcPath(cx, cy, r, startDeg, endDeg) {
  const s = polarToCartesian(cx, cy, r, startDeg);
  const e = polarToCartesian(cx, cy, r, endDeg);
  const large = endDeg - startDeg > 180 ? 1 : 0;
  return `M ${s.x} ${s.y} A ${r} ${r} 0 ${large} 1 ${e.x} ${e.y}`;
}
// Usage: arc from -135° to (-135° + confidence × 270°) → covers up to 270° of sweep at confidence = 1
Tasks:
  • ConfidenceMeter.tsx: SVG-based radial arc; background arc (dark stroke) + foreground arc animated with motion.path and animate={{ pathLength: confidence }} (Framer Motion SVG animation); center text shows percentage
  • Spring config: transition={{ type: "spring", stiffness: 120, damping: 20 }} on pathLength change; this avoids a linear snap between values
  • On new signal: trigger outer ring pulse using useAnimate:
    const [scope, animate] = useAnimate();
    useEffect(() => {
      if (signal) animate(scope.current, { scale: [1, 1.4, 1], opacity: [1, 0.3, 1] },
                          { duration: 0.6, repeat: 2 });
    }, [signal?.ts]);
    
  • Color: derive from signal.prediction: emerald-400 (BUY), rose-400 (SELL), amber-400 (HOLD); use CSS variable for smooth color transition via motion.div animate={{ color }} with transition={{ duration: 0.3 }}
  • ModelConfidenceBars.tsx: horizontal progress bars per model using motion.div; set the animate width to confidence * 100 percent as a string value, transition={{ duration: 0.25 }}

23.4 Model voting panel

Tasks:
  • ModelVotingPanel.tsx: 5-card grid (grid-cols-5 gap-3); each card is a motion.div with layout prop (enables FLIP animation on reorder); background color set via animate={{ backgroundColor }}; Framer Motion handles color interpolation
  • Scale pop on vote change: track prevVote in useRef; if vote !== prevVote, trigger animate={{ scale: [1, 1.15, 1] }} with transition={{ duration: 0.2 }}
  • Consensus glow: when all 5 models agree, apply animate={{ boxShadow: "0 0 24px #10b981" }} (emerald for BUY) with transition={{ repeat: Infinity, repeatType: "reverse", duration: 1.2 }}
  • Split signal badge: if Object.values(votes).filter(v => v === prediction).length <= 2, render amber ⚠ Split badge using AnimatePresence for enter/exit animation (slide down from top)
  • Majority fraction badge: "4 / 5 BUY" string derived from vote counts; update without animation (content only)

23.5 Market regime indicator

Regime transition math: Regime changes should feel deliberate, not flickering. Apply a hysteresis rule: only switch regime if the new regime persists for 3 consecutive signals (45 min at M15 frequency):
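const regimeBuffer = useRef<string[]>([]);
const [displayedRegime, setDisplayedRegime] = useState<string | null>(null);
const signal = useSignalStore(s => s.signal);

useEffect(() => {
  if (!signal) return;
  const buf = regimeBuffer.current;
  buf.push(signal.regime);
  if (buf.length > 3) buf.shift();
  // Switch only when the last 3 signals all report the same regime
  const stable = buf.length === 3 && buf.every(r => r === buf[0]);
  if (displayedRegime === null) setDisplayedRegime(signal.regime);  // first signal: show immediately
  else if (stable && buf[0] !== displayedRegime) setDisplayedRegime(buf[0]);
}, [signal?.ts]);  // keyed on the timestamp so a repeated regime still counts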
Tasks:
  • RegimeIndicator.tsx: outer AnimatePresence mode="wait"; exit old card (opacity 0, y -20) then enter new card (opacity 1, y 0); transition={{ duration: 0.4 }}
  • Glow border: animate={{ boxShadow: glowColor }}; map regime to glow: STRONG_TREND→"0 0 30px #10b981", RANGING→"0 0 30px #3b82f6", VOLATILE→"0 0 30px #ef4444", CHOPPY→"0 0 30px #eab308"
  • Stats row: ADX value, ATR percentile (as %), recent WR from last 20 signals in signalStore.history; formatted as ADX 0.31 · ATR p72 · WR 68%
  • Mini sparkline: <AreaChart width={120} height={40} data={adxHistory}>; no axes, no labels, just the shape; <Area dataKey="adx" stroke="#94a3b8" fill="transparent" strokeWidth={1.5} />

23.6 Live feature importance bar chart

SHAP value sign convention: positive SHAP means the feature pushed the model toward the predicted class; negative means it pulled away. Color accordingly.
Tasks:
  • FeatureImportance.tsx: <BarChart layout="vertical" width={380} height={300}> (horizontal bars); <Bar dataKey="value" animationDuration={300} animationEasing="ease-out"> with <Cell fill={v > 0 ? "#14b8a6" : "#f43f5e"} /> per bar
  • Sort top 10 by Math.abs(shap) descending; truncate feature name to 18 chars; <Tooltip formatter={(v) => v.toFixed(4)} />
  • On update: with animationDuration=300 set, Recharts animates bar width changes on re-render; no extra work needed
  • Tabs component (shadcn/ui <Tabs>): tab 1 = SHAP, tab 2 = Attention (disabled/grayed until Phase 22.4 ships); when tab 2 is unlocked, render AttentionHeatmap inline

23.7 Trade flow animation

State machine: 7 states (IDLE plus 6 pipeline nodes); transitions are triggered by WebSocket events:
IDLE ──[new signal fires]──→ SCAN ──[confidence > threshold]──→ SIGNAL
     ──[risk check pass]──→ RISK CHECK ──[lot calculated]──→ SIZE
     ──[order sent]──→ EXECUTE ──[order confirmed]──→ MONITOR
     ──[position closes]──→ IDLE
Tasks:
  • TradeFlowPipeline.tsx: horizontal node list with SVG connecting lines; each node is a 40px circle + label below
  • Active node animation: motion.div animate={{ rotate: 360 }} transition={{ repeat: Infinity, duration: 2, ease: "linear" }} on the outer ring; inner circle static
  • Node-to-node hop: use useEffect on signal.open_position to advance state machine; 200 ms delay between hops via a chain of setTimeout-backed Promises (no blocking sleep)
  • Connecting line fills as each node activates: motion.line animate={{ pathLength: isComplete ? 1 : 0 }} transition={{ duration: 0.2 }}
  • Close event: subscribe to Supabase trades table INSERT; on INSERT, determine close type from close_type column → pulse MONITOR node emerald (TP / ML_EXIT) or rose (SL_HIT) with 3× scale keyframe, then reset state machine to IDLE after 2 s

23.8 TradingView Lightweight Charts integration

Tasks:
  • CandlestickChart.tsx: initialize chart in useEffect with cleanup; use useRef<IChartApi> for the chart instance to survive re-renders
    const chart = createChart(containerRef.current, {
      layout: { background: { color: '#020617' }, textColor: '#94a3b8' },
      grid: { vertLines: { color: '#1e293b' }, horzLines: { color: '#1e293b' } },
      rightPriceScale: { borderColor: '#334155' },
      timeScale: { borderColor: '#334155', timeVisible: true },
      width: containerRef.current.clientWidth,
      height: 420,
    });
    
  • On mount: fetch last 200 M15 bars from GET /api/bars?symbol=BTCUSD&tf=M15&limit=200 (add this Next.js API route that proxies to the MT5 HTTP API)
  • On each SignalEvent: pass signal.ohlcv to candleSeries.update(); Lightweight Charts handles the scrolling automatically
  • BUY/SELL markers: accumulate markers array from signalStore.history; call candleSeries.setMarkers(markers) on each update; Lightweight Charts re-renders only changed markers
  • SL/TP lines: use chart.addLineSeries({ lineStyle: LineStyle.Dashed }) for SL (rose) and TP (emerald); update price on position change; series.setData([]) when no position open
  • Confidence histogram panel: add chart.addHistogramSeries({ priceScaleId: 'confidence' }) and confine it to the bottom strip with chart.priceScale('confidence').applyOptions({ scaleMargins: { top: 0.8, bottom: 0 } }), since series options have no height field; maps signal.confidence to bar height; color by direction
  • ResizeObserver: watch container width changes → chart.applyOptions({ width: newWidth }) for responsiveness

23.9 Neural attention heatmap

ECharts heatmap config:
const option = {
  tooltip: { formatter: ({ data }) => `Bar ${data[0]}: ${data[1]} = ${data[2].toFixed(3)}` },
  xAxis: { type: 'category', data: barLabels,   // e.g., ["-47", "-46", ..., "0"] (48 labels)
            axisLabel: { interval: 7 } },
  yAxis: { type: 'category', data: featureNames.slice(0, 10) },
  visualMap: { min: 0, max: 1, calculable: true,
                inRange: { color: ['#1e3a5f', '#2563eb', '#ef4444'] } },  // blue→red
  series: [{ type: 'heatmap', data: flatData,
              itemStyle: { borderRadius: 2, borderWidth: 0.5, borderColor: '#0f172a' } }],
};
Stagger animation:
// On new signal, reset and re-animate each column with increasing delay
flatData.forEach((cell, i) => {
  const col = cell[0];
  setTimeout(() => {
    updateCell(col, cell[1], cell[2]);
  }, col * 8);  // 8 ms per column × 48 columns ≈ 384 ms total stagger
});
Tasks:
  • AttentionHeatmap.tsx: use echarts-for-react wrapper; pass option as prop; style={{ height: 280 }}; update option via useState triggered by signal.attention_weights
  • Transform attention_weights: number[48] into flatData: for each of the top 10 features (by SHAP), duplicate the bar-level attention weight; the weight is the same per bar regardless of feature (TFT VSN gives per-feature weights separately; show VSN weights on Y axis if available)
  • Show placeholder <div className="...">Attention available after Phase 22.4</div> if signal.attention_weights is null

23.10 Equity curve + live P&L panel

Live P&L tick calculation:
const TICK_MS = 100;
const lotSize = openPosition?.lot ?? 0;
const direction = openPosition?.direction === 'BUY' ? 1 : -1;

useEffect(() => {
  if (!openPosition) return;
  const interval = setInterval(() => {
    // Estimate tick P&L from last known close price in latest signal
    const currentPrice = latestSignal?.ohlcv?.close ?? openPosition.entry;
    const priceDiff = (currentPrice - openPosition.entry) * direction;
    const pnl = priceDiff * lotSize * 100;  // 100 USC per lot per point (BTCUSD CFD)
    setLivePnl(pnl);
  }, TICK_MS);
  return () => clearInterval(interval);
}, [openPosition, latestSignal]);
Tasks:
  • EquityPanel.tsx: <AreaChart> from Recharts with gradient fill (<defs><linearGradient>: emerald above baseline, transparent below); data from useEquityHistory hook (Supabase query SELECT equity, created_at FROM account_snapshots ORDER BY created_at DESC LIMIT 2000; reverse the rows to ascending time before charting)
  • useEquityHistory.ts: initial query on mount + Supabase realtime subscription supabase.channel('account_snapshots').on('postgres_changes', { event: 'INSERT', schema: 'public', table: 'account_snapshots' }, handler).subscribe() (the public schema is assumed)
  • Live P&L counter: <motion.span animate={{ color: livePnl >= 0 ? '#10b981' : '#f43f5e' }}> with transition={{ duration: 0.15 }}; value formatted as +$123.45 / -$12.30 (raw USC with $ prefix per CLAUDE.md)
  • Today's stats row: derive from signalStore.history: trades_today = history.filter(s => sameDay(s.ts)), wr_today = wins / trades_today.length, gross_pnl_today = sum of closed trade pnl from Supabase
  • Last 5 trades table: columns Dir · Conf · P&L · Close Type; ML_EXIT shown in emerald, SL_HIT in rose, TP_HIT in teal; sorted by close time descending

23.11 (Stretch) 3D equity surface – Three.js / R3F

Surface data construction: run python scripts/surface_sweep.py (new script), which iterates confidence_threshold ∈ [0.55, 0.60, …, 0.85] × rolling 90-day windows and records cumulative_return at each point. Output: models/surface_data.json with shape (n_thresholds, n_days).
Geometry construction:
// surface[i][j] = equity at threshold i, day j
// Normalize: x = j / n_days, z = i / n_thresholds, y = equity / max_equity

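const geometry = new THREE.PlaneGeometry(10, 10, n_days - 1, n_thresholds - 1);
geometry.rotateX(-Math.PI / 2);  // PlaneGeometry is built in the XY plane; lay it flat so Y is height
surface.flat().forEach((equity, idx) => {
  geometry.attributes.position.setY(idx, (equity / maxEquity) * 3);  // Y = height
});
geometry.attributes.position.needsUpdate = true;
geometry.computeVertexNormals();  // for lighting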

// Vertex colors (jet colormap)
const colors = surface.flat().map(e => jetColor(e / maxEquity));
geometry.setAttribute('color', new THREE.BufferAttribute(new Float32Array(colors.flat()), 3));
Tasks:
  • scripts/surface_sweep.py: runs backtest.py --oos-only in a subprocess for each confidence threshold (7 values × 1 sweep = ~15 min total); outputs models/surface_data.json
  • components/dashboard/EquitySurface3D.tsx using @react-three/fiber:
    • <Canvas camera={{ position: [8, 5, 8], fov: 45 }}> + <OrbitControls autoRotate autoRotateSpeed={0.5} />
    • Mesh: PlaneGeometry with vertex colors (jet colormap), MeshPhongMaterial({ vertexColors: true, wireframe: false })
    • Wireframe overlay: same geometry with MeshBasicMaterial({ wireframe: true, color: '#1e293b', transparent: true, opacity: 0.3 }) (opacity only takes effect when transparent is true)
    • Lighting: <ambientLight intensity={0.4} /> + <directionalLight position={[10, 10, 5]} />
    • Axis labels as <Text> sprites from @react-three/drei
  • Gate behind ?surface=1 URL param; add toggle button in dashboard header; lazy-import component to avoid bundling Three.js by default
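
The sweep described in the tasks above could be structured roughly as follows. This is a minimal sketch, not the actual scripts/surface_sweep.py: the `run_backtest` callable and the JSON keys (`thresholds`, `n_days`, `surface`) are assumptions made for illustration. A real script would call backtest.py --oos-only in a subprocess and parse its output instead of using the fake backtest shown here.

```python
import json
from typing import Callable, Dict, List


def threshold_grid(start: float = 0.55, stop: float = 0.85, step: float = 0.05) -> List[float]:
    """Confidence thresholds 0.55, 0.60, ..., 0.85 (7 values)."""
    n = int(round((stop - start) / step)) + 1
    # Round to 2 decimals so float drift does not leak into the JSON file.
    return [round(start + i * step, 2) for i in range(n)]


def build_surface(run_backtest: Callable[[float], List[float]]) -> Dict:
    """Collect one cumulative-return row per confidence threshold.

    run_backtest is a hypothetical stand-in for invoking backtest.py;
    it must return one value per day for the given threshold.
    """
    thresholds = threshold_grid()
    surface = [list(run_backtest(t)) for t in thresholds]
    n_days = len(surface[0])
    if any(len(row) != n_days for row in surface):
        raise ValueError("every threshold must cover the same number of days")
    # Shape is (n_thresholds, n_days), matching surface[i][j] on the frontend.
    return {"thresholds": thresholds, "n_days": n_days, "surface": surface}


if __name__ == "__main__":
    # Fake backtest so the sketch runs without models or market data.
    demo = build_surface(lambda t: [1.0, 1.0 + t, 1.0 + 2 * t])
    print(json.dumps(demo)[:80])
```

Keeping the backtest call behind a plain callable makes the grid and shape logic testable without a trained model.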

Full quarterly roadmap → Roadmap.

Completed phases (1–12)

Phase 1 – Fix the model

  • SELL recall improved 3% → 30% by adding 5 directional features: ema_stack, candle_direction, volume_delta, rsi_slope, consecutive_direction
  • TP=0.3% / SL=0.25% tuning: WR 60.6%→77.3%, PF 1.48→2.85, MaxDD 12%→6.3%
  • Training data extended 365 → 730 days; lookahead_candles 12 → 24
  • LightGBM feature-name warning fixed (pass numpy array directly)

Phase 2 – Multi-timeframe features

  • H4/D1 features resampled from M15: h4_ema_bias, h4_rsi_norm (#5 SHAP), h4_macd_dir, d1_trend, price_vs_d1_open (#7 SHAP)
  • Session flags: is_london_session, is_ny_session, is_asian_session, session_hour_sin/cos
  • Volatility/momentum: atr_percentile (top feature), volume_surge, bb_squeeze, price_acceleration
  • S/R proximity: dist_to_round_number (#1 SHAP overall), near_daily_high_low, adx_14, market_quality, momentum_decay, adverse_candle_ratio
  • ATR-aware labeling groundwork: create_labels_atr_aware() added (activated in Phase 10)
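
The session_hour_sin/cos pair above is a cyclical encoding of the hour of day. A minimal sketch of that encoding follows, assuming a 24-hour period. The function name is hypothetical, and the project's feature code may derive the hour differently (for example from broker time rather than UTC).

```python
import math
from typing import Tuple


def session_hour_cyclical(hour: float, period: float = 24.0) -> Tuple[float, float]:
    """Map an hour of day onto the unit circle as (sin, cos).

    Encoding the hour this way keeps 23:00 and 00:00 close together,
    which a raw 0..23 integer feature does not.
    """
    angle = 2.0 * math.pi * (hour % period) / period
    return math.sin(angle), math.cos(angle)


def circular_distance(h1: float, h2: float) -> float:
    """Euclidean distance between two hours in the encoded space."""
    s1, c1 = session_hour_cyclical(h1)
    s2, c2 = session_hour_cyclical(h2)
    return math.hypot(s1 - s2, c1 - c2)


if __name__ == "__main__":
    print(session_hour_cyclical(6))      # a quarter of the way around the circle
    print(circular_distance(23, 0))      # small: adjacent hours
    print(circular_distance(12, 0))      # large: opposite sides of the day
```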

Phase 3 – Hyperparameter tuning + SHAP

  • Optuna 50-trial tuning: WF 44.9%→45.7%, PF 2.82→2.96, MaxDD 7.6%→3.8%, Sharpe 7.23→8.63
  • SHAP analysis: ml/shap_analysis.py with TreeExplainer; beeswarm plots; models/shap_summary.json
  • Top SHAP features: h4_rsi_norm > atr_14 > hourly_return > price_vs_ema200 > session_hour_sin
  • Feature pruning tested and rejected: removing low-SHAP features hurt performance (WR 55.9%→50%)
  • Sequence model deferred (CPU-only hardware too slow)

Phase 4 – Risk management (superseded)

Items superseded by ML-driven config sweeps in Phases 9/13/14.
  • EMA trend filter, partial profit taking, trailing stop: all superseded by ML active management
  • max_weekly_drawdown_pct added to config.json (2026-04-11)

Phase 5 – Backtesting

  • backtest.py built: config-faithful OOS backtester with ONNX inference; WR, PF, Sharpe, MaxDD, Return
  • v8 results (1yr OOS, 48 features): Setup A WR=76.4% PF=3.20 MaxDD=2.6% | Setup D Return=+747%
  • Walk-forward backtest deferred to Phase 16
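
For reference, the headline metrics above (WR, PF, MaxDD, Return) can be computed from a list of closed-trade P&L values as sketched below. This uses the common textbook definitions and is an assumption about, not a copy of, what backtest.py does. The function name is hypothetical, and Sharpe is left out because its annualization convention is not stated here.

```python
from typing import Dict, Sequence


def summarize_trades(pnls: Sequence[float], start_equity: float) -> Dict[str, float]:
    """Win rate, profit factor, max drawdown and return from per-trade P&L."""
    wins = [p for p in pnls if p > 0]
    losses = [p for p in pnls if p < 0]
    gross_profit = sum(wins)
    gross_loss = -sum(losses)  # positive number

    equity = peak = start_equity
    max_dd = 0.0
    for p in pnls:
        equity += p
        peak = max(peak, equity)
        # Drawdown is measured as a fraction of the running equity peak.
        max_dd = max(max_dd, (peak - equity) / peak)

    return {
        "win_rate": len(wins) / len(pnls) if pnls else 0.0,
        "profit_factor": gross_profit / gross_loss if gross_loss > 0 else float("inf"),
        "max_drawdown": max_dd,
        "return": (equity - start_equity) / start_equity,
    }


if __name__ == "__main__":
    print(summarize_trades([100.0, -50.0, 200.0, -100.0], start_equity=1000.0))
```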

Phase 6 – Automation & monitoring

  • Signal logging to models/ml_performance.csv
  • Performance monitoring via weekly walk-forward OOS gate (weekly_optimize.py)
  • Live alerts via trading/telegram_commands.py + scripts/notify.py

Phase 7 – Live trading integration

  • Migrated trading.py from MetaTrader5 Python package to NOVOSKY HTTP API
  • --dry flag added for safe testing
  • Paper trade validated via OOS backtest: WR=82.7%, PF=4.27, MaxDD=5.2%

Phase 8 – ML active trade management (2026-04-10)

  • Dedicated position model: RF+XGB+LGB on 63 features (59 market + 4 position-state)
  • Labels: HOLD / EXIT / ADD (ml/position_labeling.py + ml/position_trainer.py)
  • PositionPredictor.get_position_action() with 2/3 majority vote
  • Kelly-adjusted lot sizing, ML-based SL/TP scaling, partial close, trailing stop
  • Results: Signal WF=43.66%, Position ensemble=73.50% | Setup E: WR=78.8% PF=4.52 MaxDD=1.6%
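
The 2/3 majority vote across the three position models can be illustrated as below. This is a sketch under assumptions, not the body of PositionPredictor.get_position_action(): the function name is hypothetical, and falling back to HOLD on a three-way split is an assumed tie-break. Probability thresholds are not modeled.

```python
from collections import Counter
from typing import Sequence

ACTIONS = ("HOLD", "EXIT", "ADD")


def majority_action(votes: Sequence[str], default: str = "HOLD") -> str:
    """Return the action at least two of the three models agree on.

    votes holds one predicted action per model (e.g. RF, XGB, LGB).
    If all three disagree, fall back to `default` (an assumed tie-break).
    """
    if len(votes) != 3:
        raise ValueError("expected exactly three model votes")
    for vote in votes:
        if vote not in ACTIONS:
            raise ValueError(f"unknown action: {vote!r}")
    action, count = Counter(votes).most_common(1)[0]
    return action if count >= 2 else default


if __name__ == "__main__":
    print(majority_action(["EXIT", "HOLD", "EXIT"]))  # two models agree
    print(majority_action(["EXIT", "HOLD", "ADD"]))   # no majority, falls back
```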

Phase 9 – Growth config sweep (2026-04-10)

  • Goal: maximize monthly return for $10k account, IC Markets RAW
  • Best: conf=0.55, risk=20%, max_lot=10, all Phase 8 active management disabled
  • Result: WR=57.4%, PF=1.75, MaxDD=15.0%, Sharpe=4.68, Return=+8449%, ~340 trades/yr

Phase 10 – Deep optimization (2026-04-11)

Four critical bugs fixed, followed by a retrain:
  1. Label-execution mismatch: activated atr_aware labeling in ml_config.json
  2. Class imbalance: replaced downsample with compute_sample_weight('balanced')
  3. Spread underestimation: backtest.py spread fallback 0.30 → 14.59
  4. ADX regime filter: new adx_regime_filter block in config.json
Retrain result: WR 48.8%→78.6%, PF 1.15→2.91, MaxDD 59.8%→22.1%, Sharpe 2.71→14.11
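
To show what the class-imbalance fix does, here is a dependency-free sketch of the weighting that scikit-learn's compute_sample_weight('balanced') applies: each sample is weighted by n_samples / (n_classes × count of its class). The helper name and the example labels are hypothetical. The actual trainer calls the scikit-learn function directly.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def balanced_sample_weights(labels: Sequence[Hashable]) -> List[float]:
    """Per-sample weights inversely proportional to class frequency.

    Unlike downsampling, no rows are discarded: rare classes simply
    count for more in the loss.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return [n_samples / (n_classes * counts[y]) for y in labels]


if __name__ == "__main__":
    y = ["HOLD", "HOLD", "HOLD", "SELL"]
    for label, weight in zip(y, balanced_sample_weights(y)):
        print(label, round(weight, 3))
```

With these weights every class contributes the same total weight, so the majority class no longer dominates training.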

Phase 11 – M15 scalping + local timezone support (2026-04-11)

  • H1 → M15 timeframe; 112 trades/yr → 477 trades/yr (1.31/day)
  • ATR-aware labels: SL=0.8×ATR, TP=1.5×ATR, lookahead=48 bars
  • sl_atr_multiplier 1.0→0.8; min_atr 50→15; ADX filter disabled
  • Local timezone support via config.json; Telegram redesign
  • Backtest: WR=63.7%, PF=3.05, MaxDD=19.6%, Sharpe=7.29, Return=+1,284,866%
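
The ATR-aware labeling parameters above (SL=0.8×ATR, TP=1.5×ATR, lookahead=48 bars) suggest a barrier-style labeler along the following lines. This is a hedged sketch, not the real create_labels_atr_aware(): the function names, the BUY/SELL/HOLD label names, entry at the bar close, and treating a same-bar touch of both barriers as a loss are all assumptions.

```python
from typing import List, Sequence


def _trade_wins(highs: Sequence[float], lows: Sequence[float], i: int, entry: float,
                tp_dist: float, sl_dist: float, lookahead: int, direction: int) -> bool:
    """True if TP is touched before SL within `lookahead` bars after bar i.

    direction is +1 for a long trade and -1 for a short trade.
    """
    end = min(i + 1 + lookahead, len(highs))
    for k in range(i + 1, end):
        if direction > 0:
            hit_sl = lows[k] <= entry - sl_dist
            hit_tp = highs[k] >= entry + tp_dist
        else:
            hit_sl = highs[k] >= entry + sl_dist
            hit_tp = lows[k] <= entry - tp_dist
        if hit_sl:
            return False  # SL is checked first, so same-bar ties count as losses
        if hit_tp:
            return True
    return False  # neither barrier touched inside the lookahead window


def atr_aware_labels(closes: Sequence[float], highs: Sequence[float],
                     lows: Sequence[float], atr: Sequence[float],
                     sl_mult: float = 0.8, tp_mult: float = 1.5,
                     lookahead: int = 48) -> List[str]:
    """3-class labels: BUY, SELL, or HOLD for every bar."""
    labels = []
    for i, entry in enumerate(closes):
        tp_dist, sl_dist = tp_mult * atr[i], sl_mult * atr[i]
        if _trade_wins(highs, lows, i, entry, tp_dist, sl_dist, lookahead, +1):
            labels.append("BUY")
        elif _trade_wins(highs, lows, i, entry, tp_dist, sl_dist, lookahead, -1):
            labels.append("SELL")
        else:
            labels.append("HOLD")
    return labels
```

Using the same SL/TP multipliers for labeling and for execution is what removes the label-execution mismatch noted in Phase 10.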

Phase 12 – Production hardening (2026-04-12)

15 critical issues resolved:
  • API retry with exponential backoff (3× on network errors)
  • Atomic state.json writes via .tmp → os.replace()
  • tracked_positions persisted to state.json, so it survives PM2 restarts
  • risk_percent 6→2; max_consecutive_losses 10→5; max_weekly_drawdown_pct 0→20
  • Full retrain: fresh 2yr M15 data, Optuna 50-trial local tuning, OOS backtest
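
Two of the hardening items above, the atomic state.json write and the retry with exponential backoff, can be sketched as follows. The function names, the 0.5 s base delay, and reading "3×" as three retries after the first attempt are assumptions. The project's own implementation may differ.

```python
import json
import os
import time
from typing import Any, Callable, TypeVar

T = TypeVar("T")


def atomic_write_json(path: str, data: Any) -> None:
    """Write JSON to `path` so readers never see a half-written file."""
    tmp_path = path + ".tmp"  # same directory, so os.replace stays on one filesystem
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump(data, fh)
        fh.flush()
        os.fsync(fh.fileno())  # push bytes to disk before the rename
    os.replace(tmp_path, path)  # atomic swap of the old file for the new one


def call_with_backoff(fn: Callable[[], T], retries: int = 3, base_delay: float = 0.5,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on network errors with delays of base_delay * 2**attempt."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except (ConnectionError, TimeoutError):
            if attempt == retries:
                raise  # out of retries: surface the last error
            sleep(base_delay * (2 ** attempt))
    raise RuntimeError("unreachable")
```

Passing `sleep` as a parameter keeps the retry logic testable without real waiting.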