System diagram
Three-model stack
NOVOSKY uses three ML models that each handle a distinct responsibility:

| Model | Type | Input | Output | Purpose |
|---|---|---|---|---|
| Signal | RF + XGB + LGB ensemble | 56 market features | BUY / SELL / HOLD | Decides whether to open a trade |
| Position | RF + XGB + LGB ensemble | 56 features + 4 position-state | HOLD / EXIT / ADD | Manages open positions |
| Risk | LightGBM regression | 7 equity-state features | multiplier [0.10–1.25] | Scales lot size based on account health |

If the risk model has not been trained yet, the multiplier falls back to 1.0 and lot size is left unchanged.
All three models are retrained together in the weekly pipeline: `python train_ml_model.py --ensemble --position --risk`.
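
To make the risk model's role concrete, here is a minimal sketch of how a multiplier in the 0.10–1.25 range might scale a lot size, with the 1.0 fallback when no prediction is available. The function names, and the choice to clamp out-of-range predictions rather than reject them, are assumptions for illustration and not taken from the NOVOSKY code.

```python
from typing import Optional

# Range and fallback as stated in the model table and design decisions.
RISK_MIN, RISK_MAX = 0.10, 1.25
FALLBACK_MULTIPLIER = 1.0


def risk_multiplier(raw_prediction: Optional[float]) -> float:
    """Return a lot-size multiplier; hypothetical helper.

    None means no risk model is loaded, so the neutral value 1.0 is used.
    Clamping to [RISK_MIN, RISK_MAX] is an assumption of this sketch.
    """
    if raw_prediction is None:
        return FALLBACK_MULTIPLIER
    return min(RISK_MAX, max(RISK_MIN, raw_prediction))


def scaled_lot(base_lot: float, raw_prediction: Optional[float]) -> float:
    """Scale the lot size chosen upstream; never blocks the trade."""
    return base_lot * risk_multiplier(raw_prediction)
```

Because the multiplier only rescales a lot size that was already decided, the signal and position models keep full control over whether a trade happens.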
External services
| Service | Purpose |
|---|---|
| Cloudflare R2 | Model binary storage (.pkl, .onnx, .json). Auto-pulled at startup if missing. |
| Supabase (PostgreSQL) | Cloud trade dashboard — bot_status, trades, open_positions, position_events, news_events, account_snapshots, model_metrics. |
| Telegram | Real-time notifications for trades, daily reports, and risk events. |
| Local trainer | GPU-first local retraining with automatic CPU fallback. |
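
The "auto-pulled at startup if missing" behaviour for model binaries can be sketched as below. This is an illustration under stated assumptions: `ensure_models` and the `fetch` callback are hypothetical names, and the actual download call is left to the caller (Cloudflare R2 exposes an S3-compatible API, so an S3 client would be one option).

```python
from pathlib import Path
from typing import Callable, Iterable, List


def ensure_models(
    model_dir: Path,
    filenames: Iterable[str],
    fetch: Callable[[str, Path], None],
) -> List[str]:
    """Fetch every file that is missing from model_dir.

    fetch(name, target) is a caller-supplied downloader (hypothetical);
    files that already exist locally are left untouched.
    Returns the names that were fetched, in order.
    """
    model_dir.mkdir(parents=True, exist_ok=True)
    fetched = []
    for name in filenames:
        target = model_dir / name
        if not target.exists():
            fetch(name, target)
            fetched.append(name)
    return fetched
```

Passing the downloader in as a callback keeps the startup check testable without network access.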
ML training pipeline
Repository layout
- trading.py → trading/bot.py
- backtest_config.py → backtest/run.py
- train_ml_model.py → ml/train.py
Key design decisions
| Decision | Reason |
|---|---|
| ONNX inference at runtime | 10–50× faster than sklearn/xgb/lgb native predict |
| No PyTorch / deep learning | Hardware focus is Xeon Platinum; the sklearn/XGBoost/LightGBM stack is sufficient |
| No MT5 Python package | Self-hosted REST API avoids Windows dependency on the trading host |
| Warmstart training by default | Incremental training preserves accumulated model knowledge |
| model_compat.json as source of truth | Single manifest to detect feature mismatches across all components |
| All config in JSON, not hardcoded | Runtime-swappable without retraining or redeployment |
| Risk model additive, never blocking | Separates sizing decisions from signal decisions — no model conflicts |
| Risk model fallback = 1.0 | Graceful degradation: system works correctly before the risk model is trained |
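
As an illustration of the manifest idea, the sketch below compares feature counts recorded in a manifest against the counts each component expects. The actual schema of model_compat.json is not documented here, so the `n_features` key, the model keys, and `find_mismatches` are all assumptions; only the counts (56, 56 + 4, and 7) come from the model table.

```python
import json
from typing import Dict, List

# Feature counts from the model table: 56 market features,
# 56 + 4 position-state features, 7 equity-state features.
EXPECTED_FEATURES = {"signal": 56, "position": 60, "risk": 7}


def find_mismatches(
    manifest: Dict[str, dict],
    expected: Dict[str, int] = EXPECTED_FEATURES,
) -> List[str]:
    """Return one message per model whose manifest entry is absent or wrong.

    The manifest layout ({model: {"n_features": int}}) is hypothetical.
    """
    problems = []
    for model, count in expected.items():
        entry = manifest.get(model)
        if entry is None:
            problems.append(f"{model}: missing from manifest")
        elif entry.get("n_features") != count:
            found = entry.get("n_features")
            problems.append(f"{model}: manifest has {found} features, expected {count}")
    return problems


# Example manifest with a stale position model (59 instead of 60 features).
example = json.loads(
    '{"signal": {"n_features": 56},'
    ' "position": {"n_features": 59},'
    ' "risk": {"n_features": 7}}'
)
print(find_mismatches(example))
```

Running the same check in the bot, the backtester, and the trainer is what would make a single manifest useful: each component fails fast on a mismatch instead of feeding a model the wrong number of columns.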