Overview
Feature engineering is the first step in both training and inference. ml/feature_engineering.py transforms raw OHLCV candles into the 62-feature vector consumed by all four models.
The feature list is a contract. The training pipeline, inference pipeline, and ml_config.json must all reference the same 62 features in the same order. Any change requires retraining all 4 models.
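A minimal sketch of how this contract could be checked before training or inference. It assumes ml_config.json exposes a top-level features array; the function name check_feature_contract is hypothetical and not part of the codebase described here.

```python
import json

EXPECTED_FEATURE_COUNT = 62  # taken from the contract described above


def check_feature_contract(config_text: str, computed_columns: list[str]) -> list[str]:
    """Return the configured feature order, or raise if the contract is broken.

    Hypothetical helper: assumes the config JSON has a top-level "features" array.
    """
    features = json.loads(config_text)["features"]
    if len(features) != EXPECTED_FEATURE_COUNT:
        raise ValueError(
            f"expected {EXPECTED_FEATURE_COUNT} features, got {len(features)}"
        )
    if len(set(features)) != len(features):
        raise ValueError("duplicate feature names in config")
    # Order matters: the models consume a positional vector.
    if list(computed_columns) != features:
        raise ValueError("computed columns differ from config in name or order")
    return features
```

Running a check like this in both pipelines would surface a reordered or renamed feature as an error rather than as silently degraded predictions.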
Feature groups
Bollinger Bands (4 features)
| Feature | Description |
|---|---|
| bb_upper | Upper Bollinger Band (20-period, 2σ) |
| bb_lower | Lower Bollinger Band |
| bb_width | Band width normalized by the middle band; measures squeeze/expansion |
| range_position | Close position within the band: 0 = at lower, 1 = at upper |
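The four band features above can be sketched as follows. This is an illustration, not the code in ml/feature_engineering.py: the population standard deviation, the unclipped range_position, and the 0.5 fallback for collapsed bands are all assumptions.

```python
from statistics import fmean, pstdev


def bollinger_features(closes, period=20, num_std=2.0):
    """Compute the four band features from the most recent `period` closes."""
    if len(closes) < period:
        raise ValueError(f"need at least {period} closes")
    window = closes[-period:]
    mid = fmean(window)
    band = num_std * pstdev(window)  # population std dev: an assumption
    upper, lower = mid + band, mid - band
    span = upper - lower
    return {
        "bb_upper": upper,
        "bb_lower": lower,
        "bb_width": span / mid,
        # 0 = close at lower band, 1 = close at upper band.
        # Falls back to 0.5 when the bands collapse (assumed convention).
        "range_position": (closes[-1] - lower) / span if span > 0 else 0.5,
    }
```

Note that range_position can leave [0, 1] when the close breaks outside the bands; whether the real pipeline clips it is not stated here.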
Trend (8 features)
| Feature | Description |
|---|---|
| price_vs_ema200 | (Close - EMA200) / Close; trend direction and strength |
| trend_strength | ADX-inspired directional strength |
| macd_signal | MACD signal line (9-period EMA of MACD) |
| macd_hist | MACD histogram (MACD - signal) |
| ema_crossover | EMA50/EMA200 crossover state: +1, 0, -1 |
| ema_fast | EMA fast line (period from config) |
| ema_slope_5 | 5-bar slope of EMA20 |
| higher_tf_trend | H4 timeframe trend direction |
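As a rough illustration of three of the trend features, the sketch below computes price_vs_ema200, macd_signal, and macd_hist from a list of closes. The 12/26 MACD periods and seeding each EMA with the first value are assumptions (only the 9-period signal line is stated above), and the function names are hypothetical.

```python
def ema(values, period):
    """Exponential moving average with alpha = 2 / (period + 1).

    Seeded with the first value, which is an assumed convention.
    """
    alpha = 2.0 / (period + 1)
    out = [values[0]]
    for value in values[1:]:
        out.append(alpha * value + (1.0 - alpha) * out[-1])
    return out


def trend_features(closes, fast=12, slow=26, signal_period=9, trend_period=200):
    """Sketch of price_vs_ema200 and the two MACD features for the last bar."""
    ema_trend = ema(closes, trend_period)[-1]
    macd_line = [f - s for f, s in zip(ema(closes, fast), ema(closes, slow))]
    macd_signal = ema(macd_line, signal_period)[-1]
    return {
        "price_vs_ema200": (closes[-1] - ema_trend) / closes[-1],
        "macd_signal": macd_signal,
        "macd_hist": macd_line[-1] - macd_signal,
    }
```

In practice an EMA200 needs several hundred bars of warm-up before its value is meaningful; the sketch does not enforce that.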
Momentum (6 features)
| Feature | Description |
|---|---|
| rsi_14 | RSI(14), normalized to [0, 1] |
| rsi_divergence | Bullish/bearish RSI divergence vs price |
| momentum_5 | 5-bar log return |
| momentum_10 | 10-bar log return |
| rate_of_change | 14-bar % price change |
| close_vs_open | Current candle body direction and size |
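The log-return and normalized RSI features can be sketched like this. The RSI here uses plain sums of gains and losses over the window rather than Wilder smoothing, which is an assumption; the source does not say which variant the pipeline uses.

```python
import math


def log_return(closes, bars):
    """N-bar log return, as used for momentum_5 and momentum_10."""
    return math.log(closes[-1] / closes[-1 - bars])


def rsi_normalized(closes, period=14):
    """RSI scaled to [0, 1]; 0.5 is returned when price did not move."""
    if len(closes) < period + 1:
        raise ValueError(f"need at least {period + 1} closes")
    window = closes[-(period + 1):]
    deltas = [b - a for a, b in zip(window, window[1:])]
    gains = sum(d for d in deltas if d > 0)
    losses = sum(-d for d in deltas if d < 0)
    if gains + losses == 0:
        return 0.5
    # RSI / 100 = RS / (1 + RS) = gains / (gains + losses)
    return gains / (gains + losses)
```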
Volatility (5 features)
| Feature | Description |
|---|---|
| atr_14 | ATR(14), as an absolute pip distance |
| atr_ratio | ATR normalized by close price |
| volatility_20 | Rolling 20-bar std dev of log returns |
| high_low_ratio | High/Low ratio; candle range |
| gap_up_down | Open vs previous close gap |
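A minimal sketch of atr_14 and atr_ratio, assuming a simple average of the true range (the classic ATR uses Wilder smoothing, and the source does not say which applies). The result is in price units; any conversion to pips is omitted.

```python
from statistics import fmean


def true_range(high, low, prev_close):
    """Largest of the bar range and the two gaps from the previous close."""
    return max(high - low, abs(high - prev_close), abs(low - prev_close))


def atr_features(highs, lows, closes, period=14):
    """ATR over the last `period` bars plus ATR normalized by the last close."""
    if len(closes) < period + 1:
        raise ValueError(f"need at least {period + 1} bars")
    start = len(closes) - period
    ranges = [
        true_range(highs[i], lows[i], closes[i - 1])
        for i in range(start, len(closes))
    ]
    atr = fmean(ranges)  # simple mean: an assumption
    return {"atr_14": atr, "atr_ratio": atr / closes[-1]}
```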
Volume (4 features)
| Feature | Description |
|---|---|
| volume_ratio | Current volume / 20-bar rolling average |
| volume_trend | 5-bar volume slope (increasing vs decreasing) |
| tick_volume_norm | Tick volume normalized to [0, 1] |
| volume_price_trend | Volume × price direction (accumulation/distribution) |
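For illustration, volume_ratio and volume_trend might look like the sketch below. Including the current bar in the rolling average and using a least-squares slope are both assumptions.

```python
from statistics import fmean


def volume_ratio(volumes, period=20):
    """Current volume divided by the rolling average (current bar included)."""
    average = fmean(volumes[-period:])
    return volumes[-1] / average if average > 0 else 0.0


def volume_trend(volumes, bars=5):
    """Least-squares slope of the last `bars` volumes against bar index."""
    window = volumes[-bars:]
    n = len(window)
    x_mean = (n - 1) / 2
    y_mean = fmean(window)
    numerator = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(window))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator
```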
Price structure (6 features)
| Feature | Description |
|---|---|
| support_distance | Distance from nearest support level |
| resistance_distance | Distance from nearest resistance level |
| candlestick_pattern | Encoded candlestick pattern (doji, hammer, engulfing, etc.) |
| bar_body_pct | Body as % of total candle range |
| upper_shadow | Upper shadow normalized by range |
| lower_shadow | Lower shadow normalized by range |
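The three candle-anatomy features are simple ratios of the high-low range. The sketch below expresses them as fractions in [0, 1] and returns zeros for a zero-range bar; both choices are assumptions.

```python
def candle_shape(open_, high, low, close):
    """Body and shadow sizes as fractions of the candle's high-low range."""
    candle_range = high - low
    if candle_range <= 0:
        # Zero-range bar: nothing to normalize by (assumed convention).
        return {"bar_body_pct": 0.0, "upper_shadow": 0.0, "lower_shadow": 0.0}
    return {
        "bar_body_pct": abs(close - open_) / candle_range,
        "upper_shadow": (high - max(open_, close)) / candle_range,
        "lower_shadow": (min(open_, close) - low) / candle_range,
    }
```

For a well-formed candle the three values sum to 1, which makes a cheap sanity check on the input data.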
Session and news (7 features): PROTECTED
These features must never be dropped based on SHAP analysis alone. Historical testing showed removing them increased drawdown significantly. They carry regime and timing information not captured by price alone.
| Feature | Description |
|---|---|
| is_london_session | 1 if current UTC hour is in London session (07:00–16:00) |
| is_ny_session | 1 if current UTC hour is in NY session (13:00–22:00) |
| is_asian_session | 1 if current UTC hour is in Asian session (00:00–09:00) |
| is_news_near | 1 if high-impact news event within news_block_minutes |
| news_minutes_away | Minutes to next scheduled high-impact event |
| news_count_today | Number of high-impact events today |
| is_news_risk_window | 1 if within the combined pre/post news risk window |
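The session flags follow directly from the UTC hour ranges in the table. A minimal sketch, assuming half-open intervals (the end hour is excluded) and a timezone-aware timestamp; the names SESSIONS and session_flags are hypothetical.

```python
from datetime import datetime, timezone

# (start_hour, end_hour) in UTC, taken from the table above.
SESSIONS = {
    "is_london_session": (7, 16),
    "is_ny_session": (13, 22),
    "is_asian_session": (0, 9),
}


def session_flags(timestamp: datetime) -> dict:
    """Return 0/1 flags for each session; sessions may overlap."""
    if timestamp.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    hour = timestamp.astimezone(timezone.utc).hour
    # Half-open interval [start, end): an assumption about the boundaries.
    return {
        name: int(start <= hour < end) for name, (start, end) in SESSIONS.items()
    }
```

Because the sessions overlap, a bar at 14:00 UTC sets both the London and NY flags, and a bar at 08:00 UTC sets both the Asian and London flags.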
Account state (7 features)
These features are computed at runtime from live account data, not from candles. They allow the models to adapt to current account health.
| Feature | Description |
|---|---|
| drawdown_pct | Current equity drawdown from starting balance, % |
| equity_ratio | equity / starting_balance |
| win_rate_recent | Win rate over the last 20 trades |
| consecutive_losses | Current consecutive loss streak |
| trades_today | Number of trades taken today |
| profit_today_pct | Today's P&L as % of equity |
| hours_since_last_trade | Time gap since the last closed trade |
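A sketch of four of the account-state features, computed from equity and a list of closed-trade P&L values. Flooring the drawdown at zero, treating a break-even trade as ending a loss streak, and returning 0.0 when there is no trade history are all assumptions; account_features is a hypothetical name.

```python
def account_features(equity, starting_balance, trade_pnls, recent_trades=20):
    """Account-health features from live equity and closed-trade P&L values."""
    recent = trade_pnls[-recent_trades:]
    win_rate = sum(1 for pnl in recent if pnl > 0) / len(recent) if recent else 0.0

    # Count losses backwards from the most recent trade until a non-loss.
    streak = 0
    for pnl in reversed(trade_pnls):
        if pnl < 0:
            streak += 1
        else:
            break

    drawdown = (starting_balance - equity) / starting_balance * 100.0
    return {
        "drawdown_pct": max(0.0, drawdown),  # floored at zero: an assumption
        "equity_ratio": equity / starting_balance,
        "win_rate_recent": win_rate,
        "consecutive_losses": streak,
    }
```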
Time features (12 features)
| Feature | Description |
|---|---|
| hour | UTC hour (0–23) |
| day_of_week | Day encoded 0–6 (0 = Monday) |
| is_monday | Flag |
| is_friday | Flag |
| hour_sin, hour_cos | Cyclical encoding of UTC hour |
| dow_sin, dow_cos | Cyclical encoding of day of week |
| is_market_open | 1 if within active trading hours |
| minutes_to_close | Minutes until session close |
| week_of_month | Week number within the month |
| month_sin, month_cos | Cyclical month encoding |
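Cyclical encoding maps a periodic value onto the unit circle so that, for example, hour 23 and hour 0 end up close together. A minimal sketch, assuming periods of 24, 7, and 12 and a zero-based month; the function names are hypothetical.

```python
import math
from datetime import datetime


def cyclical(value, period):
    """Map a periodic value to (sin, cos) coordinates on the unit circle."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def time_features(timestamp: datetime) -> dict:
    """Calendar features for a UTC timestamp (Monday = 0, as in the table)."""
    hour_sin, hour_cos = cyclical(timestamp.hour, 24)
    dow_sin, dow_cos = cyclical(timestamp.weekday(), 7)
    month_sin, month_cos = cyclical(timestamp.month - 1, 12)  # zero-based: assumed
    return {
        "hour": timestamp.hour,
        "day_of_week": timestamp.weekday(),
        "is_monday": int(timestamp.weekday() == 0),
        "is_friday": int(timestamp.weekday() == 4),
        "hour_sin": hour_sin,
        "hour_cos": hour_cos,
        "dow_sin": dow_sin,
        "dow_cos": dow_cos,
        "month_sin": month_sin,
        "month_cos": month_cos,
    }
```

The sin/cos pair is what removes the artificial jump at the wrap-around point that the raw hour and day_of_week values still carry.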
Market regime (3 features): Phase 17.2
Derived entirely from existing OHLCV data; no external API dependencies.
| Feature | Description |
|---|---|
| volatility_regime | ATR(14) percentile rank over a rolling 500-bar window, encoded as a continuous value in [0, 1]. More responsive than atr_percentile (720-bar). Low = ranging/coiling, high = trending/breakout. |
| w1_ema_bias | (close - EMA10) / close on weekly (W1) bars, forward-filled to M15. Positive = price above weekly trend. |
| w1_rsi_norm | RSI(14) on W1 bars, normalised to [0, 1] (0.5 = neutral). Captures macro weekly momentum state. |
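The percentile rank behind volatility_regime can be sketched as the share of values in the rolling window that are at or below the current one. The "at or below" definition is an assumption (with it, the rank never reaches exactly 0), and percentile_rank is a hypothetical name.

```python
def percentile_rank(values, window=500):
    """Fraction of the last `window` values that are <= the most recent value."""
    if not values:
        raise ValueError("values must not be empty")
    recent = values[-window:]
    current = recent[-1]
    return sum(1 for value in recent if value <= current) / len(recent)
```

Applied to a series of ATR(14) values, a result near 1.0 means current volatility is high relative to the last 500 bars, and a result near 0 means it is unusually low.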
Labeling
ml/feature_engineering.py uses ATR-aware forward labeling to generate training targets.
For signal model labels:
The tp_atr multiplier is tuned during the Optuna search. Labels are generated using the same ATR-based SL/TP logic that the live bot uses, so the training distribution matches the live inference distribution.
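One way ATR-aware forward labeling can work is sketched below for a long entry. It is an illustration only: the sl_atr parameter, the default multipliers, the lookahead being whatever bars are passed in, checking the stop before the target within a bar, and labeling a timeout as 0 are all assumptions, not the behaviour of ml/feature_engineering.py.

```python
def forward_label(future_highs, future_lows, entry_price, atr,
                  tp_atr=2.0, sl_atr=1.0):
    """Label a hypothetical long entry: 1 if TP is hit before SL, else 0.

    TP and SL are placed at ATR multiples from the entry price.
    """
    take_profit = entry_price + tp_atr * atr
    stop_loss = entry_price - sl_atr * atr
    for high, low in zip(future_highs, future_lows):
        # Conservative ordering: if both levels fall inside one bar,
        # assume the stop was hit first.
        if low <= stop_loss:
            return 0
        if high >= take_profit:
            return 1
    return 0  # neither level reached within the lookahead window
```

Because the levels scale with ATR, the same label definition adapts to quiet and volatile periods, which is the property the paragraph above relies on.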
For position model labels, ml/position_labeling.py generates EXIT labels when the price subsequently reverses by more than a configurable threshold before reaching TP.
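The EXIT rule described above could be sketched as follows for a long position. The HOLD label name, a threshold expressed in price units, and checking the reversal before the target within a bar are assumptions; this is not the code in ml/position_labeling.py.

```python
def position_label(future_highs, future_lows, current_price, tp_price,
                   reversal_threshold):
    """Return "EXIT" if price reverses past the threshold before reaching TP."""
    for high, low in zip(future_highs, future_lows):
        # Adverse move for a long position, measured from the current price.
        if current_price - low > reversal_threshold:
            return "EXIT"
        if high >= tp_price:
            return "HOLD"
    return "HOLD"  # no reversal and no TP within the lookahead window
```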
Adding or modifying features
The Three-File Rule applies to all feature changes:
- ml/feature_engineering.py: add the computation
- ml_config.json → features array: add the name in the correct position
- Retrain all 4 models
Then run python ml/hf_hub.py --push to publish the new models and models/model_compat.json to Hugging Face Hub.
Do not add features speculatively. Every added feature increases the risk of overfitting and must be validated with an OOS backtest showing improvement.