All model binaries (*.pkl, *.onnx, *_xgb.json, *_lgb.txt) are stored on Cloudflare R2, not in Git. The repository contains only code and small metadata JSON files.
Bucket: novosky-models
Current tag: v20260415 (Phase 15)
User lane: pull approved revisions only.
Developer lane: push approved revisions after weekly optimization/retraining, then announce which tag each profile (1-5) should use.
Credentials are required for both push and pull. The bucket is private, with no anonymous access. Add CF_R2_ACCESS_KEY_ID and CF_R2_SECRET_ACCESS_KEY to .env before running any r2_hub.py command.
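
A pre-flight check along these lines can catch missing credentials before any r2_hub.py call. This is a minimal sketch, not the project's code: the helper name missing_r2_credentials is hypothetical, and it assumes the values from .env have already been loaded into the process environment.

```python
import os

# The two variable names come from the text above.
REQUIRED_KEYS = ("CF_R2_ACCESS_KEY_ID", "CF_R2_SECRET_ACCESS_KEY")


def missing_r2_credentials(env=None):
    """Return the required R2 variable names that are unset or empty.

    Hypothetical helper: `env` defaults to os.environ, but any mapping
    can be passed in, which keeps the check easy to test.
    """
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]
```

Calling missing_r2_credentials() with no arguments inspects the current environment; an empty list means both keys are set.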

Common commands

# Pull latest models (after a fresh clone or on a new server)
python ml/r2_hub.py --pull

# Pull a profile-specific approved revision (recommended for users)
python ml/r2_hub.py --pull --revision vYYYYMMDD-p3

# Push after a retrain (auto-creates tag vYYYYMMDD)
python ml/r2_hub.py --push

# Push directly from the training script
python train_ml_model.py --ensemble --position --push-to-hub

# List all version tags
python ml/r2_hub.py --list

# Roll back to a specific version
python ml/r2_hub.py --pull --revision v20260414

# Create a named tag without uploading new files
python ml/r2_hub.py --tag-only v15-production "Phase 15 validated"
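
The revision names in the commands above take two shapes: vYYYYMMDD for a dated push and vYYYYMMDD-pN for a profile-specific revision. Below is a minimal sketch of building and recognizing them, assuming N is a profile number from 1 to 5. make_tag and parse_tag are hypothetical helpers, not part of r2_hub.py, and free-form names such as v15-production are deliberately left unmatched.

```python
import re
from datetime import date

# vYYYYMMDD, optionally followed by -p1 .. -p5 (assumed profile range).
TAG_PATTERN = re.compile(r"^v(\d{8})(?:-p([1-5]))?$")


def make_tag(day, profile=None):
    """Build 'vYYYYMMDD' or 'vYYYYMMDD-pN' from a date and optional profile."""
    tag = "v" + day.strftime("%Y%m%d")
    return tag if profile is None else f"{tag}-p{profile}"


def parse_tag(tag):
    """Return (yyyymmdd, profile_or_None), or None if the tag is not dated."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        return None
    profile = int(match.group(2)) if match.group(2) else None
    return match.group(1), profile
```

For example, make_tag(date(2026, 4, 15), profile=3) yields the string to pass to --revision for profile 3 on that date.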

Storage layout

signal/
  ensemble_rf.pkl            # latest signal model (Random Forest)
  ensemble_rf.onnx
  ensemble_xgb.json          # XGBoost native format
  ensemble_lgb.txt           # LightGBM native format
  ensemble_scaler.pkl
  ensemble_btcusd-live_metadata.json

position/
  position_rf.pkl            # latest position model
  position_rf.onnx
  position_xgb.json
  position_lgb.txt
  position_scaler.pkl
  position_metadata.json

model_compat.json            # latest compatibility manifest

v20260415/                   # versioned snapshot (for rollback)
  signal/...
  position/...

tags/
  v20260415.json             # version manifest (metadata + accuracy)
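
The layout above implies a simple rule: the latest files sit at the top level under signal/ or position/, a snapshot repeats the same paths under its version prefix, and each version has a manifest under tags/. A sketch of that mapping follows; object_key and manifest_key are hypothetical names chosen for illustration, and how r2_hub.py actually builds its keys is not shown in this document.

```python
def object_key(group, filename, revision=None):
    """Return the bucket key for a model file.

    group    -- 'signal' or 'position', as in the layout above
    revision -- e.g. 'v20260415' for a snapshot, or None for the latest copy
    """
    if group not in ("signal", "position"):
        raise ValueError(f"unknown model group: {group!r}")
    key = f"{group}/{filename}"
    return key if revision is None else f"{revision}/{key}"


def manifest_key(revision):
    """Return the key of the version manifest for a revision."""
    return f"tags/{revision}.json"
```

Under these assumptions a rollback only changes the prefix: the same filename is read from v20260415/signal/ instead of signal/.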

What goes where

Location        What
Cloudflare R2   *.pkl, *.onnx, *_xgb.json, *_lgb.txt, model_compat.json
Git             All Python code, config.json, training datasets (CSV), metadata JSONs
Gitignored      .env, model binaries, models/_snapshot_*/, runtime log files
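
The R2 row of the table can be expressed as filename patterns. The sketch below uses the standard-library fnmatch module to decide whether a file belongs on R2. The function name is_r2_artifact is hypothetical, and the pattern list is copied from the table rather than from the project's .gitignore.

```python
from fnmatch import fnmatch

# Patterns taken from the "Cloudflare R2" row of the table.
R2_PATTERNS = ("*.pkl", "*.onnx", "*_xgb.json", "*_lgb.txt", "model_compat.json")


def is_r2_artifact(filename):
    """Return True if the bare filename matches one of the R2 patterns."""
    return any(fnmatch(filename, pattern) for pattern in R2_PATTERNS)
```

Note that plain JSON files such as config.json do not match, while *_xgb.json does, which mirrors the split between Git and R2 in the table.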

Auto-pull on startup

The bot pulls models from R2 automatically at startup if models/ensemble_rf.onnx is missing. This is handled by _ensure_models_present() in trading/bot.py. After a fresh clone there is no need to pull manually: just start the bot, provided the R2 credentials are already in .env.
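
The startup check described above amounts to "pull only if the sentinel file is absent". The following is a rough sketch of that logic, not the actual body of _ensure_models_present(): the function name ensure_models_present, the pull callback, and the boolean return value are all assumptions made for illustration.

```python
from pathlib import Path

# Sentinel file named in the text above.
DEFAULT_SENTINEL = Path("models/ensemble_rf.onnx")


def ensure_models_present(pull, sentinel=DEFAULT_SENTINEL):
    """Call pull() only when the sentinel model file does not exist.

    pull     -- zero-argument callable that downloads the models
                (hypothetical; stands in for whatever the bot invokes)
    sentinel -- path whose presence means the models are already local

    Returns True if a pull was triggered, False otherwise.
    """
    if Path(sentinel).exists():
        return False
    pull()
    return True
```

Passing the download step in as a callable keeps the check independent of R2, so it can be exercised without credentials.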

Rollback procedure

# 1. Pull a specific version
python ml/r2_hub.py --pull --revision v20260414

# 2. Verify compatibility
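python3 -c "
import json
with open('models/model_compat.json') as f:
    mc = json.load(f)
with open('ml_config.json') as f:
    ml = json.load(f)
# The pulled models must expect exactly as many features as ml_config.json lists
assert mc['feature_count'] == len(ml['features']), 'feature count mismatch'
print('OK:', mc['feature_count'], 'features')
"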

# 3. Dry-run to confirm
python trading.py --dry