# Multi-Horizon Training System Documentation

## Overview

The Multi-Horizon Training System addresses the core issues with your current training approach:

### Problems with Current System

1. **Immediate Training**: Training happens within a couple of seconds of a trade closing, often before any meaningful price movement
2. **No Profit Potential**: Small timeframes don't provide enough movement for profitable trades
3. **Reactive Training**: Models learn from very short-term outcomes rather than longer-term patterns
4. **Limited Prediction Horizons**: The system only predicts short timeframes that may not capture meaningful market moves

### New System Benefits

1. **Multi-Timeframe Predictions**: Predicts 1m, 5m, 15m, and 60m horizons every minute
2. **Deferred Training**: Stores predictions and trains models once outcomes are actually known
3. **Min/Max Price Prediction**: Focuses on predicting price ranges over longer periods for better profit potential
4. **Backtesting Capability**: Can validate system performance on historical data
5. **Scalable Storage**: Efficiently stores model inputs for future training

## System Components

### 1. MultiHorizonPredictionManager (`core/multi_horizon_prediction_manager.py`)

- Generates predictions for 1, 5, 15, and 60-minute horizons every minute
- Uses an ensemble approach combining CNN, RL, and technical analysis
- Stores prediction snapshots with full model inputs for future training

**Key Features:**

- Real-time prediction generation
- Confidence-based filtering
- Automatic validation when target times are reached

### 2. PredictionSnapshotStorage (`core/prediction_snapshot_storage.py`)

- Efficiently stores prediction snapshots to disk
- Keeps metadata in a SQLite database and compresses snapshot data
- Batch retrieval for training
- Automatic cleanup of old data

**Storage Structure:**

- Compressed pickle files for snapshot data
- SQLite database for fast metadata queries
- Organized by symbol and prediction horizon

### 3. MultiHorizonTrainer (`core/multi_horizon_trainer.py`)

- Trains models when prediction outcomes are known
- Handles both CNN and RL model training
- Uses stored snapshots to recreate training scenarios

**Training Process:**

- Validates pending predictions against actual price data
- Trains models using historical prediction accuracy
- Supports batch training for efficiency

### 4. MultiHorizonBacktester (`core/multi_horizon_backtester.py`)

- Backtests prediction accuracy on historical data
- Validates system performance before deployment
- Provides detailed accuracy and profitability analysis

**Backtesting Features:**

- Historical data simulation
- Accuracy metrics by prediction horizon
- Profitability analysis
- Performance reporting

### 5. Enhanced DataProvider (`core/data_provider.py`)

- Added `get_price_range_over_period()` method
- Supports min/max price queries over specific time ranges
- Better integration with backtesting framework

## Usage Examples

### Running the System

```bash
# Run demonstration
python run_multi_horizon_training.py --mode demo

# Run backtest on 7 days of data
python run_multi_horizon_training.py --mode backtest --symbol ETH/USDT --days 7

# Force training session
python run_multi_horizon_training.py --mode train --horizon 60

# Run system for 5 minutes
python run_multi_horizon_training.py --mode run --runtime 300
```

### Integration with Existing Code

```python
from core.multi_horizon_prediction_manager import MultiHorizonPredictionManager
from core.prediction_snapshot_storage import PredictionSnapshotStorage
from core.multi_horizon_trainer import MultiHorizonTrainer

# Initialize components
# (`your_orchestrator` is a placeholder for your existing orchestrator instance)
prediction_manager = MultiHorizonPredictionManager(orchestrator=your_orchestrator)
snapshot_storage = PredictionSnapshotStorage()
trainer = MultiHorizonTrainer(orchestrator=your_orchestrator, snapshot_storage=snapshot_storage)

# Start the system
prediction_manager.start()
trainer.start()

# Get system status
status = prediction_manager.get_prediction_stats()
training_stats = trainer.get_training_stats()
```

## Prediction Horizons

The system generates predictions for four horizons:

- **1 minute**: Very short-term predictions for scalping
- **5 minutes**: Short-term momentum predictions
- **15 minutes**: Medium-term trend predictions
- **60 minutes**: Long-term range predictions (focus area for meaningful moves)

Each prediction includes:

- Predicted minimum price
- Predicted maximum price
- Confidence score
- Model inputs for training
- Market state snapshot

## Training Strategy

### When Training Occurs

- Predictions are generated every minute
- Models are trained when prediction target times are reached (1 to 60 minutes later, depending on the horizon)
- Training uses the full context available at prediction time
- Rewards are based on prediction accuracy within the predicted price range

### Model Types Supported

1. **CNN Models**: Trained on feature sequences to predict price ranges
2. **RL Models**: Trained with reinforcement learning on prediction outcomes
3. **Ensemble**: Combines multiple model predictions for better accuracy

## Backtesting and Validation

### Backtesting Process

1. Load historical 1-minute data
2. Simulate predictions at regular intervals
3. Look up the actual outcome at each prediction's target time
4. Calculate accuracy and profitability metrics

### Key Metrics

- **Range Accuracy**: How well predicted min/max ranges match actual ranges
- **Confidence Correlation**: How confidence scores relate to prediction accuracy
- **Profitability**: Simulated trading performance based on predictions

## Performance Analysis

### Expected Improvements

1. **Better Profit Potential**: 60-minute predictions allow for meaningful price moves
2. **More Stable Training**: Training occurs on known outcomes, not immediate reactions
3. **Reduced Overfitting**: Training across multiple horizons is intended to reduce overfitting to short-term noise
4. **Backtesting Validation**: Historical testing helps verify system robustness

### Monitoring

The system provides monitoring for:

- Prediction generation rates
- Training session statistics
- Model accuracy by horizon
- Storage utilization
- System health metrics

## Configuration

### Key Parameters

```python
# Prediction horizons (minutes)
horizons = [1, 5, 15, 60]

# Prediction frequency
prediction_interval_seconds = 60

# Minimum confidence for storage
min_confidence_threshold = 0.3

# Training batch size
batch_size = 32

# Storage retention
max_age_days = 30
```

### File Locations

- Prediction snapshots: `data/prediction_snapshots/`
- Backtest results: `reports/`
- Cache data: `cache/`

## Integration with Existing Dashboard

The system is designed to integrate with your existing dashboard:

1. **Real-time Monitoring**: Dashboard can display prediction generation stats
2. **Training Progress**: Show training session results
3. **Backtest Reports**: Display historical performance analysis
4. **Model Comparison**: Compare old vs new training approaches

## Migration Path

### Gradual Adoption

1. **Run in Parallel**: Run the new system alongside existing training
2. **Compare Performance**: Use backtesting to compare approaches
3. **Gradual Transition**: Move models to the new training system incrementally
4. **Fallback Support**: Keep the old system as a backup during transition

### Data Compatibility

- New system stores snapshots independently
- Existing model weights can be used as starting points
- Training data format is compatible with existing models

## Troubleshooting

### Common Issues

1. **Low Prediction Accuracy**: Check confidence thresholds and feature quality
2. **Storage Issues**: Monitor disk space and clean up old snapshots
3. **Training Performance**: Adjust batch sizes and learning rates
4. **Memory Usage**: Use appropriate cache sizes for your hardware

### Logging

All components use structured logging with consistent log levels:

- `INFO`: Normal operations and results
- `WARNING`: Potential issues that don't stop operation
- `ERROR`: Serious problems requiring attention

## Future Enhancements

### Planned Features

1. **Advanced Ensemble Methods**: More sophisticated model combination
2. **Adaptive Horizons**: Dynamic horizon selection based on market conditions
3. **Cross-Symbol Training**: Train models using data from multiple symbols
4. **Real-time Validation**: Immediate feedback on prediction quality
5. **Performance Optimization**: GPU acceleration and distributed training

### Research Directions

1. **Optimal Horizon Selection**: Which horizons provide the best risk-adjusted returns
2. **Market Regime Detection**: Adjust predictions based on market conditions
3. **Feature Engineering**: Better input features for price range prediction
4. **Uncertainty Quantification**: Better confidence score calibration

## Conclusion

The Multi-Horizon Training System addresses your core concerns by:

1. **Extending Prediction Horizons**: From seconds to 60 minutes for meaningful profit potential
2. **Deferred Training**: Models learn from actual outcomes, not immediate reactions
3. **Comprehensive Storage**: Full model inputs preserved for future training
4. **Backtesting Validation**: Historical testing helps verify system effectiveness
5. **Scalable Architecture**: Efficient storage and training for long-term operation

This system should significantly improve your trading performance by focusing on longer-term, more profitable price movements while maintaining rigorous training and validation processes.
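
As a closing illustration, the deferred-training flow described under Training Strategy (hold each prediction until its target time, then score it against the realized price range) might look roughly like the sketch below. It is not the code in `core/multi_horizon_trainer.py`: `PendingPrediction`, `split_due`, and `range_overlap_score` are hypothetical names, and intersection-over-union is only one assumed way to turn "accuracy within the predicted price range" into a reward.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Tuple


@dataclass
class PendingPrediction:
    """Hypothetical snapshot record; field names are illustrative only."""
    symbol: str
    created_at: datetime
    horizon_minutes: int
    predicted_min: float
    predicted_max: float

    @property
    def target_time(self) -> datetime:
        # The outcome is only known once the horizon has elapsed.
        return self.created_at + timedelta(minutes=self.horizon_minutes)


def split_due(
    pending: List[PendingPrediction], now: datetime
) -> Tuple[List[PendingPrediction], List[PendingPrediction]]:
    """Separate predictions whose target time has passed from those still waiting."""
    due = [p for p in pending if p.target_time <= now]
    waiting = [p for p in pending if p.target_time > now]
    return due, waiting


def range_overlap_score(
    pred_min: float, pred_max: float, actual_min: float, actual_max: float
) -> float:
    """Intersection-over-union of predicted and actual price ranges, in [0, 1].

    Assumed reward shape: 1.0 for a perfect match, 0.0 for disjoint ranges.
    """
    intersection = min(pred_max, actual_max) - max(pred_min, actual_min)
    union = max(pred_max, actual_max) - min(pred_min, actual_min)
    if union <= 0:
        # Both ranges collapse to the same single price.
        return 1.0
    return max(0.0, intersection) / union


# Example: a 1-minute and a 60-minute prediction made at the same moment.
t0 = datetime(2024, 1, 1, 12, 0)
pending = [
    PendingPrediction("ETH/USDT", t0, 1, 100.0, 110.0),
    PendingPrediction("ETH/USDT", t0, 60, 100.0, 110.0),
]

# Five minutes later only the 1-minute prediction can be validated.
due, waiting = split_due(pending, t0 + timedelta(minutes=5))
for p in due:
    # In the real system the realized range would come from price data,
    # for example via DataProvider.get_price_range_over_period();
    # the values here are made up for illustration.
    reward = range_overlap_score(p.predicted_min, p.predicted_max, 105.0, 115.0)
```

Scoring by overlap rather than by a hit-or-miss check penalizes overly wide predicted ranges as well as missed ones, which seems in keeping with the goal of predicting usable min/max ranges; other reward shapes would fit the same flow.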