# CNN Testing & Backtest Guide

## 📊 **CNN Test Cases and Training Data Location**

### **1. Test Scripts**

#### **Quick CNN Test (`test_cnn_only.py`)**
- **Purpose**: Fast CNN validation with real market data
- **Location**: `/test_cnn_only.py`
- **Test Configuration**:
  - Symbols: `['ETH/USDT']`
  - Timeframes: `['1m', '5m', '1h']`
  - Samples: `500` (for quick testing)
  - Epochs: `2`
  - Batch size: `16`
- **Data Source**: **Real Binance API data only**
- **Output**: `test_models/quick_cnn.pt`

#### **Comprehensive Training Test (`test_training.py`)**
- **Purpose**: Full training pipeline validation
- **Location**: `/test_training.py`
- **Functions**:
  - `test_cnn_training()` - Complete CNN training test
  - `test_rl_training()` - RL training validation
- **Output**: `test_models/test_cnn.pt`

### **2. Test Model Storage**

#### **Directory**: `/test_models/`
- **quick_cnn.pt** (586KB) - Latest quick test model
- **quick_cnn_best.pt** (587KB) - Best performing quick test model
- **regular_save.pt** (384MB) - Full-size training model
- **robust_save.pt** (17KB) - Optimized lightweight model
- **backup models** - Automatic backups with `.backup` extension

### **3. Training Data Sources**

#### **Real Market Data (Primary)**
- **Exchange**: Binance API
- **Symbols**: ETH/USDT, BTC/USDT, etc.
- **Timeframes**: 1s, 1m, 5m, 15m, 1h, 4h, 1d
- **Features**: 48 features (raw OHLCV values plus technical indicators calculated from real OHLCV data)
- **Storage**: Cached in `/cache/` directory
- **Format**: JSON files with tick-by-tick and aggregated candle data

#### **Feature Matrix Structure**
```python
# Multi-timeframe feature matrix: (timeframes, window_size, features)
# feature_matrix.shape == (4, 20, 48)  # 4 timeframes, 20 steps, 48 features

# The 48 features:
features = [
    'ad_line', 'adx', 'adx_neg', 'adx_pos', 'atr',
    'bb_lower', 'bb_middle', 'bb_percent', 'bb_upper', 'bb_width',
    'close', 'ema_12', 'ema_26', 'ema_50', 'high',
    'keltner_lower', 'keltner_middle', 'keltner_upper', 'low', 'macd',
    'macd_histogram', 'macd_signal', 'mfi', 'momentum_composite', 'obv',
    'open', 'price_position', 'psar', 'roc', 'rsi_14',
    'rsi_21', 'rsi_7', 'sma_10', 'sma_20', 'sma_50',
    'stoch_d', 'stoch_k', 'trend_strength', 'true_range', 'ultimate_osc',
    'volatility_regime', 'volume', 'volume_sma_10', 'volume_sma_20', 'volume_sma_50',
    'vpt', 'vwap', 'williams_r'
]
```

### **4. Test Case Categories**

#### **Unit Tests**
- **Quick validation**: 500 samples, 2 epochs
- **Performance benchmarks**: Speed and accuracy metrics
- **Memory usage**: Resource consumption monitoring

#### **Integration Tests**
- **Full pipeline**: Data loading → Feature engineering → Training → Evaluation
- **Multi-symbol**: Testing across different cryptocurrency pairs
- **Multi-timeframe**: Validation across various time horizons

#### **Backtesting**
- **Historical performance**: Using past market data for validation
- **Walk-forward testing**: Progressive training on expanding datasets
- **Out-of-sample validation**: Testing on unseen data periods

### **5. VSCode Launch Configurations**

#### **Quick CNN Test**
```json
{
    "name": "Quick CNN Test (Real Data + TensorBoard)",
    "program": "test_cnn_only.py",
    "env": {"PYTHONUNBUFFERED": "1"}
}
```

#### **Realtime RL Training with Monitoring**
```json
{
    "name": "Realtime RL Training + TensorBoard + Web UI",
    "program": "train_realtime_with_tensorboard.py",
    "args": ["--episodes", "50", "--symbol", "ETH/USDT", "--web-port", "8051"]
}
```

### **6. Test Execution Commands**

#### **Quick CNN Test**
```bash
# Run quick CNN validation
python test_cnn_only.py

# Monitor training progress
tensorboard --logdir=runs

# Expected output:
# ✅ CNN Training completed!
# Best accuracy: 0.4600
# Total epochs: 2
# Training time: 0.61s
# TensorBoard logs: runs/cnn_training_1748043814
```

#### **Comprehensive Training Test**
```bash
# Run full training pipeline test
python test_training.py

# Monitor multiple training modes
tensorboard --logdir=runs
```

### **7. Test Data Validation**

#### **Real Market Data Policy**
- ✅ **No Synthetic Data**: All training uses authentic exchange data
- ✅ **Live API**: Direct connection to Binance for real-time prices
- ✅ **Multi-timeframe**: Consistent data across all time horizons
- ✅ **Technical Indicators**: Calculated from real OHLCV values

#### **Data Quality Checks**
- **Completeness**: Verifying all required timeframes have data
- **Consistency**: Cross-timeframe data alignment validation
- **Freshness**: Ensuring recent market data availability
- **Feature integrity**: Validating all 48 features

### **8. TensorBoard Monitoring**

#### **CNN Training Metrics**
- `Training/Loss` - Neural network training loss
- `Training/Accuracy` - Model prediction accuracy
- `Validation/Loss` - Validation dataset loss
- `Validation/Accuracy` - Out-of-sample accuracy
- `Best/ValidationAccuracy` - Best model performance
- `Data/InputShape` - Feature matrix dimensions
- `Model/TotalParams` - Neural network parameters

#### **Access URLs**
- **TensorBoard**: http://localhost:6006
- **Web Dashboard**: http://localhost:8051
- **Training Logs**: `/runs/` directory

### **9. Best Practices**

#### **Quick Testing**
1. **Start small**: Use `test_cnn_only.py` for fast validation
2. **Monitor metrics**: Keep TensorBoard open during training
3. **Check outputs**: Verify model files are created in `test_models/`
4. **Validate accuracy**: Ensure model performance meets expectations

#### **Production Training**
1. **Use full datasets**: Scale up sample sizes for production models
2. **Multi-symbol training**: Train on multiple cryptocurrency pairs
3. **Extended timeframes**: Include longer-term patterns
4. **Comprehensive validation**: Use walk-forward and out-of-sample testing

### **10. Troubleshooting**

#### **Common Issues**
- **Memory errors**: Reduce batch size or sample count
- **Data loading failures**: Check internet connection and API access
- **Feature mismatches**: Verify all timeframes have consistent data
- **TensorBoard not updating**: Restart TensorBoard after training starts

#### **Debug Commands**
```bash
# Check training status
python monitor_training.py

# Validate data availability
python -c "from core.data_provider import DataProvider; dp = DataProvider(['ETH/USDT']); print(dp.get_historical_data('ETH/USDT', '1m').shape)"

# Test feature generation
python -c "from core.data_provider import DataProvider; dp = DataProvider(['ETH/USDT']); print(dp.get_feature_matrix('ETH/USDT', ['1m', '5m', '1h'], 20).shape)"
```

---

**🔥 All CNN training and testing uses REAL market data from cryptocurrency exchanges. No synthetic or simulated data is used anywhere in the system.**
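
The walk-forward testing listed under Backtesting and Production Training can be illustrated with an expanding-window index split. This is a minimal sketch, not the project's implementation: `walk_forward_splits` and its parameters are hypothetical names, and it works only on sample indices, so it would still need to be wired to whatever real market data the pipeline has loaded.

```python
from typing import Iterator, Tuple


def walk_forward_splits(
    n_samples: int, initial_train: int, test_size: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train_indices, test_indices) pairs with an expanding training window.

    Each test block starts right after its training block, so evaluation
    always uses samples that come later in time than anything trained on.
    (Hypothetical helper for illustration; not part of the codebase.)
    """
    if initial_train <= 0 or test_size <= 0:
        raise ValueError("initial_train and test_size must be positive")
    train_end = initial_train
    while train_end + test_size <= n_samples:
        yield range(0, train_end), range(train_end, train_end + test_size)
        train_end += test_size  # grow the training window by one test block


if __name__ == "__main__":
    # 500 matches the quick-test sample count; the other sizes are arbitrary.
    for train, test in walk_forward_splits(500, initial_train=300, test_size=100):
        print(f"train [0, {train.stop})  test [{test.start}, {test.stop})")
```

Samples must be in chronological order for the split to be meaningful. Any trailing samples that do not fill a complete test block are left unused. scikit-learn's `TimeSeriesSplit` offers a comparable expanding-window split if that library is already a dependency.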