CNN Testing & Backtest Guide
📊 CNN Test Cases and Training Data Location
1. Test Scripts
Quick CNN Test (test_cnn_only.py)
- Purpose: Fast CNN validation with real market data
- Location: /test_cnn_only.py
- Test Configuration:
  - Symbols: ['ETH/USDT']
  - Timeframes: ['1m', '5m', '1h']
  - Samples: 500 (for quick testing)
  - Epochs: 2
  - Batch size: 16
- Data Source: Real Binance API data only
- Output: test_models/quick_cnn.pt
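The quick-test settings above imply very little work per run. The following is a minimal sketch of that arithmetic. `QuickCNNTestConfig` is a hypothetical container (the actual structure inside test_cnn_only.py is not shown in this guide), and it assumes all 500 samples are batched for training with the final partial batch kept.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class QuickCNNTestConfig:
    """Hypothetical container for the quick-test settings listed in this guide."""
    symbols: tuple = ("ETH/USDT",)
    timeframes: tuple = ("1m", "5m", "1h")
    samples: int = 500
    epochs: int = 2
    batch_size: int = 16

    def batches_per_epoch(self) -> int:
        # Assumption: every sample is used and the last partial batch is kept.
        return math.ceil(self.samples / self.batch_size)

    def total_steps(self) -> int:
        # Optimizer steps across all epochs under the same assumption.
        return self.batches_per_epoch() * self.epochs


config = QuickCNNTestConfig()
print(config.batches_per_epoch())  # 32
print(config.total_steps())        # 64
```

If the script holds out part of the 500 samples for validation, the real step count will be lower than this estimate.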
Comprehensive Training Test (test_training.py)
- Purpose: Full training pipeline validation
- Location: /test_training.py
- Functions:
  - test_cnn_training() - Complete CNN training test
  - test_rl_training() - RL training validation
- Output: test_models/test_cnn.pt
2. Test Model Storage
Directory: /test_models/
- quick_cnn.pt (586KB) - Latest quick test model
- quick_cnn_best.pt (587KB) - Best performing quick test model
- regular_save.pt (384MB) - Full-size training model
- robust_save.pt (17KB) - Optimized lightweight model
- backup models - Automatic backups with the .backup extension
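To confirm which models a test run produced, the directory can be inspected from Python. This is a small sketch that assumes the models live in a `test_models` directory relative to the working directory. The helper names `list_model_files` and `find_backups` are illustrative and not part of the project.

```python
from pathlib import Path


def list_model_files(model_dir):
    """Return (name, size_in_kb) pairs for *.pt files, largest first."""
    root = Path(model_dir)
    if not root.is_dir():
        return []
    files = [(p.name, p.stat().st_size / 1024) for p in root.glob("*.pt")]
    return sorted(files, key=lambda item: item[1], reverse=True)


def find_backups(model_dir):
    """Return sorted names of files ending in .backup."""
    root = Path(model_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.glob("*.backup"))


# Prints nothing for the models if the directory does not exist yet.
for name, size_kb in list_model_files("test_models"):
    print(f"{name:<24} {size_kb:>12.1f} KB")
print("backups:", find_backups("test_models"))
```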
3. Training Data Sources
Real Market Data (Primary)
- Exchange: Binance API
- Symbols: ETH/USDT, BTC/USDT, etc.
- Timeframes: 1s, 1m, 5m, 15m, 1h, 4h, 1d
- Features: 48 technical indicators calculated from real OHLCV data
- Storage: Cached in the /cache/ directory
- Format: JSON files with tick-by-tick and aggregated candle data
Feature Matrix Structure
# Multi-timeframe feature matrix: (timeframes, window_size, features)
# Example: feature_matrix.shape == (4, 20, 48)  -> 4 timeframes, 20 steps, 48 features
# The 48 features are:
features = [
    'ad_line', 'adx', 'adx_neg', 'adx_pos', 'atr',
    'bb_lower', 'bb_middle', 'bb_percent', 'bb_upper', 'bb_width',
    'close', 'ema_12', 'ema_26', 'ema_50', 'high',
    'keltner_lower', 'keltner_middle', 'keltner_upper', 'low',
    'macd', 'macd_histogram', 'macd_signal', 'mfi', 'momentum_composite',
    'obv', 'open', 'price_position', 'psar', 'roc',
    'rsi_14', 'rsi_21', 'rsi_7', 'sma_10', 'sma_20', 'sma_50',
    'stoch_d', 'stoch_k', 'trend_strength', 'true_range', 'ultimate_osc',
    'volatility_regime', 'volume', 'volume_sma_10', 'volume_sma_20',
    'volume_sma_50', 'vpt', 'vwap', 'williams_r'
]
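The first axis of the matrix follows the number of timeframes requested, so a run with ['1m', '5m', '1h'] should yield 3 rather than 4 on that axis. The sketch below shows one way to check a shape tuple against what was requested. `check_feature_matrix_shape` is a hypothetical helper, and the feature names in the demo are placeholders standing in for the 48 names listed above.

```python
def check_feature_matrix_shape(shape, timeframes, window_size, feature_names):
    """Return a list of problems; an empty list means the shape matches."""
    if len(shape) != 3:
        return [f"expected 3 dimensions, got {len(shape)}"]
    expected = (len(timeframes), window_size, len(feature_names))
    axes = ("timeframes", "window_size", "features")
    problems = []
    for axis, got, want in zip(axes, shape, expected):
        if got != want:
            problems.append(f"{axis}: expected {want}, got {got}")
    return problems


timeframes = ["1m", "5m", "1h"]
# Placeholder names; in practice pass the real 48-entry feature list.
feature_names = [f"feature_{i}" for i in range(48)]

ok = check_feature_matrix_shape((3, 20, 48), timeframes, 20, feature_names)
bad = check_feature_matrix_shape((4, 20, 48), timeframes, 20, feature_names)
print(ok)   # []
print(bad)  # ['timeframes: expected 3, got 4']
```

With a NumPy array or PyTorch tensor, pass `tuple(feature_matrix.shape)` as the first argument.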
4. Test Case Categories
Unit Tests
- Quick validation: 500 samples, 2 epochs
- Performance benchmarks: Speed and accuracy metrics
- Memory usage: Resource consumption monitoring
Integration Tests
- Full pipeline: Data loading → Feature engineering → Training → Evaluation
- Multi-symbol: Testing across different cryptocurrency pairs
- Multi-timeframe: Validation across various time horizons
Backtesting
- Historical performance: Using past market data for validation
- Walk-forward testing: Progressive training on expanding datasets
- Out-of-sample validation: Testing on unseen data periods
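Walk-forward testing with an expanding training set can be sketched as an index generator. This is an illustration of the general technique, not the project's backtester. The function name `walk_forward_splits` and the fold sizes in the demo are assumptions.

```python
def walk_forward_splits(n_samples, initial_train, test_size):
    """Yield (train_range, test_range) pairs with an expanding training window.

    Each test range starts where its training range ends, so the model is
    always evaluated on data that comes after everything it was trained on.
    """
    if initial_train <= 0 or test_size <= 0:
        raise ValueError("initial_train and test_size must be positive")
    train_end = initial_train
    while train_end + test_size <= n_samples:
        yield range(0, train_end), range(train_end, train_end + test_size)
        train_end += test_size  # the tested block joins the next training set


folds = list(walk_forward_splits(n_samples=500, initial_train=300, test_size=50))
for train, test in folds:
    print(f"train [0, {train.stop})  test [{test.start}, {test.stop})")
# train [0, 300)  test [300, 350)
# train [0, 350)  test [350, 400)
# train [0, 400)  test [400, 450)
# train [0, 450)  test [450, 500)
```

Samples must be in chronological order for the split to be meaningful. Shuffling before splitting would leak future data into training.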
5. VSCode Launch Configurations
Quick CNN Test
{
    "name": "Quick CNN Test (Real Data + TensorBoard)",
    "program": "test_cnn_only.py",
    "env": {"PYTHONUNBUFFERED": "1"}
}
Realtime RL Training with Monitoring
{
    "name": "Realtime RL Training + TensorBoard + Web UI",
    "program": "train_realtime_with_tensorboard.py",
    "args": ["--episodes", "50", "--symbol", "ETH/USDT", "--web-port", "8051"]
}
6. Test Execution Commands
Quick CNN Test
# Run quick CNN validation
python test_cnn_only.py
# Monitor training progress
tensorboard --logdir=runs
# Example output (accuracy, training time, and run ID vary between runs):
# ✅ CNN Training completed!
# Best accuracy: 0.4600
# Total epochs: 2
# Training time: 0.61s
# TensorBoard logs: runs/cnn_training_1748043814
Comprehensive Training Test
# Run full training pipeline test
python test_training.py
# Monitor multiple training modes
tensorboard --logdir=runs
7. Test Data Validation
Real Market Data Policy
- ✅ No Synthetic Data: All training uses authentic exchange data
- ✅ Live API: Direct connection to Binance for real-time prices
- ✅ Multi-timeframe: Consistent data across all time horizons
- ✅ Technical Indicators: Calculated from real OHLCV values
Data Quality Checks
- Completeness: Verifying all required timeframes have data
- Consistency: Cross-timeframe data alignment validation
- Freshness: Ensuring recent market data availability
- Feature integrity: Validating all 48 technical indicators
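The four checks above could look roughly like the following sketch. It assumes candles are dictionaries with a `timestamp` key in Unix seconds, which this guide does not specify. `check_candles`, `non_finite_features`, and the thresholds are hypothetical. The demo uses placeholder timestamps only to exercise the checks, not market data.

```python
import math

TIMEFRAME_SECONDS = {"1m": 60, "5m": 300, "1h": 3600}


def check_candles(candles_by_tf, required_tfs, now, min_candles=20, max_age_intervals=2):
    """Return {check_name: [problem, ...]} for the supplied candle data."""
    report = {"completeness": [], "consistency": [], "freshness": []}
    for tf in required_tfs:
        candles = candles_by_tf.get(tf, [])
        # Completeness: enough candles to fill one feature window.
        if len(candles) < min_candles:
            report["completeness"].append(f"{tf}: {len(candles)} candles, need {min_candles}")
        if not candles:
            continue
        step = TIMEFRAME_SECONDS[tf]
        stamps = [c["timestamp"] for c in candles]
        # Consistency: candles sit on the timeframe grid with no gaps or duplicates.
        if any(t % step for t in stamps):
            report["consistency"].append(f"{tf}: timestamps not aligned to {step}s")
        if any(b - a != step for a, b in zip(stamps, stamps[1:])):
            report["consistency"].append(f"{tf}: gap or duplicate between candles")
        # Freshness: the newest candle is at most a few intervals old.
        age = now - stamps[-1]
        if age > max_age_intervals * step:
            report["freshness"].append(f"{tf}: latest candle is {age}s old")
    return report


def non_finite_features(row):
    """Feature integrity: names of features whose value is NaN or infinite."""
    return [name for name, value in row.items() if not math.isfinite(value)]


def make_candles(tf, count, end):
    """Placeholder candles ending at `end`, spaced by the timeframe interval."""
    step = TIMEFRAME_SECONDS[tf]
    return [{"timestamp": end - step * (count - 1 - i)} for i in range(count)]


now = 3600 * 472000  # placeholder "current time", aligned to every timeframe
data = {"1m": make_candles("1m", 20, now), "5m": make_candles("5m", 20, now),
        "1h": make_candles("1h", 5, now)}
report = check_candles(data, ["1m", "5m", "1h"], now)
print(report["completeness"])  # ['1h: 5 candles, need 20']
print(non_finite_features({"rsi_14": 55.0, "atr": float("nan")}))  # ['atr']
```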
8. TensorBoard Monitoring
CNN Training Metrics
- Training/Loss - Neural network training loss
- Training/Accuracy - Model prediction accuracy
- Validation/Loss - Validation dataset loss
- Validation/Accuracy - Out-of-sample accuracy
- Best/ValidationAccuracy - Best model performance
- Data/InputShape - Feature matrix dimensions
- Model/TotalParams - Neural network parameters
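As an illustration of how the per-epoch tags above might be written, the sketch below logs them through any object with an `add_scalar(tag, value, step)` method, which is the call shape used by `torch.utils.tensorboard.SummaryWriter`. `MetricRecorder` is a hypothetical stand-in so the example runs without PyTorch, and the numbers are placeholders, not results. Data/InputShape and Model/TotalParams are left out because the guide does not say how they are logged.

```python
class MetricRecorder:
    """Hypothetical stand-in exposing add_scalar like a TensorBoard writer."""

    def __init__(self):
        self.scalars = []

    def add_scalar(self, tag, value, step):
        self.scalars.append((tag, value, step))


def log_epoch(writer, epoch, train_loss, train_acc, val_loss, val_acc, best_val_acc):
    """Write one epoch of metrics and return the updated best validation accuracy."""
    writer.add_scalar("Training/Loss", train_loss, epoch)
    writer.add_scalar("Training/Accuracy", train_acc, epoch)
    writer.add_scalar("Validation/Loss", val_loss, epoch)
    writer.add_scalar("Validation/Accuracy", val_acc, epoch)
    best_val_acc = max(best_val_acc, val_acc)
    writer.add_scalar("Best/ValidationAccuracy", best_val_acc, epoch)
    return best_val_acc


writer = MetricRecorder()
best = 0.0
# Placeholder metrics: (train_loss, train_acc, val_loss, val_acc) per epoch.
for epoch, metrics in enumerate([(1.10, 0.38, 1.08, 0.42), (1.02, 0.44, 1.05, 0.40)]):
    best = log_epoch(writer, epoch, *metrics, best_val_acc=best)

print(best)                 # 0.42
print(len(writer.scalars))  # 10
```

Swapping `MetricRecorder()` for a real `SummaryWriter` instance should leave `log_epoch` unchanged, since it only relies on `add_scalar`.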
Access URLs
- TensorBoard: http://localhost:6006
- Web Dashboard: http://localhost:8051
- Training Logs: /runs/ directory
9. Best Practices
Quick Testing
- Start small: Use test_cnn_only.py for fast validation
- Monitor metrics: Keep TensorBoard open during training
- Check outputs: Verify model files are created in test_models/
- Validate accuracy: Ensure model performance meets expectations
Production Training
- Use full datasets: Scale up sample sizes for production models
- Multi-symbol training: Train on multiple cryptocurrency pairs
- Extended timeframes: Include longer-term patterns
- Comprehensive validation: Use walk-forward and out-of-sample testing
10. Troubleshooting
Common Issues
- Memory errors: Reduce batch size or sample count
- Data loading failures: Check internet connection and API access
- Feature mismatches: Verify all timeframes have consistent data
- TensorBoard not updating: Restart TensorBoard after training starts
Debug Commands
# Check training status
python monitor_training.py
# Validate data availability
python -c "from core.data_provider import DataProvider; dp = DataProvider(['ETH/USDT']); print(dp.get_historical_data('ETH/USDT', '1m').shape)"
# Test feature generation
python -c "from core.data_provider import DataProvider; dp = DataProvider(['ETH/USDT']); print(dp.get_feature_matrix('ETH/USDT', ['1m', '5m', '1h'], 20).shape)"
🔥 All CNN training and testing uses REAL market data from cryptocurrency exchanges. No synthetic or simulated data is used anywhere in the system.