Dobromir Popov 2025-05-24 09:59:11 +03:00
parent 0fe8286787
commit 310f3c5bf9
4 changed files with 606 additions and 8 deletions

CNN_TESTING_GUIDE.md Normal file

@@ -0,0 +1,196 @@
# CNN Testing & Backtest Guide
## 📊 **CNN Test Cases and Training Data Location**
### **1. Test Scripts**
#### **Quick CNN Test (`test_cnn_only.py`)**
- **Purpose**: Fast CNN validation with real market data
- **Location**: `/test_cnn_only.py`
- **Test Configuration**:
  - Symbols: `['ETH/USDT']`
  - Timeframes: `['1m', '5m', '1h']`
  - Samples: `500` (for quick testing)
  - Epochs: `2`
  - Batch size: `16`
- **Data Source**: **Real Binance API data only**
- **Output**: `test_models/quick_cnn.pt`
#### **Comprehensive Training Test (`test_training.py`)**
- **Purpose**: Full training pipeline validation
- **Location**: `/test_training.py`
- **Functions**:
  - `test_cnn_training()` - Complete CNN training test
  - `test_rl_training()` - RL training validation
- **Output**: `test_models/test_cnn.pt`
### **2. Test Model Storage**
#### **Directory**: `/test_models/`
- **quick_cnn.pt** (586KB) - Latest quick test model
- **quick_cnn_best.pt** (587KB) - Best performing quick test model
- **regular_save.pt** (384MB) - Full-size training model
- **robust_save.pt** (17KB) - Optimized lightweight model
- **backup models** - Automatic backups with `.backup` extension
### **3. Training Data Sources**
#### **Real Market Data (Primary)**
- **Exchange**: Binance API
- **Symbols**: ETH/USDT, BTC/USDT, etc.
- **Timeframes**: 1s, 1m, 5m, 15m, 1h, 4h, 1d
- **Features**: 48 technical indicators calculated from real OHLCV data
- **Storage**: Cached in `/cache/` directory
- **Format**: JSON files with tick-by-tick and aggregated candle data
#### **Feature Matrix Structure**
```python
# Multi-timeframe feature matrix with shape (timeframes, window_size, features)
# feature_matrix.shape == (4, 20, 48)  # 4 timeframes, 20 steps, 48 features

# The 48 features, in alphabetical order:
features = [
    'ad_line', 'adx', 'adx_neg', 'adx_pos', 'atr',
    'bb_lower', 'bb_middle', 'bb_percent', 'bb_upper', 'bb_width',
    'close', 'ema_12', 'ema_26', 'ema_50', 'high',
    'keltner_lower', 'keltner_middle', 'keltner_upper', 'low',
    'macd', 'macd_histogram', 'macd_signal', 'mfi', 'momentum_composite',
    'obv', 'open', 'price_position', 'psar', 'roc',
    'rsi_14', 'rsi_21', 'rsi_7', 'sma_10', 'sma_20', 'sma_50',
    'stoch_d', 'stoch_k', 'trend_strength', 'true_range', 'ultimate_osc',
    'volatility_regime', 'volume', 'volume_sma_10', 'volume_sma_20',
    'volume_sma_50', 'vpt', 'vwap', 'williams_r'
]
```
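The stacking behind that shape can be sketched in plain Python. This is only an illustration: `build_feature_matrix` and `shape` are hypothetical helpers, not the project's `DataProvider` API, and the four timeframe names and zero-filled rows are placeholder data.

```python
from typing import Dict, List

FeatureRow = List[float]  # one candle's feature vector


def build_feature_matrix(
    rows_by_timeframe: Dict[str, List[FeatureRow]],
    timeframes: List[str],
    window_size: int,
) -> List[List[FeatureRow]]:
    """Stack the last `window_size` rows of each timeframe into one nested list."""
    matrix = []
    for tf in timeframes:
        rows = rows_by_timeframe[tf]
        if len(rows) < window_size:
            raise ValueError(f"{tf}: need {window_size} rows, got {len(rows)}")
        matrix.append(rows[-window_size:])
    return matrix


def shape(matrix: List[List[FeatureRow]]) -> tuple:
    """Return (timeframes, window_size, features) for a rectangular matrix."""
    return (len(matrix), len(matrix[0]), len(matrix[0][0]))


# Placeholder data: 30 candles per timeframe, 48 features per candle.
timeframes = ["1m", "5m", "1h", "1d"]
data = {tf: [[0.0] * 48 for _ in range(30)] for tf in timeframes}
fm = build_feature_matrix(data, timeframes, window_size=20)
print(shape(fm))  # (4, 20, 48)
```

Raising on a short timeframe, rather than padding, keeps a missing-data problem visible instead of hiding it inside the matrix.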
### **4. Test Case Categories**
#### **Unit Tests**
- **Quick validation**: 500 samples, 2 epochs
- **Performance benchmarks**: Speed and accuracy metrics
- **Memory usage**: Resource consumption monitoring
#### **Integration Tests**
- **Full pipeline**: Data loading → Feature engineering → Training → Evaluation
- **Multi-symbol**: Testing across different cryptocurrency pairs
- **Multi-timeframe**: Validation across various time horizons
#### **Backtesting**
- **Historical performance**: Using past market data for validation
- **Walk-forward testing**: Progressive training on expanding datasets
- **Out-of-sample validation**: Testing on unseen data periods
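Walk-forward testing with an expanding training window can be sketched as follows. The function name and the sample counts are assumptions chosen for illustration; the project's own split logic is not shown in this guide.

```python
from typing import Iterator, Tuple


def walk_forward_splits(
    n_samples: int, initial_train: int, test_size: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges; the training window grows each step."""
    train_end = initial_train
    while train_end + test_size <= n_samples:
        # Train on everything seen so far, test on the next unseen block.
        yield range(0, train_end), range(train_end, train_end + test_size)
        train_end += test_size


splits = list(walk_forward_splits(n_samples=500, initial_train=200, test_size=100))
for train, test in splits:
    print(f"train [0, {train.stop}) -> test [{test.start}, {test.stop})")
```

Each test block starts exactly where its training window ends, so no test sample is ever seen during training for that split.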
### **5. VSCode Launch Configurations**
#### **Quick CNN Test**
```json
{
  "name": "Quick CNN Test (Real Data + TensorBoard)",
  "program": "test_cnn_only.py",
  "env": {"PYTHONUNBUFFERED": "1"}
}
```
#### **Realtime RL Training with Monitoring**
```json
{
  "name": "Realtime RL Training + TensorBoard + Web UI",
  "program": "train_realtime_with_tensorboard.py",
  "args": ["--episodes", "50", "--symbol", "ETH/USDT", "--web-port", "8051"]
}
```
### **6. Test Execution Commands**
#### **Quick CNN Test**
```bash
# Run quick CNN validation
python test_cnn_only.py
# Monitor training progress
tensorboard --logdir=runs
# Expected output:
# ✅ CNN Training completed!
# Best accuracy: 0.4600
# Total epochs: 2
# Training time: 0.61s
# TensorBoard logs: runs/cnn_training_1748043814
```
#### **Comprehensive Training Test**
```bash
# Run full training pipeline test
python test_training.py
# Monitor multiple training modes
tensorboard --logdir=runs
```
### **7. Test Data Validation**
#### **Real Market Data Policy**
- ✅ **No Synthetic Data**: All training uses authentic exchange data
- ✅ **Live API**: Direct connection to Binance for real-time prices
- ✅ **Multi-timeframe**: Consistent data across all time horizons
- ✅ **Technical Indicators**: Calculated from real OHLCV values
#### **Data Quality Checks**
- **Completeness**: Verifying all required timeframes have data
- **Consistency**: Cross-timeframe data alignment validation
- **Freshness**: Ensuring recent market data availability
- **Feature integrity**: Validating all 48 technical indicators
### **8. TensorBoard Monitoring**
#### **CNN Training Metrics**
- `Training/Loss` - Neural network training loss
- `Training/Accuracy` - Model prediction accuracy
- `Validation/Loss` - Validation dataset loss
- `Validation/Accuracy` - Out-of-sample accuracy
- `Best/ValidationAccuracy` - Best model performance
- `Data/InputShape` - Feature matrix dimensions
- `Model/TotalParams` - Neural network parameters
#### **Access URLs**
- **TensorBoard**: http://localhost:6006
- **Web Dashboard**: http://localhost:8051
- **Training Logs**: `/runs/` directory
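The ports above are defaults. `start_monitoring.py` may pick another TensorBoard port and records its choice in `monitoring_ports.json`, so a small helper can resolve the actual URLs. The sketch below assumes that file's two keys (`tensorboard_port`, `web_dashboard_port`); the helper name is hypothetical.

```python
import json
from pathlib import Path

DEFAULT_PORTS = {"tensorboard_port": 6006, "web_dashboard_port": 8051}


def load_monitoring_urls(config_path: str = "monitoring_ports.json") -> dict:
    """Build monitoring URLs from the saved port config, falling back to defaults."""
    ports = dict(DEFAULT_PORTS)
    path = Path(config_path)
    if path.exists():
        try:
            saved = json.loads(path.read_text())
            if isinstance(saved, dict):
                ports.update(saved)
        except (json.JSONDecodeError, OSError):
            pass  # unreadable config: keep the defaults
    return {
        "tensorboard": f"http://localhost:{ports['tensorboard_port']}",
        "web_dashboard": f"http://localhost:{ports['web_dashboard_port']}",
    }


print(load_monitoring_urls())
```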
### **9. Best Practices**
#### **Quick Testing**
1. **Start small**: Use `test_cnn_only.py` for fast validation
2. **Monitor metrics**: Keep TensorBoard open during training
3. **Check outputs**: Verify model files are created in `test_models/`
4. **Validate accuracy**: Ensure model performance meets expectations
#### **Production Training**
1. **Use full datasets**: Scale up sample sizes for production models
2. **Multi-symbol training**: Train on multiple cryptocurrency pairs
3. **Extended timeframes**: Include longer-term patterns
4. **Comprehensive validation**: Use walk-forward and out-of-sample testing
### **10. Troubleshooting**
#### **Common Issues**
- **Memory errors**: Reduce batch size or sample count
- **Data loading failures**: Check internet connection and API access
- **Feature mismatches**: Verify all timeframes have consistent data
- **TensorBoard not updating**: Restart TensorBoard after training starts
#### **Debug Commands**
```bash
# Check training status
python monitor_training.py
# Validate data availability
python -c "from core.data_provider import DataProvider; dp = DataProvider(['ETH/USDT']); print(dp.get_historical_data('ETH/USDT', '1m').shape)"
# Test feature generation
python -c "from core.data_provider import DataProvider; dp = DataProvider(['ETH/USDT']); print(dp.get_feature_matrix('ETH/USDT', ['1m', '5m', '1h'], 20).shape)"
```
---
**🔥 All CNN training and testing uses REAL market data from cryptocurrency exchanges. No synthetic or simulated data is used anywhere in the system.**


@@ -140,3 +140,189 @@ python main_clean.py --mode rl
- **Memory usage**: Monitored and limited per model
- **Chart updates**: 2-second refresh for real-time display
- **Decision latency**: Optimized for scalping (< 100ms target)
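A simple way to check a decision path against the 100 ms target is to time it directly. This is a minimal sketch: `timed_decision` is a hypothetical wrapper, and the lambda stands in for whatever function produces a trading decision.

```python
import time
from typing import Callable, Tuple

LATENCY_BUDGET_MS = 100.0  # target stated above


def timed_decision(decide: Callable[[], str]) -> Tuple[str, float, bool]:
    """Run `decide` and report its result, elapsed milliseconds, and budget status."""
    start = time.perf_counter()
    action = decide()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return action, elapsed_ms, elapsed_ms < LATENCY_BUDGET_MS


# Stand-in decision function; a real one would run model inference.
action, elapsed_ms, within_budget = timed_decision(lambda: "HOLD")
print(f"{action}: {elapsed_ms:.3f} ms, within budget: {within_budget}")
```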
## 🚀 **VSCode Launch Configurations**
### **1. Core Trading Modes**
#### **Live Trading (Demo)**
```json
{
  "name": "Live Trading (Demo)",
  "program": "main.py",
  "args": ["--mode", "live", "--demo", "true", "--symbol", "ETH/USDT", "--timeframe", "1m"]
}
```
- **Purpose**: Safe demo trading with virtual funds
- **Environment**: Paper trading mode
- **Risk**: Zero (no real money)
#### **Live Trading (Real)**
```json
{
  "name": "Live Trading (Real)",
  "program": "main.py",
  "args": ["--mode", "live", "--demo", "false", "--symbol", "ETH/USDT", "--leverage", "50"]
}
```
- **Purpose**: Real trading with actual funds
- **Environment**: Live exchange API
- **Risk**: High (real money)
### **2. Training & Development Modes**
#### **Train Bot**
```json
{
  "name": "Train Bot",
  "program": "main.py",
  "args": ["--mode", "train", "--episodes", "100"]
}
```
- **Purpose**: Standard RL agent training
- **Duration**: 100 episodes
- **Output**: Trained model files
#### **Evaluate Bot**
```json
{
  "name": "Evaluate Bot",
  "program": "main.py",
  "args": ["--mode", "eval", "--episodes", "10"]
}
```
- **Purpose**: Model performance evaluation
- **Duration**: 10 test episodes
- **Output**: Performance metrics
### **3. Neural Network Training**
#### **NN Training Pipeline**
```json
{
  "name": "NN Training Pipeline",
  "module": "NN.realtime_main",
  "args": ["--mode", "train", "--model-type", "cnn", "--epochs", "10"]
}
```
- **Purpose**: Deep learning model training
- **Framework**: PyTorch
- **Monitoring**: Automatic TensorBoard integration
#### **Quick CNN Test (Real Data + TensorBoard)**
```json
{
  "name": "Quick CNN Test (Real Data + TensorBoard)",
  "program": "test_cnn_only.py"
}
```
- **Purpose**: Fast CNN validation with real market data
- **Duration**: 2 epochs, 500 samples
- **Output**: `test_models/quick_cnn.pt`
- **Monitoring**: TensorBoard metrics
### **4. 🔥 Realtime RL Training + Monitoring**
#### **Realtime RL Training + TensorBoard + Web UI**
```json
{
  "name": "Realtime RL Training + TensorBoard + Web UI",
  "program": "train_realtime_with_tensorboard.py",
  "args": ["--episodes", "50", "--symbol", "ETH/USDT", "--web-port", "8051"]
}
```
- **Purpose**: Advanced RL training with comprehensive monitoring
- **Features**:
  - Real-time TensorBoard metrics logging
  - Live web dashboard at http://localhost:8051
  - Episode rewards, balance tracking, win rates
  - Trading performance metrics
  - Agent learning progression
- **Data**: 100% real ETH/USDT market data from Binance
- **Monitoring**: Dual monitoring (TensorBoard + Web UI)
- **Duration**: 50 episodes with real-time feedback
### **5. Monitoring & Visualization**
#### **TensorBoard Monitor (All Runs)**
```json
{
  "name": "TensorBoard Monitor (All Runs)",
  "program": "run_tensorboard.py"
}
```
- **Purpose**: Monitor all training sessions
- **Features**: Auto-discovery of training logs
- **Access**: http://localhost:6006
#### **Realtime Charts with NN Inference**
```json
{
  "name": "Realtime Charts with NN Inference",
  "program": "realtime.py"
}
```
- **Purpose**: Live trading charts with ML predictions
- **Features**: Real-time price updates + model inference
- **Models**: CNN + RL integration
### **6. Advanced Training Modes**
#### **TRAIN Realtime Charts with NN Inference**
```json
{
  "name": "TRAIN Realtime Charts with NN Inference",
  "program": "train_rl_with_realtime.py",
  "args": ["--episodes", "100", "--max-position", "0.1"]
}
```
- **Purpose**: RL training with live chart integration
- **Features**: Visual training feedback
- **Position limit**: 10% portfolio allocation
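To make the 10% limit concrete, the arithmetic can be sketched as below. It assumes `--max-position` is a fraction of the account balance; the flag's exact semantics are not shown here, and the 1000.0 balance is only an example figure.

```python
def max_position_value(balance: float, max_position: float) -> float:
    """Largest position value allowed when `max_position` is a fraction of balance."""
    if not 0.0 < max_position <= 1.0:
        raise ValueError("max_position must be in (0, 1]")
    return balance * max_position


# Example: a 1000.0 balance with --max-position 0.1 caps a position at 100.0.
print(max_position_value(1000.0, 0.1))
```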
## 📊 **Monitoring URLs**
### **Development**
- **TensorBoard**: http://localhost:6006
- **Web Dashboard**: http://localhost:8051
- **Training Status**: `python monitor_training.py`
### **Production**
- **Live Trading Dashboard**: Integrated in trading interface
- **Performance Metrics**: Real-time P&L tracking
- **Risk Management**: Position size and drawdown monitoring
## 🎯 **Quick Start Recommendations**
### **For CNN Development**
1. **Start**: "Quick CNN Test (Real Data + TensorBoard)"
2. **Monitor**: Open TensorBoard at http://localhost:6006
3. **Validate**: Check `test_models/` for output files
### **For RL Development**
1. **Start**: "Realtime RL Training + TensorBoard + Web UI"
2. **Monitor**: TensorBoard (http://localhost:6006) + Web UI (http://localhost:8051)
3. **Track**: Episode rewards, balance progression, win rates
### **For Production Trading**
1. **Test**: "Live Trading (Demo)" first
2. **Validate**: Confirm strategy performance
3. **Deploy**: "Live Trading (Real)" with appropriate risk management
## ⚡ **Performance Features**
### **GPU Acceleration**
- Automatic CUDA detection and utilization
- Mixed precision training support
- Memory optimization for large datasets
### **Real-time Data**
- Direct Binance API integration
- Multi-timeframe data synchronization
- Live price feed with minimal latency
### **Professional Monitoring**
- Industry-standard TensorBoard integration
- Custom web dashboards for trading metrics
- Real-time performance tracking
## 🛡️ **Safety Features**
### **Pre-launch Tasks**
- **Kill Stale Processes**: Automatic cleanup before launch
- **Port Management**: Intelligent port allocation
- **Resource Monitoring**: Memory and GPU usage tracking
### **Real Market Data Policy**
- ✅ **No Synthetic Data**: All training uses authentic exchange data
- ✅ **Live API Integration**: Direct connection to cryptocurrency exchanges
- ✅ **Data Validation**: Quality checks for completeness and consistency
- ✅ **Multi-timeframe Sync**: Aligned data across all time horizons
---
- **Launch configuration**: Clean, modular mode selection
- **Professional monitoring**: TensorBoard + custom dashboards
- **Real market data**: Authentic cryptocurrency price data
- **Safety features**: Risk management and validation
- **GPU acceleration**: Optimized for high-performance training

start_monitoring.py Normal file

@@ -0,0 +1,160 @@
#!/usr/bin/env python3
"""
Helper script to start monitoring services for RL training
"""

import json
import os
import socket
import subprocess
import sys
import time
from pathlib import Path

import requests

# Available ports to try for TensorBoard
TENSORBOARD_PORTS = [6006, 6007, 6008, 6009, 6010, 6011, 6012]
# Default port for the web dashboard
DEFAULT_WEB_PORT = 8051


def check_port(port, service_name):
    """Check if a service answers HTTP on the specified port.

    Any HTTP server on the port counts; the response is not inspected.
    """
    try:
        requests.get(f"http://localhost:{port}", timeout=3)
        print(f"{service_name} is running on port {port}")
        return True
    except requests.exceptions.RequestException:
        return False


def is_port_in_use(port):
    """Check if a port is already in use by trying to bind to it"""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind(('localhost', port))
            return False
        except OSError:
            return True


def find_available_port(ports_list, service_name):
    """Find an available port from the list"""
    for port in ports_list:
        if not is_port_in_use(port):
            print(f"🔍 Found available port {port} for {service_name}")
            return port
        else:
            print(f"⚠️ Port {port} is already in use")
    return None


def save_port_config(tensorboard_port, web_dashboard_port=DEFAULT_WEB_PORT):
    """Save the port configuration to a file"""
    config = {
        "tensorboard_port": tensorboard_port,
        "web_dashboard_port": web_dashboard_port
    }
    with open("monitoring_ports.json", "w") as f:
        json.dump(config, f, indent=2)
    print("💾 Port configuration saved to monitoring_ports.json")


def start_tensorboard():
    """Start TensorBoard in background on an available port"""
    try:
        # First check if TensorBoard is already running on any of our ports
        for port in TENSORBOARD_PORTS:
            if check_port(port, "TensorBoard"):
                print(f"✅ TensorBoard already running on port {port}")
                save_port_config(port)
                return port

        # Find an available port
        port = find_available_port(TENSORBOARD_PORTS, "TensorBoard")
        if port is None:
            print(f"❌ No available ports found in range {TENSORBOARD_PORTS}")
            return None

        print(f"🚀 Starting TensorBoard on port {port}...")

        # Create runs directory if it doesn't exist
        Path("runs").mkdir(exist_ok=True)

        # Start TensorBoard
        cmd = [
            sys.executable, "-m", "tensorboard",
            "--logdir=runs", f"--port={port}", "--reload_interval=1"
        ]
        if os.name == 'nt':  # Windows
            subprocess.Popen(cmd, creationflags=subprocess.CREATE_NEW_CONSOLE)
        else:  # Linux/Mac
            subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)

        # Wait for TensorBoard to start (15 attempts x 2 seconds)
        print(f"⏳ Waiting for TensorBoard to start on port {port}...")
        for _ in range(15):
            time.sleep(2)
            if check_port(port, "TensorBoard"):
                save_port_config(port)
                return port

        print(f"⚠️ TensorBoard failed to start on port {port} within 30 seconds")
        return None
    except Exception as e:
        print(f"❌ Error starting TensorBoard: {e}")
        return None


def check_web_dashboard_port():
    """Return the default web dashboard port, or an alternative if it is taken"""
    port = DEFAULT_WEB_PORT
    if is_port_in_use(port):
        print(f"⚠️ Web dashboard port {port} is in use")
        # Try alternative ports
        for alt_port in [8052, 8053, 8054, 8055]:
            if not is_port_in_use(alt_port):
                print(f"🔍 Alternative port {alt_port} available for web dashboard")
                return alt_port
        print("❌ No alternative ports found for web dashboard")
        return port
    else:
        print(f"✅ Web dashboard port {port} is available")
        return port


def main():
    """Main function"""
    print("=" * 60)
    print("🎯 RL TRAINING MONITORING SETUP")
    print("=" * 60)

    # Check web dashboard port
    web_port = check_web_dashboard_port()

    # Start TensorBoard
    tensorboard_port = start_tensorboard()

    print("\n" + "=" * 60)
    print("📊 MONITORING STATUS")
    print("=" * 60)

    if tensorboard_port:
        print(f"✅ TensorBoard: http://localhost:{tensorboard_port}")
        # Update port config, recording the web port that was actually chosen
        save_port_config(tensorboard_port, web_port)
    else:
        print("❌ TensorBoard: Failed to start")
        print("   Manual start: python -m tensorboard --logdir=runs --port=6007")

    if web_port:
        print(f"✅ Web Dashboard: Ready on port {web_port}")

    print("\n🎯 Ready to start RL training!")
    # Pass --web-port whenever the default dashboard port was not usable
    if web_port != DEFAULT_WEB_PORT:
        print(f"Run: python train_realtime_with_tensorboard.py --episodes 10 --web-port {web_port}")
    else:
        print("Run: python train_realtime_with_tensorboard.py --episodes 10")

    print("\n📋 Available URLs:")
    if tensorboard_port:
        print(f"   📊 TensorBoard: http://localhost:{tensorboard_port}")
    if web_port:
        print(f"   🌐 Web Dashboard: http://localhost:{web_port} (starts with training)")


if __name__ == "__main__":
    main()


@@ -293,6 +293,10 @@ class RealtimeRLTrainer:
# Setup environment and agent
environment, agent = self.rl_trainer.setup_environment_and_agent()
# Assign to trainer instance
self.rl_trainer.environment = environment
self.rl_trainer.agent = agent
# Training loop
for episode in range(episodes):
self.current_episode = episode
@@ -362,6 +366,7 @@ async def main():
parser.add_argument('--episodes', type=int, default=50, help='Number of episodes')
parser.add_argument('--balance', type=float, default=1000.0, help='Initial balance')
parser.add_argument('--web-port', type=int, default=8051, help='Web dashboard port')
parser.add_argument('--keep-alive', type=int, default=300, help='Keep monitoring alive for N seconds after training')
args = parser.parse_args()
@@ -375,6 +380,41 @@ async def main():
logger.info(f"Initial Balance: ${args.balance:.2f}")
logger.info("=" * 60)
# Check if TensorBoard is accessible
try:
import requests
import time
import json
# Try to read port configuration
tensorboard_port = 6006 # default
try:
with open("monitoring_ports.json", "r") as f:
config = json.load(f)
tensorboard_port = config.get("tensorboard_port", 6006)
logger.info(f"📋 Using TensorBoard port {tensorboard_port} from config")
except FileNotFoundError:
logger.info("📋 No port config file found, using default ports")
logger.info("Checking TensorBoard accessibility...")
# Wait for TensorBoard to start
for i in range(10):
try:
response = requests.get(f"http://localhost:{tensorboard_port}", timeout=2)
logger.info(f"✅ TensorBoard is accessible at http://localhost:{tensorboard_port}")
break
except requests.exceptions.RequestException:
if i == 0:
logger.info("⏳ Waiting for TensorBoard to start...")
await asyncio.sleep(2)
else:
logger.warning(f"⚠️ TensorBoard may not be running on port {tensorboard_port}")
logger.warning(" Run: python start_monitoring.py")
except ImportError:
tensorboard_port = 6006
logger.warning("requests module not available for TensorBoard check")
try:
# Create trainer
trainer = RealtimeRLTrainer(
@@ -383,14 +423,23 @@ async def main():
)
# Start web dashboard
logger.info("🚀 Starting web dashboard...")
trainer.start_web_dashboard(port=args.web_port)
# Wait for dashboard to start
await asyncio.sleep(2)
await asyncio.sleep(3)
# Check if web dashboard is accessible
try:
import requests
response = requests.get(f"http://localhost:{args.web_port}", timeout=5)
logger.info(f"✅ Web Dashboard is accessible at http://localhost:{args.web_port}")
except Exception:
logger.warning(f"⚠️ Web Dashboard may not be fully ready at http://localhost:{args.web_port}")
logger.info("MONITORING READY!")
logger.info(f"TensorBoard: http://localhost:6006")
logger.info(f"Web Dashboard: http://localhost:{args.web_port}")
logger.info(f"📊 TensorBoard: http://localhost:{tensorboard_port}")
logger.info(f"🌐 Web Dashboard: http://localhost:{args.web_port}")
logger.info("=" * 60)
# Run training
@@ -404,10 +453,17 @@ async def main():
logger.info(f" Final PnL: ${results['final_pnl']:.2f}")
logger.info(f" Model Saved: {results['model_path']}")
# Keep running for monitoring
logger.info("Training complete. Press Ctrl+C to exit monitoring.")
while True:
await asyncio.sleep(1)
# Keep monitoring alive for specified time
logger.info(f"🔄 Keeping monitoring alive for {args.keep_alive} seconds...")
logger.info(f"📊 TensorBoard: http://localhost:{tensorboard_port}")
logger.info(f"🌐 Web Dashboard: http://localhost:{args.web_port}")
logger.info("Press Ctrl+C to exit monitoring.")
for remaining in range(args.keep_alive, 0, -10):
logger.info(f"⏰ Monitoring active - {remaining} seconds remaining")
await asyncio.sleep(min(10, remaining))
logger.info("✅ Monitoring session completed.")
except KeyboardInterrupt:
logger.info("Training stopped by user")