wip
parent 0fe8286787
commit 310f3c5bf9
196
CNN_TESTING_GUIDE.md
Normal file
@@ -0,0 +1,196 @@
# CNN Testing & Backtest Guide

## 📊 **CNN Test Cases and Training Data Location**

### **1. Test Scripts**

#### **Quick CNN Test (`test_cnn_only.py`)**
- **Purpose**: Fast CNN validation with real market data
- **Location**: `/test_cnn_only.py`
- **Test Configuration**:
  - Symbols: `['ETH/USDT']`
  - Timeframes: `['1m', '5m', '1h']`
  - Samples: `500` (for quick testing)
  - Epochs: `2`
  - Batch size: `16`
- **Data Source**: **Real Binance API data only**
- **Output**: `test_models/quick_cnn.pt`

#### **Comprehensive Training Test (`test_training.py`)**
- **Purpose**: Full training pipeline validation
- **Location**: `/test_training.py`
- **Functions**:
  - `test_cnn_training()` - Complete CNN training test
  - `test_rl_training()` - RL training validation
- **Output**: `test_models/test_cnn.pt`

### **2. Test Model Storage**

#### **Directory**: `/test_models/`
- **quick_cnn.pt** (586KB) - Latest quick test model
- **quick_cnn_best.pt** (587KB) - Best performing quick test model
- **regular_save.pt** (384MB) - Full-size training model
- **robust_save.pt** (17KB) - Optimized lightweight model
- **backup models** - Automatic backups with `.backup` extension

### **3. Training Data Sources**

#### **Real Market Data (Primary)**
- **Exchange**: Binance API
- **Symbols**: ETH/USDT, BTC/USDT, etc.
- **Timeframes**: 1s, 1m, 5m, 15m, 1h, 4h, 1d
- **Features**: 48 technical indicators calculated from real OHLCV data
- **Storage**: Cached in `/cache/` directory
- **Format**: JSON files with tick-by-tick and aggregated candle data

#### **Feature Matrix Structure**
```python
# Multi-timeframe feature matrix: (timeframes, window_size, features)
# Example: feature_matrix.shape == (4, 20, 48), i.e. 4 timeframes, 20 steps, 48 features

# The 48 features are:
features = [
    'ad_line', 'adx', 'adx_neg', 'adx_pos', 'atr',
    'bb_lower', 'bb_middle', 'bb_percent', 'bb_upper', 'bb_width',
    'close', 'ema_12', 'ema_26', 'ema_50', 'high',
    'keltner_lower', 'keltner_middle', 'keltner_upper', 'low',
    'macd', 'macd_histogram', 'macd_signal', 'mfi', 'momentum_composite',
    'obv', 'open', 'price_position', 'psar', 'roc',
    'rsi_14', 'rsi_21', 'rsi_7', 'sma_10', 'sma_20', 'sma_50',
    'stoch_d', 'stoch_k', 'trend_strength', 'true_range', 'ultimate_osc',
    'volatility_regime', 'volume', 'volume_sma_10', 'volume_sma_20',
    'volume_sma_50', 'vpt', 'vwap', 'williams_r'
]
```
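
A quick shape check can catch a mismatch between the requested timeframes and the matrix that comes back. The sketch below assumes the first axis simply follows the number of timeframes requested (so the three-timeframe quick test would produce `(3, 20, 48)`), and it uses nested lists as a stand-in for whatever array type `get_feature_matrix` actually returns. The helper names `expected_shape` and `nested_shape` are hypothetical and not part of the codebase.

```python
from typing import Sequence, Tuple


def expected_shape(timeframes: Sequence[str], window_size: int,
                   n_features: int) -> Tuple[int, int, int]:
    """Shape implied by the (timeframes, window_size, features) layout."""
    return (len(timeframes), window_size, n_features)


def nested_shape(matrix) -> Tuple[int, int, int]:
    """Shape of a rectangular 3-level nested list; raises if it is ragged."""
    n_tf = len(matrix)
    n_steps = len(matrix[0])
    n_feat = len(matrix[0][0])
    for tf in matrix:
        if len(tf) != n_steps or any(len(step) != n_feat for step in tf):
            raise ValueError("ragged feature matrix")
    return (n_tf, n_steps, n_feat)


# Stand-in data: zeros instead of real indicator values (illustration only).
timeframes = ["1m", "5m", "1h"]
matrix = [[[0.0] * 48 for _ in range(20)] for _ in timeframes]
assert nested_shape(matrix) == expected_shape(timeframes, 20, 48)
```

If the real provider returns a NumPy array or tensor, comparing its `.shape` against `expected_shape(...)` would presumably serve the same purpose.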

### **4. Test Case Categories**

#### **Unit Tests**
- **Quick validation**: 500 samples, 2 epochs
- **Performance benchmarks**: Speed and accuracy metrics
- **Memory usage**: Resource consumption monitoring

#### **Integration Tests**
- **Full pipeline**: Data loading → Feature engineering → Training → Evaluation
- **Multi-symbol**: Testing across different cryptocurrency pairs
- **Multi-timeframe**: Validation across various time horizons

#### **Backtesting**
- **Historical performance**: Using past market data for validation
- **Walk-forward testing**: Progressive training on expanding datasets
- **Out-of-sample validation**: Testing on unseen data periods
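
Walk-forward testing with an expanding training set can be sketched as an index generator. This is a minimal illustration under assumed parameters, not the project's backtester: `walk_forward_splits`, `initial_train`, and `test_size` are hypothetical names, and samples are assumed to be in chronological order.

```python
from typing import Iterator, Tuple


def walk_forward_splits(n_samples: int, initial_train: int,
                        test_size: int) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges with an expanding training window.

    Each test range starts where its training range ends, so the model is
    always evaluated on data that comes strictly after what it was trained on.
    """
    if initial_train <= 0 or test_size <= 0:
        raise ValueError("initial_train and test_size must be positive")
    train_end = initial_train
    while train_end + test_size <= n_samples:
        yield range(0, train_end), range(train_end, train_end + test_size)
        train_end += test_size  # the previous test fold joins the training set


for train_idx, test_idx in walk_forward_splits(500, 300, 100):
    print(len(train_idx), test_idx.start, test_idx.stop)
# prints:
# 300 300 400
# 400 400 500
```

Keeping the splits as index ranges means no candle data is copied, and any leftover samples that do not fill a complete test fold are simply skipped.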

### **5. VSCode Launch Configurations**

#### **Quick CNN Test**
```json
{
    "name": "Quick CNN Test (Real Data + TensorBoard)",
    "program": "test_cnn_only.py",
    "env": {"PYTHONUNBUFFERED": "1"}
}
```

#### **Realtime RL Training with Monitoring**
```json
{
    "name": "Realtime RL Training + TensorBoard + Web UI",
    "program": "train_realtime_with_tensorboard.py",
    "args": ["--episodes", "50", "--symbol", "ETH/USDT", "--web-port", "8051"]
}
```

### **6. Test Execution Commands**

#### **Quick CNN Test**
```bash
# Run quick CNN validation
python test_cnn_only.py

# Monitor training progress
tensorboard --logdir=runs

# Expected output:
# ✅ CNN Training completed!
# Best accuracy: 0.4600
# Total epochs: 2
# Training time: 0.61s
# TensorBoard logs: runs/cnn_training_1748043814
```

#### **Comprehensive Training Test**
```bash
# Run full training pipeline test
python test_training.py

# Monitor multiple training modes
tensorboard --logdir=runs
```

### **7. Test Data Validation**

#### **Real Market Data Policy**
- ✅ **No Synthetic Data**: All training uses authentic exchange data
- ✅ **Live API**: Direct connection to Binance for real-time prices
- ✅ **Multi-timeframe**: Consistent data across all time horizons
- ✅ **Technical Indicators**: Calculated from real OHLCV values

#### **Data Quality Checks**
- **Completeness**: Verifying all required timeframes have data
- **Consistency**: Cross-timeframe data alignment validation
- **Freshness**: Ensuring recent market data availability
- **Feature integrity**: Validating all 48 technical indicators
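
As a rough illustration of how three of these checks (completeness, freshness, feature integrity) might be expressed, the sketch below works on an assumed in-memory layout: a dict mapping each timeframe to a list of rows, each row a dict with a `timestamp` (seconds) and a `features` list. That layout, the function name `check_data_quality`, and its parameters are all hypothetical. Cross-timeframe alignment is left out because it depends on how the real provider aggregates candles.

```python
import math
from typing import Dict, List, Sequence


def check_data_quality(candles: Dict[str, List[dict]],
                       required_timeframes: Sequence[str],
                       n_features: int,
                       now_ts: float,
                       max_age_s: float) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems: List[str] = []
    for tf in required_timeframes:
        rows = candles.get(tf, [])
        # Completeness: every required timeframe must have at least one row.
        if not rows:
            problems.append(f"{tf}: no data")
            continue
        # Freshness: the newest row must not be older than max_age_s.
        if now_ts - rows[-1]["timestamp"] > max_age_s:
            problems.append(f"{tf}: stale data")
        # Feature integrity: expected feature count and no NaN values.
        for row in rows:
            feats = row["features"]
            if len(feats) != n_features or any(math.isnan(v) for v in feats):
                problems.append(f"{tf}: bad feature row at {row['timestamp']}")
                break  # one report per timeframe is enough
    return problems


sample = {"1m": [{"timestamp": 1000, "features": [0.1] * 48}]}
print(check_data_quality(sample, ["1m", "5m"], 48, now_ts=1030, max_age_s=60))
# prints: ['5m: no data']
```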

### **8. TensorBoard Monitoring**

#### **CNN Training Metrics**
- `Training/Loss` - Neural network training loss
- `Training/Accuracy` - Model prediction accuracy
- `Validation/Loss` - Validation dataset loss
- `Validation/Accuracy` - Out-of-sample accuracy
- `Best/ValidationAccuracy` - Best model performance
- `Data/InputShape` - Feature matrix dimensions
- `Model/TotalParams` - Neural network parameters
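
The per-epoch scalar tags above could be written with any object exposing `add_scalar(tag, value, step)`, which is the call shape of `torch.utils.tensorboard.SummaryWriter.add_scalar`. The following is a sketch only: `log_epoch` and `RecordingWriter` are hypothetical names, and `Data/InputShape` and `Model/TotalParams` are omitted because the guide does not say how they are logged.

```python
def log_epoch(writer, epoch: int, train_loss: float, train_acc: float,
              val_loss: float, val_acc: float, best_val_acc: float) -> float:
    """Log one epoch of CNN metrics and return the updated best validation accuracy."""
    writer.add_scalar("Training/Loss", train_loss, epoch)
    writer.add_scalar("Training/Accuracy", train_acc, epoch)
    writer.add_scalar("Validation/Loss", val_loss, epoch)
    writer.add_scalar("Validation/Accuracy", val_acc, epoch)
    best = max(best_val_acc, val_acc)
    writer.add_scalar("Best/ValidationAccuracy", best, epoch)
    return best


class RecordingWriter:
    """Stand-in for a TensorBoard writer that just records calls (for dry runs)."""

    def __init__(self):
        self.calls = []

    def add_scalar(self, tag, value, step):
        self.calls.append((tag, value, step))


writer = RecordingWriter()
best = log_epoch(writer, epoch=1, train_loss=0.9, train_acc=0.40,
                 val_loss=1.0, val_acc=0.46, best_val_acc=0.0)
print(best, len(writer.calls))  # prints: 0.46 5
```

Passing a real `SummaryWriter("runs/...")` instead of `RecordingWriter()` should produce the same tags in TensorBoard, assuming PyTorch is installed.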

#### **Access URLs**
- **TensorBoard**: http://localhost:6006
- **Web Dashboard**: http://localhost:8051
- **Training Logs**: `/runs/` directory

### **9. Best Practices**

#### **Quick Testing**
1. **Start small**: Use `test_cnn_only.py` for fast validation
2. **Monitor metrics**: Keep TensorBoard open during training
3. **Check outputs**: Verify model files are created in `test_models/`
4. **Validate accuracy**: Ensure model performance meets expectations

#### **Production Training**
1. **Use full datasets**: Scale up sample sizes for production models
2. **Multi-symbol training**: Train on multiple cryptocurrency pairs
3. **Extended timeframes**: Include longer-term patterns
4. **Comprehensive validation**: Use walk-forward and out-of-sample testing

### **10. Troubleshooting**

#### **Common Issues**
- **Memory errors**: Reduce batch size or sample count
- **Data loading failures**: Check internet connection and API access
- **Feature mismatches**: Verify all timeframes have consistent data
- **TensorBoard not updating**: Restart TensorBoard after training starts

#### **Debug Commands**
```bash
# Check training status
python monitor_training.py

# Validate data availability
python -c "from core.data_provider import DataProvider; dp = DataProvider(['ETH/USDT']); print(dp.get_historical_data('ETH/USDT', '1m').shape)"

# Test feature generation
python -c "from core.data_provider import DataProvider; dp = DataProvider(['ETH/USDT']); print(dp.get_feature_matrix('ETH/USDT', ['1m', '5m', '1h'], 20).shape)"
```

---

**🔥 All CNN training and testing use REAL market data from cryptocurrency exchanges. No synthetic or simulated data is used anywhere in the system.**
@@ -139,4 +139,190 @@ python main_clean.py --mode rl
- **Feature processing**: 26 indicators in < 1 second
- **Memory usage**: Monitored and limited per model
- **Chart updates**: 2-second refresh for real-time display
- **Decision latency**: Optimized for scalping (< 100ms target)

## 🚀 **VSCode Launch Configurations**

### **1. Core Trading Modes**

#### **Live Trading (Demo)**
```json
{
    "name": "Live Trading (Demo)",
    "program": "main.py",
    "args": ["--mode", "live", "--demo", "true", "--symbol", "ETH/USDT", "--timeframe", "1m"]
}
```
- **Purpose**: Safe demo trading with virtual funds
- **Environment**: Paper trading mode
- **Risk**: Zero (no real money)

#### **Live Trading (Real)**
```json
{
    "name": "Live Trading (Real)",
    "program": "main.py",
    "args": ["--mode", "live", "--demo", "false", "--symbol", "ETH/USDT", "--leverage", "50"]
}
```
- **Purpose**: Real trading with actual funds
- **Environment**: Live exchange API
- **Risk**: High (real money)

### **2. Training & Development Modes**

#### **Train Bot**
```json
{
    "name": "Train Bot",
    "program": "main.py",
    "args": ["--mode", "train", "--episodes", "100"]
}
```
- **Purpose**: Standard RL agent training
- **Duration**: 100 episodes
- **Output**: Trained model files

#### **Evaluate Bot**
```json
{
    "name": "Evaluate Bot",
    "program": "main.py",
    "args": ["--mode", "eval", "--episodes", "10"]
}
```
- **Purpose**: Model performance evaluation
- **Duration**: 10 test episodes
- **Output**: Performance metrics

### **3. Neural Network Training**

#### **NN Training Pipeline**
```json
{
    "name": "NN Training Pipeline",
    "module": "NN.realtime_main",
    "args": ["--mode", "train", "--model-type", "cnn", "--epochs", "10"]
}
```
- **Purpose**: Deep learning model training
- **Framework**: PyTorch
- **Monitoring**: Automatic TensorBoard integration

#### **Quick CNN Test (Real Data + TensorBoard)**
```json
{
    "name": "Quick CNN Test (Real Data + TensorBoard)",
    "program": "test_cnn_only.py"
}
```
- **Purpose**: Fast CNN validation with real market data
- **Duration**: 2 epochs, 500 samples
- **Output**: `test_models/quick_cnn.pt`
- **Monitoring**: TensorBoard metrics

### **4. 🔥 Realtime RL Training + Monitoring**

#### **Realtime RL Training + TensorBoard + Web UI**
```json
{
    "name": "Realtime RL Training + TensorBoard + Web UI",
    "program": "train_realtime_with_tensorboard.py",
    "args": ["--episodes", "50", "--symbol", "ETH/USDT", "--web-port", "8051"]
}
```
- **Purpose**: Advanced RL training with comprehensive monitoring
- **Features**:
  - Real-time TensorBoard metrics logging
  - Live web dashboard at http://localhost:8051
  - Episode rewards, balance tracking, win rates
  - Trading performance metrics
  - Agent learning progression
- **Data**: 100% real ETH/USDT market data from Binance
- **Monitoring**: Dual monitoring (TensorBoard + Web UI)
- **Duration**: 50 episodes with real-time feedback

### **5. Monitoring & Visualization**

#### **TensorBoard Monitor (All Runs)**
```json
{
    "name": "TensorBoard Monitor (All Runs)",
    "program": "run_tensorboard.py"
}
```
- **Purpose**: Monitor all training sessions
- **Features**: Auto-discovery of training logs
- **Access**: http://localhost:6006

#### **Realtime Charts with NN Inference**
```json
{
    "name": "Realtime Charts with NN Inference",
    "program": "realtime.py"
}
```
- **Purpose**: Live trading charts with ML predictions
- **Features**: Real-time price updates + model inference
- **Models**: CNN + RL integration

### **6. Advanced Training Modes**

#### **TRAIN Realtime Charts with NN Inference**
```json
{
    "name": "TRAIN Realtime Charts with NN Inference",
    "program": "train_rl_with_realtime.py",
    "args": ["--episodes", "100", "--max-position", "0.1"]
}
```
- **Purpose**: RL training with live chart integration
- **Features**: Visual training feedback
- **Position limit**: 10% portfolio allocation

## 📊 **Monitoring URLs**

### **Development**
- **TensorBoard**: http://localhost:6006
- **Web Dashboard**: http://localhost:8051
- **Training Status**: `python monitor_training.py`

### **Production**
- **Live Trading Dashboard**: Integrated in trading interface
- **Performance Metrics**: Real-time P&L tracking
- **Risk Management**: Position size and drawdown monitoring

## 🎯 **Quick Start Recommendations**

### **For CNN Development**
1. **Start**: "Quick CNN Test (Real Data + TensorBoard)"
2. **Monitor**: Open TensorBoard at http://localhost:6006
3. **Validate**: Check `test_models/` for output files

### **For RL Development**
1. **Start**: "Realtime RL Training + TensorBoard + Web UI"
2. **Monitor**: TensorBoard (http://localhost:6006) + Web UI (http://localhost:8051)
3. **Track**: Episode rewards, balance progression, win rates

### **For Production Trading**
1. **Test**: "Live Trading (Demo)" first
2. **Validate**: Confirm strategy performance
3. **Deploy**: "Live Trading (Real)" with appropriate risk management

## ⚡ **Performance Features**

### **GPU Acceleration**
- Automatic CUDA detection and utilization
- Mixed precision training support
- Memory optimization for large datasets

### **Real-time Data**
- Direct Binance API integration
- Multi-timeframe data synchronization
- Live price feed with minimal latency

### **Professional Monitoring**
- Industry-standard TensorBoard integration
- Custom web dashboards for trading metrics
- Real-time performance tracking

## 🛡️ **Safety Features**

### **Pre-launch Tasks**
- **Kill Stale Processes**: Automatic cleanup before launch
- **Port Management**: Intelligent port allocation
- **Resource Monitoring**: Memory and GPU usage tracking

### **Real Market Data Policy**
- ✅ **No Synthetic Data**: All training uses authentic exchange data
- ✅ **Live API Integration**: Direct connection to cryptocurrency exchanges
- ✅ **Data Validation**: Quality checks for completeness and consistency
- ✅ **Multi-timeframe Sync**: Aligned data across all time horizons

---

✅ **Launch configuration** - Clean, modular mode selection
✅ **Professional monitoring** - TensorBoard + custom dashboards
✅ **Real market data** - Authentic cryptocurrency price data
✅ **Safety features** - Risk management and validation
✅ **GPU acceleration** - Optimized for high-performance training
160
start_monitoring.py
Normal file
@@ -0,0 +1,160 @@
#!/usr/bin/env python3
"""
Helper script to start monitoring services for RL training
"""

import subprocess
import sys
import time
import requests
import os
import json
from pathlib import Path

# Available ports to try for TensorBoard
TENSORBOARD_PORTS = [6006, 6007, 6008, 6009, 6010, 6011, 6012]

def check_port(port, service_name):
    """Check if a service is running on the specified port"""
    try:
        response = requests.get(f"http://localhost:{port}", timeout=3)
        print(f"✅ {service_name} is running on port {port}")
        return True
    except requests.exceptions.RequestException:
        return False

def is_port_in_use(port):
    """Check if a port is already in use"""
    import socket
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind(('localhost', port))
            return False
        except OSError:
            return True

def find_available_port(ports_list, service_name):
    """Find an available port from the list"""
    for port in ports_list:
        if not is_port_in_use(port):
            print(f"🔍 Found available port {port} for {service_name}")
            return port
        else:
            print(f"⚠️ Port {port} is already in use")
    return None

def save_port_config(tensorboard_port):
    """Save the port configuration to a file"""
    config = {
        "tensorboard_port": tensorboard_port,
        "web_dashboard_port": 8051
    }
    with open("monitoring_ports.json", "w") as f:
        json.dump(config, f, indent=2)
    print(f"💾 Port configuration saved to monitoring_ports.json")

def start_tensorboard():
    """Start TensorBoard in background on an available port"""
    try:
        # First check if TensorBoard is already running on any of our ports
        for port in TENSORBOARD_PORTS:
            if check_port(port, "TensorBoard"):
                print(f"✅ TensorBoard already running on port {port}")
                save_port_config(port)
                return port

        # Find an available port
        port = find_available_port(TENSORBOARD_PORTS, "TensorBoard")
        if port is None:
            print(f"❌ No available ports found in range {TENSORBOARD_PORTS}")
            return None

        print(f"🚀 Starting TensorBoard on port {port}...")

        # Create runs directory if it doesn't exist
        Path("runs").mkdir(exist_ok=True)

        # Start TensorBoard
        if os.name == 'nt':  # Windows
            subprocess.Popen([
                sys.executable, "-m", "tensorboard",
                "--logdir=runs", f"--port={port}", "--reload_interval=1"
            ], creationflags=subprocess.CREATE_NEW_CONSOLE)
        else:  # Linux/Mac
            subprocess.Popen([
                sys.executable, "-m", "tensorboard",
                "--logdir=runs", f"--port={port}", "--reload_interval=1"
            ], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)

        # Wait for TensorBoard to start
        print(f"⏳ Waiting for TensorBoard to start on port {port}...")
        for i in range(15):
            time.sleep(2)
            if check_port(port, "TensorBoard"):
                save_port_config(port)
                return port

        print(f"⚠️ TensorBoard failed to start on port {port} within 30 seconds")
        return None

    except Exception as e:
        print(f"❌ Error starting TensorBoard: {e}")
        return None

def check_web_dashboard_port():
    """Check if web dashboard port is available"""
    port = 8051
    if is_port_in_use(port):
        print(f"⚠️ Web dashboard port {port} is in use")
        # Try alternative ports
        for alt_port in [8052, 8053, 8054, 8055]:
            if not is_port_in_use(alt_port):
                print(f"🔍 Alternative port {alt_port} available for web dashboard")
                return alt_port
        print("❌ No alternative ports found for web dashboard")
        return port
    else:
        print(f"✅ Web dashboard port {port} is available")
        return port

def main():
    """Main function"""
    print("=" * 60)
    print("🎯 RL TRAINING MONITORING SETUP")
    print("=" * 60)

    # Check web dashboard port
    web_port = check_web_dashboard_port()

    # Start TensorBoard
    tensorboard_port = start_tensorboard()

    print("\n" + "=" * 60)
    print("📊 MONITORING STATUS")
    print("=" * 60)

    if tensorboard_port:
        print(f"✅ TensorBoard: http://localhost:{tensorboard_port}")
        # Update port config
        save_port_config(tensorboard_port)
    else:
        print("❌ TensorBoard: Failed to start")
        print(" Manual start: python -m tensorboard --logdir=runs --port=6007")

    if web_port:
        print(f"✅ Web Dashboard: Ready on port {web_port}")

    print(f"\n🎯 Ready to start RL training!")
    # Suggest --web-port whenever the default 8051 is not the chosen port,
    # regardless of whether TensorBoard started.
    if web_port != 8051:
        print(f"Run: python train_realtime_with_tensorboard.py --episodes 10 --web-port {web_port}")
    else:
        print("Run: python train_realtime_with_tensorboard.py --episodes 10")

    print(f"\n📋 Available URLs:")
    if tensorboard_port:
        print(f" 📊 TensorBoard: http://localhost:{tensorboard_port}")
    if web_port:
        print(f" 🌐 Web Dashboard: http://localhost:{web_port} (starts with training)")

if __name__ == "__main__":
    main()
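
Review note: `save_port_config` writes both `tensorboard_port` and `web_dashboard_port` to `monitoring_ports.json`, while the training script's hunk further down reads only `tensorboard_port` and handles only `FileNotFoundError`. A reader that returns both values and also tolerates a corrupt file might look like the sketch below. The function name `load_monitoring_ports` and its defaults are assumptions for illustration, not code from this commit.

```python
import json
from pathlib import Path
from typing import Tuple


def load_monitoring_ports(path: str = "monitoring_ports.json",
                          default_tb: int = 6006,
                          default_web: int = 8051) -> Tuple[int, int]:
    """Return (tensorboard_port, web_dashboard_port), falling back to defaults.

    Defaults are used when the file is missing or is not valid JSON, and for
    any key that is absent from the file.
    """
    try:
        config = json.loads(Path(path).read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return default_tb, default_web
    return (int(config.get("tensorboard_port", default_tb)),
            int(config.get("web_dashboard_port", default_web)))


# With no config file in the working directory this prints the defaults.
print(load_monitoring_ports("does_not_exist.json"))  # prints: (6006, 8051)
```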
@@ -293,6 +293,10 @@ class RealtimeRLTrainer:
         # Setup environment and agent
         environment, agent = self.rl_trainer.setup_environment_and_agent()

+        # Assign to trainer instance
+        self.rl_trainer.environment = environment
+        self.rl_trainer.agent = agent
+
         # Training loop
         for episode in range(episodes):
             self.current_episode = episode
@@ -362,6 +366,7 @@ async def main():
     parser.add_argument('--episodes', type=int, default=50, help='Number of episodes')
     parser.add_argument('--balance', type=float, default=1000.0, help='Initial balance')
     parser.add_argument('--web-port', type=int, default=8051, help='Web dashboard port')
+    parser.add_argument('--keep-alive', type=int, default=300, help='Keep monitoring alive for N seconds after training')

     args = parser.parse_args()

@@ -375,6 +380,41 @@ async def main():
     logger.info(f"Initial Balance: ${args.balance:.2f}")
     logger.info("=" * 60)

+    # Check if TensorBoard is accessible
+    try:
+        import requests
+        import time
+        import json
+
+        # Try to read port configuration
+        tensorboard_port = 6006  # default
+        try:
+            with open("monitoring_ports.json", "r") as f:
+                config = json.load(f)
+                tensorboard_port = config.get("tensorboard_port", 6006)
+                logger.info(f"📋 Using TensorBoard port {tensorboard_port} from config")
+        except FileNotFoundError:
+            logger.info("📋 No port config file found, using default ports")
+
+        logger.info("Checking TensorBoard accessibility...")
+
+        # Wait for TensorBoard to start
+        for i in range(10):
+            try:
+                response = requests.get(f"http://localhost:{tensorboard_port}", timeout=2)
+                logger.info(f"✅ TensorBoard is accessible at http://localhost:{tensorboard_port}")
+                break
+            except requests.exceptions.RequestException:
+                if i == 0:
+                    logger.info("⏳ Waiting for TensorBoard to start...")
+                await asyncio.sleep(2)
+        else:
+            logger.warning(f"⚠️ TensorBoard may not be running on port {tensorboard_port}")
+            logger.warning(" Run: python start_monitoring.py")
+    except ImportError:
+        tensorboard_port = 6006
+        logger.warning("requests module not available for TensorBoard check")
+
     try:
         # Create trainer
         trainer = RealtimeRLTrainer(
@@ -383,14 +423,23 @@ async def main():
         )

         # Start web dashboard
+        logger.info("🚀 Starting web dashboard...")
         trainer.start_web_dashboard(port=args.web_port)

         # Wait for dashboard to start
-        await asyncio.sleep(2)
+        await asyncio.sleep(3)
+
+        # Check if web dashboard is accessible
+        try:
+            import requests
+            response = requests.get(f"http://localhost:{args.web_port}", timeout=5)
+            logger.info(f"✅ Web Dashboard is accessible at http://localhost:{args.web_port}")
+        except:
+            logger.warning(f"⚠️ Web Dashboard may not be fully ready at http://localhost:{args.web_port}")

         logger.info("MONITORING READY!")
-        logger.info(f"TensorBoard: http://localhost:6006")
-        logger.info(f"Web Dashboard: http://localhost:{args.web_port}")
+        logger.info(f"📊 TensorBoard: http://localhost:{tensorboard_port}")
+        logger.info(f"🌐 Web Dashboard: http://localhost:{args.web_port}")
         logger.info("=" * 60)

         # Run training
@@ -404,10 +453,17 @@ async def main():
         logger.info(f" Final PnL: ${results['final_pnl']:.2f}")
         logger.info(f" Model Saved: {results['model_path']}")

-        # Keep running for monitoring
-        logger.info("Training complete. Press Ctrl+C to exit monitoring.")
-        while True:
-            await asyncio.sleep(1)
+        # Keep monitoring alive for specified time
+        logger.info(f"🔄 Keeping monitoring alive for {args.keep_alive} seconds...")
+        logger.info(f"📊 TensorBoard: http://localhost:6006")
+        logger.info(f"🌐 Web Dashboard: http://localhost:{args.web_port}")
+        logger.info("Press Ctrl+C to exit monitoring.")
+
+        for remaining in range(args.keep_alive, 0, -10):
+            logger.info(f"⏰ Monitoring active - {remaining} seconds remaining")
+            await asyncio.sleep(10)
+
+        logger.info("✅ Monitoring session completed.")

     except KeyboardInterrupt:
         logger.info("Training stopped by user")
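
Review note: the keep-alive loop sleeps a full 10 seconds per iteration, so a `--keep-alive` value that is not a multiple of 10 overshoots (for example, 25 yields iterations at 25, 15, and 5 and sleeps 30 seconds in total). The default of 300 is unaffected. One possible variant that sleeps exactly the requested time is sketched below. `keep_alive` and its `sleep` and `log` parameters are hypothetical names introduced so the function can be exercised without real delays.

```python
import asyncio


async def keep_alive(total_s: int, step_s: int = 10,
                     sleep=asyncio.sleep, log=print) -> int:
    """Sleep for total_s seconds in chunks of at most step_s; return seconds slept."""
    slept = 0
    remaining = total_s
    while remaining > 0:
        log(f"Monitoring active - {remaining} seconds remaining")
        chunk = min(step_s, remaining)  # the last chunk may be shorter than step_s
        await sleep(chunk)
        slept += chunk
        remaining -= chunk
    return slept


async def _no_sleep(_seconds):
    """Stand-in for asyncio.sleep so the demo below finishes instantly."""
    return None


total = asyncio.run(keep_alive(25, sleep=_no_sleep, log=lambda msg: None))
print(total)  # prints: 25
```

In the training script the call would presumably be `await keep_alive(args.keep_alive, log=logger.info)`, leaving `sleep` at its default.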