gogo2/TODO_IMPROVEMENTS.md
2025-04-02 14:03:20 +03:00

# Cryptocurrency Trading System Improvements
## Overview
This document outlines necessary improvements to our cryptocurrency trading system to enhance performance, profitability, and monitoring capabilities.
## High Priority Tasks
### 1. GPU Utilization for Training
- [x] Fix GPU detection and utilization during training
- [x] Debug why CUDA is detected but not utilized (check logs showing "Starting training on device: cpu")
- [x] Ensure PyTorch correctly detects and uses available CUDA devices
- [x] Add GPU memory monitoring during training
- [x] Optimize batch sizes for GPU training

Implementation status:
- Added `setup_gpu()` function in `train_rl_with_realtime.py` to properly detect and configure GPU usage
- Added device parameter to DQNAgent to ensure models are created on the correct device
- Implemented mixed precision training for faster GPU-based training
- Added GPU memory monitoring and logging to TensorBoard
### 2. Trade Signal Rate Display
- [x] Add metrics to track and display trading frequency
- [x] Implement counter for actions per second/minute/hour
- [x] Add visualization to the chart showing trading frequency over time
- [x] Create a moving average of trade signals to show trends
- [x] Add dashboard section showing current and average trading rates

Implementation status:
- Added trade time tracking in `_add_trade_compat` function
- Added `calculate_trade_rate` method to `RealTimeChart` class
- Updated dashboard layout to display trade rates
- Added visualization of trade frequency in chart's bottom panel
### 3. Reward Function Optimization
- [x] Revise reward function to better balance profit and risk
- [x] Increase transaction fee penalty for more realistic simulation
- [x] Implement progressive rewards based on holding time
- [x] Add penalty for frequent trading (to reduce noise)
- [x] Scale rewards based on market volatility
- [x] Implement risk-adjusted returns (Sharpe ratio) in reward calculation

Implementation status:
- Created `improved_reward_function.py` with `ImprovedRewardCalculator` class
- Implemented Sharpe ratio for risk-adjusted rewards
- Added frequency penalty for excessive trading
- Added holding time rewards for profitable positions
- Integrated with `EnhancedRLTradingEnvironment` class
### 4. Multi-timeframe Price Direction Prediction
- [ ] Extend CNN model to predict price direction for multiple timeframes
- [ ] Modify CNN output to predict short, mid, and long-term price directions
- [ ] Create data generation method for back-propagation using historical data
- [ ] Implement real-time example generation for training
- [ ] Feed direction predictions to RL agent as additional state information
## Medium Priority Tasks
### 5. Position Sizing Optimization
- [ ] Implement dynamic position sizing based on confidence and volatility
- [ ] Add confidence score to model outputs
- [ ] Scale position size based on prediction confidence
- [ ] Implement Kelly criterion for optimal position sizing
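
As a rough illustration of the Kelly item above, the sketch below computes the Kelly fraction `f* = p - (1 - p) / b` for win probability `p` and win/loss ratio `b`, then scales it by a model confidence score and caps it. The function names, the `max_fraction` cap, and the choice to multiply by confidence are assumptions for illustration; none of them exist in the codebase yet.

```python
def kelly_fraction(win_prob: float, win_loss_ratio: float) -> float:
    """Kelly fraction f* = p - (1 - p) / b, floored at 0 for a negative edge."""
    if not 0.0 <= win_prob <= 1.0:
        raise ValueError("win_prob must be in [0, 1]")
    if win_loss_ratio <= 0:
        raise ValueError("win_loss_ratio must be positive")
    return max(0.0, win_prob - (1.0 - win_prob) / win_loss_ratio)


def position_size(equity, win_prob, win_loss_ratio, confidence=1.0, max_fraction=0.25):
    """Hypothetical sizing rule: Kelly fraction scaled by confidence, then capped."""
    confidence = min(max(confidence, 0.0), 1.0)  # clamp confidence to [0, 1]
    fraction = min(kelly_fraction(win_prob, win_loss_ratio) * confidence, max_fraction)
    return equity * fraction


# 60% win rate with 1:1 payoff gives f* = 0.2; half confidence halves the stake.
print(position_size(10_000, win_prob=0.6, win_loss_ratio=1.0, confidence=0.5))
```

Full Kelly is known to be aggressive when `p` and `b` are estimated from noisy data, which is why the sketch caps the fraction; the cap value itself would need tuning.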
### 6. Training Data Augmentation
- [ ] Implement data augmentation for more robust training
- [ ] Simulate different market conditions
- [ ] Add noise to training data
- [ ] Generate synthetic data for rare market events
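
A minimal sketch of two of the augmentation ideas above, using only the standard library: multiplicative Gaussian noise on prices, and rescaling step-to-step returns to simulate calmer or more volatile conditions. Both function names and all default parameters are hypothetical, and the input is assumed to be a plain list of closing prices rather than the project's actual data format.

```python
import random


def augment_with_noise(closes, noise_std=0.001, seed=None):
    """Return a copy of `closes` with multiplicative Gaussian noise on each price."""
    rng = random.Random(seed)  # seeded generator keeps augmentation reproducible
    return [price * (1.0 + rng.gauss(0.0, noise_std)) for price in closes]


def scale_volatility(closes, factor):
    """Rebuild a price path whose step-to-step returns are `factor` times the originals."""
    if not closes:
        return []
    path = [closes[0]]
    for prev, cur in zip(closes, closes[1:]):
        ret = cur / prev - 1.0
        path.append(path[-1] * (1.0 + factor * ret))
    return path


prices = [100.0, 101.0, 99.5, 102.0]
noisy = augment_with_noise(prices, noise_std=0.002, seed=42)
stressed = scale_volatility(prices, factor=2.0)  # doubles every return
```

Large `factor` values can push a path to zero or below when a scaled return falls under -100%, so a real implementation would need to guard against that.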
### 7. Model Interpretability
- [ ] Add visualization for model decision making
- [ ] Implement feature importance analysis
- [ ] Add attention visualization for key price patterns
- [ ] Create explainable AI components
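
One model-agnostic way to approach the feature importance item is permutation importance: shuffle one input column at a time and measure how much accuracy drops. The sketch below is a standard-library illustration under stated assumptions: `predict` is any callable mapping a feature row to a label, and the toy model and data are invented for the example.

```python
import random


def permutation_importance(predict, rows, labels, n_repeats=5, seed=0):
    """Mean accuracy drop per feature when that feature's column is shuffled."""
    rng = random.Random(seed)

    def accuracy(data):
        return sum(predict(r) == y for r, y in zip(data, labels)) / len(labels)

    baseline = accuracy(rows)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            # Rebuild every row with only this column replaced
            permuted = [list(r[:col]) + [v] + list(r[col + 1:])
                        for r, v in zip(rows, shuffled)]
            drops.append(baseline - accuracy(permuted))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy model that only looks at feature 0; feature 1 is ignored on purpose.
rows = [[-2.0, 5.0], [-1.0, 1.0], [1.0, 7.0], [2.0, 3.0]]
labels = [False, False, True, True]
scores = permutation_importance(lambda r: r[0] > 0, rows, labels)
```

Because the toy model never reads feature 1, its score is exactly zero, which is the behavior to expect from any input the real model ignores.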
## Implementation Details
### Completed: Displaying Trade Rate
The trade rate display implementation has been completed in the `RealTimeChart` class:
```python
from datetime import datetime, timedelta


# Method of the RealTimeChart class
def calculate_trade_rate(self):
    """Calculate and return trading rate statistics based on recent trades"""
    if not hasattr(self, 'trade_times') or not self.trade_times:
        return {"per_second": 0, "per_minute": 0, "per_hour": 0}

    # Get current time
    now = datetime.now()

    # Calculate different time windows
    one_second_ago = now - timedelta(seconds=1)
    one_minute_ago = now - timedelta(minutes=1)
    one_hour_ago = now - timedelta(hours=1)

    # Count trades in different time windows
    trades_last_second = sum(1 for t in self.trade_times if t > one_second_ago)
    trades_last_minute = sum(1 for t in self.trade_times if t > one_minute_ago)
    trades_last_hour = sum(1 for t in self.trade_times if t > one_hour_ago)

    # The count within each window is reported as the rate for that window
    return {
        "per_second": trades_last_second,
        "per_minute": trades_last_minute,
        "per_hour": trades_last_hour
    }
```
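
The moving average of trade signals mentioned in the task list is not shown above. One possible way to derive it from the same timestamps is sketched here: bucket trades per minute, then take a trailing average over the buckets. The function name, the bucket size, and the window length are assumptions rather than the actual `RealTimeChart` implementation.

```python
from datetime import datetime, timedelta


def trade_rate_series(trade_times, now, buckets=10, window=3):
    """Per-minute trade counts (oldest first) and their trailing moving average."""
    counts = [0] * buckets
    for t in trade_times:
        age = int((now - t).total_seconds() // 60)  # whole minutes since the trade
        if 0 <= age < buckets:  # ignore future timestamps and trades that are too old
            counts[buckets - 1 - age] += 1

    averages = []
    for i in range(buckets):
        chunk = counts[max(0, i - window + 1): i + 1]  # shorter window at the start
        averages.append(sum(chunk) / len(chunk))
    return counts, averages


now = datetime(2023, 7, 1, 12, 0, 0)
times = [now - timedelta(seconds=s) for s in (10, 20, 70, 200)]
counts, averages = trade_rate_series(times, now, buckets=4, window=2)
print(counts)    # [1, 0, 1, 2]
print(averages)  # [1.0, 0.5, 0.5, 1.5]
```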
### Completed: Improved Reward Function
The improved reward function has been implemented in `improved_reward_function.py`:
```python
def calculate_reward(self, action, price_change, position_held_time=0,
                     volatility=None, is_profitable=False):
    """
    Calculate the improved reward with risk adjustment
    """
    # Calculate trading fee
    fee = self.base_fee_rate

    # Calculate frequency penalty
    frequency_penalty = self._calculate_frequency_penalty()

    # Base reward calculation
    if action == 0:  # BUY
        # Small penalty for transaction plus frequency penalty
        reward = -fee - frequency_penalty
    elif action == 1:  # SELL
        # Calculate profit percentage minus fees (both entry and exit)
        profit_pct = price_change
        net_profit = profit_pct - (fee * 2)

        # Scale reward and apply frequency penalty
        reward = net_profit * 10  # Scale reward
        reward -= frequency_penalty

        # Record PnL for risk adjustment
        self.record_pnl(net_profit)
    else:  # HOLD
        # Small reward for holding a profitable position, small cost otherwise
        if is_profitable:
            reward = self._calculate_holding_reward(position_held_time, price_change)
        else:
            reward = -0.0001  # Very small negative reward

    # Apply risk adjustment if enabled
    if self.risk_adjusted:
        reward = self._calculate_risk_adjustment(reward)

    # Record this action for future frequency calculations
    self.record_trade(action=action)

    return reward
```
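
The helpers `_calculate_frequency_penalty` and `_calculate_holding_reward` are called above but not shown. The sketch below illustrates one plausible shape for them as standalone functions. The formulas, thresholds, and parameter names are all assumptions made for illustration and should not be read as the contents of `improved_reward_function.py`.

```python
def frequency_penalty(recent_actions, window=20, max_trade_share=0.3, penalty_scale=0.001):
    """Penalty that grows linearly once trades exceed a share of recent actions.

    Uses the action convention from the snippet above: 0 = BUY, 1 = SELL, else HOLD.
    """
    recent = list(recent_actions)[-window:]
    if not recent:
        return 0.0
    trade_share = sum(1 for a in recent if a in (0, 1)) / len(recent)
    return max(0.0, trade_share - max_trade_share) * penalty_scale


def holding_reward(position_held_time, price_change, base=0.0001, cap_steps=100):
    """Progressive reward for holding a profitable position, capped at 2 * base."""
    if price_change <= 0:
        return 0.0  # no bonus for holding a losing or flat position
    progress = min(position_held_time, cap_steps) / cap_steps
    return base * (1.0 + progress)


print(frequency_penalty([2] * 20))    # all HOLD, so no penalty
print(frequency_penalty([0, 1] * 10)) # trading on every step is penalized
```

Keeping both terms small relative to the scaled profit in the SELL branch means realized PnL stays the dominant signal in this sketch.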
### Completed: GPU Optimization
Added GPU optimization in `train_rl_with_realtime.py`:
```python
import logging

import torch

logger = logging.getLogger(__name__)


def setup_gpu():
    """
    Configure GPU usage for PyTorch training

    Returns:
        tuple: (success, device, message)
    """
    try:
        if torch.cuda.is_available():
            gpu_count = torch.cuda.device_count()
            device_info = [torch.cuda.get_device_name(i) for i in range(gpu_count)]
            logger.info(f"Found {gpu_count} GPU(s): {', '.join(device_info)}")
            device = torch.device("cuda:0")

            # Test CUDA by creating a small tensor
            test_tensor = torch.tensor([1.0, 2.0, 3.0], device=device)

            # Check for BF16 support (this function only logs the result)
            if hasattr(torch.cuda, 'amp') and torch.cuda.is_bf16_supported():
                logger.info("BFloat16 is supported - enabling for faster training")

            return True, device, f"GPU enabled: {device_info}"
        else:
            return False, torch.device("cpu"), "GPU not available, using CPU"
    except Exception as e:
        return False, torch.device("cpu"), f"GPU setup failed: {str(e)}"
```
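
The GPU memory monitoring mentioned in the task list is not shown above. A minimal sketch of how it could look follows, assuming a TensorBoard `SummaryWriter` (or any object with an `add_scalar(tag, value, step)` method). The function names and tag names are hypothetical. The guarded import is only there so the sketch also runs on machines without PyTorch.

```python
try:
    import torch
except ImportError:  # lets the sketch run where PyTorch is not installed
    torch = None


def gpu_memory_stats(device_index=0):
    """Allocated, reserved, and peak GPU memory in MiB; zeros when CUDA is unavailable."""
    if torch is None or not torch.cuda.is_available():
        return {"allocated_mb": 0.0, "reserved_mb": 0.0, "peak_allocated_mb": 0.0}
    mib = 1024 ** 2
    return {
        "allocated_mb": torch.cuda.memory_allocated(device_index) / mib,
        "reserved_mb": torch.cuda.memory_reserved(device_index) / mib,
        "peak_allocated_mb": torch.cuda.max_memory_allocated(device_index) / mib,
    }


def log_gpu_memory(writer, step, device_index=0):
    """Write each statistic through writer.add_scalar(tag, value, step)."""
    for name, value in gpu_memory_stats(device_index).items():
        writer.add_scalar(f"gpu/{name}", value, step)
```

Calling `log_gpu_memory(writer, step)` once per training step or episode would be enough to see whether a batch size change pushes memory toward the device limit.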
### CNN Price Direction Prediction (To be implemented)
```python
import numpy as np


def generate_direction_examples(self, historical_data, timeframes=('1m', '1h', '1d')):
    """Generate price direction examples from historical data"""
    examples = []
    labels = []

    for tf in timeframes:
        df = historical_data[tf]

        # Stop 20 candles before the end so every label uses a real future candle
        for i in range(20, len(df) - 20):
            # Use window of 20 candles for input
            window = df.iloc[i-20:i]

            # Create labels for future price direction (next 5, 10, 20 candles)
            future_5 = df.iloc[i].close < df.iloc[i+5].close  # True if price goes up
            future_10 = df.iloc[i].close < df.iloc[i+10].close
            future_20 = df.iloc[i].close < df.iloc[i+20].close

            examples.append(window.values)
            labels.append([future_5, future_10, future_20])

    return np.array(examples), np.array(labels)
```
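
For the last item of task 4, feeding direction predictions to the RL agent, one simple option is to append them to the state vector. The sketch below assumes the CNN emits one probability of "price goes up" per horizon (short, mid, long); that output format and the function name are assumptions, since the model has not been extended yet.

```python
def augment_state(state, direction_probs):
    """Append CNN direction signals to the RL state vector.

    Each probability in [0, 1] is mapped to [-1, 1], so 0 means "no directional view".
    """
    for p in direction_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("direction probabilities must be in [0, 1]")
    signals = [2.0 * p - 1.0 for p in direction_probs]
    return list(state) + signals


state = [0.1, 0.2]                  # existing state features (placeholder values)
probs = [0.5, 1.0, 0.0]             # short, mid, long horizon
print(augment_state(state, probs))  # [0.1, 0.2, 0.0, 1.0, -1.0]
```

Any change to the state length also changes the input dimension the agent's network expects, so the two would have to be updated together.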
## Validation Plan
After implementing these improvements, we should validate the system with:
1. Backtesting on historical data
2. Forward testing with small position sizes
3. A/B testing of different reward functions
4. Measuring the improvement in profitability and Sharpe ratio
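
To make step 4 concrete, the following sketch computes an annualized Sharpe ratio from a list of per-period returns, which also gives a single number for comparing two reward functions in an A/B test. The 365 periods per year (daily returns on a market that trades every day) and the zero risk-free rate are assumptions to adjust for the actual sampling interval.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=365):
    """Annualized Sharpe ratio of per-period returns; 0.0 when it is undefined."""
    if len(returns) < 2:
        return 0.0
    per_period_rf = risk_free_rate / periods_per_year
    excess = [r - per_period_rf for r in returns]
    stdev = statistics.stdev(excess)  # sample standard deviation
    if stdev == 0:
        return 0.0
    return statistics.mean(excess) / stdev * math.sqrt(periods_per_year)


# Hypothetical daily returns from two backtests of different reward functions
variant_a = [0.02, 0.01, 0.03]
variant_b = [-0.02, 0.01, -0.03]
better = "A" if sharpe_ratio(variant_a) > sharpe_ratio(variant_b) else "B"
```

With only a handful of periods the estimate is very noisy, so comparisons are more meaningful over long backtests covering different market regimes.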
## Progress Tracking
- Implementation started: June 2023
- GPU utilization fixed: July 2023
- Trade signal rate display implemented: July 2023
- Reward function optimized: July 2023
- CNN direction prediction added: To be completed
- Full system tested: To be completed