# Cryptocurrency Trading System Improvements
## Overview

This document outlines necessary improvements to our cryptocurrency trading system to enhance performance, profitability, and monitoring capabilities.

## High Priority Tasks
### 1. GPU Utilization for Training
- [x] Fix GPU detection and utilization during training
- [x] Debug why CUDA is detected but not utilized (logs showed "Starting training on device: cpu")
- [x] Ensure PyTorch correctly detects and uses available CUDA devices
- [x] Add GPU memory monitoring during training
- [x] Optimize batch sizes for GPU training

Implementation status:

- Added `setup_gpu()` function in `train_rl_with_realtime.py` to properly detect and configure GPU usage
- Added device parameter to `DQNAgent` to ensure models are created on the correct device
- Implemented mixed precision training for faster GPU-based training (see the sketch below)
- Added GPU memory monitoring and logging to TensorBoard
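The mixed-precision path follows the standard `torch.cuda.amp` pattern. Below is a minimal sketch of a single training step under that pattern; the `model`, `optimizer`, and `loss_fn` names are assumptions for illustration, not the project's actual code.

```python
import torch

# GradScaler scales the loss so FP16 gradients don't underflow (assumed setup)
scaler = torch.cuda.amp.GradScaler()

def train_step(model, optimizer, loss_fn, batch, targets, device):
    """One mixed-precision training step (illustrative sketch only)."""
    batch, targets = batch.to(device), targets.to(device)
    optimizer.zero_grad()

    # Run the forward pass in reduced precision where it is numerically safe
    with torch.cuda.amp.autocast():
        outputs = model(batch)
        loss = loss_fn(outputs, targets)

    # Backward on the scaled loss, then step and update through the scaler
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.item()
```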
### 2. Trade Signal Rate Display
- [x] Add metrics to track and display trading frequency
- [x] Implement counter for actions per second/minute/hour
- [x] Add visualization to the chart showing trading frequency over time
- [x] Create a moving average of trade signals to show trends
- [x] Add dashboard section showing current and average trading rates

Implementation status:

- Added trade time tracking in `_add_trade_compat` function
- Added `calculate_trade_rate` method to `RealTimeChart` class
- Updated dashboard layout to display trade rates
- Added visualization of trade frequency in chart's bottom panel
### 3. Reward Function Optimization
- [x] Revise reward function to better balance profit and risk
- [x] Increase transaction fee penalty for more realistic simulation
- [x] Implement progressive rewards based on holding time
- [x] Add penalty for frequent trading (to reduce noise)
- [x] Scale rewards based on market volatility
- [x] Implement risk-adjusted returns (Sharpe ratio) in reward calculation

Implementation status:

- Created `improved_reward_function.py` with `ImprovedRewardCalculator` class
- Implemented Sharpe ratio for risk-adjusted rewards
- Added frequency penalty for excessive trading
- Added holding time rewards for profitable positions
- Integrated with `EnhancedRLTradingEnvironment` class
### 4. Multi-timeframe Price Direction Prediction
- [ ] Extend CNN model to predict price direction for multiple timeframes
- [ ] Modify CNN output to predict short, mid, and long-term price directions
- [ ] Create data generation method for back-propagation using historical data
- [ ] Implement real-time example generation for training
- [ ] Feed direction predictions to RL agent as additional state information
## Medium Priority Tasks

### 5. Position Sizing Optimization
- [ ] Implement dynamic position sizing based on confidence and volatility
- [ ] Add confidence score to model outputs
- [ ] Scale position size based on prediction confidence
- [ ] Implement Kelly criterion for optimal position sizing (see the sketch below)
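For the Kelly criterion item, here is a minimal sketch of fractional Kelly sizing. It assumes the per-trade win probability and win/loss ratio come from backtest statistics; all names are illustrative, not existing project code.

```python
def kelly_fraction(win_prob, win_loss_ratio, cap=0.25):
    """Kelly fraction f* = p - (1 - p) / b, capped for safety.

    Capping (fractional Kelly) tempers the bet size because win_prob and
    win_loss_ratio are only estimates from historical trades.
    """
    if win_loss_ratio <= 0:
        return 0.0
    f = win_prob - (1.0 - win_prob) / win_loss_ratio
    return max(0.0, min(f, cap))

# Example: 55% win rate with average win 1.5x average loss
# kelly_fraction(0.55, 1.5) -> ~0.25  (0.55 - 0.45 / 1.5)
```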
### 6. Training Data Augmentation
- [ ] Implement data augmentation for more robust training
- [ ] Simulate different market conditions
- [ ] Add noise to training data (see the sketch below)
- [ ] Generate synthetic data for rare market events
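For the noise item, here is a minimal sketch of Gaussian price jitter on training windows. It assumes examples are numpy arrays of candle windows, such as those produced by `generate_direction_examples` later in this document; the function name and noise level are illustrative.

```python
import numpy as np

def add_price_noise(windows, noise_pct=0.001, rng=None):
    """Multiply each value by a small Gaussian factor (illustrative sketch).

    windows: array of shape (n_examples, n_candles, n_features);
    noise_pct: standard deviation of the relative noise.
    """
    rng = rng or np.random.default_rng()
    noise = rng.normal(loc=1.0, scale=noise_pct, size=windows.shape)
    return windows * noise
```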
### 7. Model Interpretability
- [ ] Add visualization for model decision making
- [ ] Implement feature importance analysis (see the sketch below)
- [ ] Add attention visualization for key price patterns
- [ ] Create explainable AI components
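For the feature importance item, one common model-agnostic approach is permutation importance: shuffle one feature at a time and measure the drop in validation score. A minimal sketch, assuming a `score_fn` callable and 2-D numpy inputs (all names illustrative):

```python
import numpy as np

def permutation_importance(score_fn, X, y, rng=None):
    """Score drop when each feature column is shuffled (illustrative sketch)."""
    rng = rng or np.random.default_rng()
    baseline = score_fn(X, y)
    importances = []
    for j in range(X.shape[1]):
        X_perm = X.copy()
        rng.shuffle(X_perm[:, j])  # Destroy any signal carried by feature j
        importances.append(baseline - score_fn(X_perm, y))
    return np.array(importances)
```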
## Implementation Details

### Completed: Displaying Trade Rate

The trade rate display has been implemented in the `RealTimeChart` class:
```python
def calculate_trade_rate(self):
    """Calculate and return trading rate statistics based on recent trades."""
    # Requires `from datetime import datetime, timedelta` at module level
    if not hasattr(self, 'trade_times') or not self.trade_times:
        return {"per_second": 0, "per_minute": 0, "per_hour": 0}

    # Get the current time
    now = datetime.now()

    # Calculate the different time windows
    one_second_ago = now - timedelta(seconds=1)
    one_minute_ago = now - timedelta(minutes=1)
    one_hour_ago = now - timedelta(hours=1)

    # Count trades that fall inside each window
    trades_last_second = sum(1 for t in self.trade_times if t > one_second_ago)
    trades_last_minute = sum(1 for t in self.trade_times if t > one_minute_ago)
    trades_last_hour = sum(1 for t in self.trade_times if t > one_hour_ago)

    # Return the per-window counts as rates
    return {
        "per_second": trades_last_second,
        "per_minute": trades_last_minute,
        "per_hour": trades_last_hour
    }
```
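The method assumes trade timestamps are appended as trades occur. Below is a minimal sketch of the recording side in `_add_trade_compat`; the surrounding trade-handling logic is elided, and the `trade` argument is an assumption:

```python
from datetime import datetime

def _add_trade_compat(self, trade):
    """Record the trade time so calculate_trade_rate has data (sketch)."""
    if not hasattr(self, 'trade_times'):
        self.trade_times = []
    self.trade_times.append(datetime.now())
    # ... existing trade-handling logic continues here ...
```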
### Completed: Improved Reward Function

The improved reward function has been implemented in `improved_reward_function.py`:

```python
def calculate_reward(self, action, price_change, position_held_time=0,
                     volatility=None, is_profitable=False):
    """Calculate the improved reward with risk adjustment."""
    # Trading fee for this transaction
    fee = self.base_fee_rate

    # Penalty that grows with recent trading frequency
    frequency_penalty = self._calculate_frequency_penalty()

    # Base reward calculation
    if action == 0:  # BUY
        # Small penalty for the transaction plus the frequency penalty
        reward = -fee - frequency_penalty

    elif action == 1:  # SELL
        # Profit percentage minus fees (both entry and exit)
        profit_pct = price_change
        net_profit = profit_pct - (fee * 2)

        # Scale the reward and apply the frequency penalty
        reward = net_profit * 10
        reward -= frequency_penalty

        # Record PnL for risk adjustment
        self.record_pnl(net_profit)

    else:  # HOLD
        # Small reward for holding a profitable position, small cost otherwise
        if is_profitable:
            reward = self._calculate_holding_reward(position_held_time, price_change)
        else:
            reward = -0.0001  # Very small negative reward

    # Apply risk adjustment if enabled
    if self.risk_adjusted:
        reward = self._calculate_risk_adjustment(reward)

    # Record this action for future frequency calculations
    self.record_trade(action=action)

    return reward
```
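The Sharpe-based risk adjustment helper is not shown above. Here is a minimal sketch of one way `_calculate_risk_adjustment` could work; the `self.returns` buffer, the window size, and the damping factor are assumptions, not the actual implementation:

```python
import numpy as np

def _calculate_risk_adjustment(self, reward):
    """Scale the reward by a rolling Sharpe ratio of recent PnL (sketch)."""
    if len(self.returns) < 2:
        return reward  # Not enough history to estimate risk

    recent = np.array(self.returns[-100:])  # Rolling window of recent PnL
    std = recent.std()
    if std == 0:
        return reward

    sharpe = recent.mean() / std  # Per-trade Sharpe ratio, no annualization
    # Dampen and clip so the sign of the base reward still dominates
    return reward * (1.0 + 0.25 * float(np.clip(sharpe, -1.0, 1.0)))
```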
### Completed: GPU Optimization

Added GPU optimization in `train_rl_with_realtime.py`:

```python
def setup_gpu():
    """
    Configure GPU usage for PyTorch training.

    Returns:
        tuple: (success, device, message)
    """
    try:
        if torch.cuda.is_available():
            gpu_count = torch.cuda.device_count()
            device_info = [torch.cuda.get_device_name(i) for i in range(gpu_count)]
            logger.info(f"Found {gpu_count} GPU(s): {', '.join(device_info)}")

            device = torch.device("cuda:0")

            # Smoke-test CUDA by creating a small tensor on the device
            test_tensor = torch.tensor([1.0, 2.0, 3.0], device=device)

            # Enable mixed precision if supported
            if hasattr(torch.cuda, 'amp') and torch.cuda.is_bf16_supported():
                logger.info("BFloat16 is supported - enabling for faster training")

            return True, device, f"GPU enabled: {device_info}"
        else:
            return False, torch.device("cpu"), "GPU not available, using CPU"
    except Exception as e:
        return False, torch.device("cpu"), f"GPU setup failed: {str(e)}"
```
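Callers unpack the returned tuple and pass the device on to the agent. A short usage example; the `DQNAgent` constructor arguments other than `device` are assumptions:

```python
# Hypothetical call site; state_size/action_size values are placeholders
success, device, message = setup_gpu()
logger.info(message)

agent = DQNAgent(state_size=state_size, action_size=3, device=device)
```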
### CNN Price Direction Prediction (To be implemented)

```python
def generate_direction_examples(self, historical_data, timeframes=['1m', '1h', '1d']):
    """Generate price direction examples from historical data."""
    examples = []
    labels = []

    for tf in timeframes:
        df = historical_data[tf]
        for i in range(20, len(df) - 10):
            # Use a window of 20 candles as the input
            window = df.iloc[i-20:i]

            # Label the future price direction over the next 5, 10, and 20
            # candles; True means the price goes up
            future_5 = df.iloc[i].close < df.iloc[i+5].close
            future_10 = df.iloc[i].close < df.iloc[i+10].close
            future_20 = df.iloc[i].close < df.iloc[min(i+20, len(df)-1)].close

            examples.append(window.values)
            labels.append([future_5, future_10, future_20])

    return np.array(examples), np.array(labels)
```
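Task 4 also calls for the CNN itself to emit short-, mid-, and long-term direction outputs. One way to structure that is a three-head output layer over a shared feature extractor; a minimal PyTorch sketch, where the feature dimension and head layout are assumptions:

```python
import torch
import torch.nn as nn

class DirectionHead(nn.Module):
    """Three binary direction outputs (next 5, 10, and 20 candles)."""

    def __init__(self, feature_dim=128):
        super().__init__()
        self.short_term = nn.Linear(feature_dim, 1)  # next 5 candles
        self.mid_term = nn.Linear(feature_dim, 1)    # next 10 candles
        self.long_term = nn.Linear(feature_dim, 1)   # next 20 candles

    def forward(self, features):
        # Each head outputs the probability that the price goes up,
        # matching the boolean labels from generate_direction_examples
        return torch.sigmoid(torch.cat([
            self.short_term(features),
            self.mid_term(features),
            self.long_term(features),
        ], dim=1))
```

Trained with binary cross-entropy against the labels above, the three probabilities can then be appended to the RL agent's state vector.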
## Validation Plan

After implementing these improvements, we should validate the system with:

1. Backtesting on historical data
2. Forward testing with small position sizes
3. A/B testing of different reward functions
4. Measuring the improvement in profitability and Sharpe ratio (see the sketch below)
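For item 4, a minimal sketch of the Sharpe comparison on backtest returns. It assumes daily strategy returns as a numpy array; the 365-day annualization reflects that crypto markets trade every day and is an assumption of this sketch:

```python
import numpy as np

def annualized_sharpe(daily_returns, risk_free_rate=0.0):
    """Annualized Sharpe ratio from daily strategy returns (sketch)."""
    excess = np.asarray(daily_returns) - risk_free_rate / 365
    std = excess.std()
    if std == 0:
        return 0.0
    return float(np.sqrt(365) * excess.mean() / std)

# A higher Sharpe after the changes means better risk-adjusted returns:
# improvement = annualized_sharpe(new_returns) - annualized_sharpe(old_returns)
```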
## Progress Tracking
- Implementation started: June 2023
- GPU utilization fixed: July 2023
- Trade signal rate display implemented: July 2023
- Reward function optimized: July 2023
- CNN direction prediction added: To be completed
- Full system tested: To be completed