
Cryptocurrency Trading System Improvements

Overview

This document outlines necessary improvements to our cryptocurrency trading system to enhance performance, profitability, and monitoring capabilities.

High Priority Tasks

1. GPU Utilization for Training

  • Fix GPU detection and utilization during training
    • Debug why CUDA is detected but not utilized (check logs showing "Starting training on device: cpu")
    • Ensure PyTorch correctly detects and uses available CUDA devices
    • Add GPU memory monitoring during training
    • Optimize batch sizes for GPU training

Implementation status:

  • Added setup_gpu() function in train_rl_with_realtime.py to properly detect and configure GPU usage
  • Added device parameter to DQNAgent to ensure models are created on the correct device
  • Implemented mixed precision training for faster GPU-based training (see the sketch below)
  • Added GPU memory monitoring and logging to TensorBoard
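
The mixed precision training mentioned above follows the standard torch.cuda.amp recipe. Below is a minimal sketch of the pattern, assuming generic model, optimizer, and loss_fn objects rather than the exact training code:

import torch

# Minimal mixed-precision training step (model/optimizer/loss_fn are hypothetical)
scaler = torch.cuda.amp.GradScaler()

def train_step(model, optimizer, loss_fn, states, targets, device):
    optimizer.zero_grad()
    # Run the forward pass in reduced precision where it is numerically safe
    with torch.cuda.amp.autocast():
        predictions = model(states.to(device))
        loss = loss_fn(predictions, targets.to(device))
    # Scale the loss to avoid gradient underflow in float16
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    # torch.cuda.memory_allocated() can be logged here for the memory monitoring
    return loss.item()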

2. Trade Signal Rate Display

  • Add metrics to track and display trading frequency
    • Implement counter for actions per second/minute/hour
    • Add visualization to the chart showing trading frequency over time
    • Create a moving average of trade signals to show trends
    • Add dashboard section showing current and average trading rates

Implementation status:

  • Added trade time tracking in _add_trade_compat function (see the sketch below)
  • Added calculate_trade_rate method to RealTimeChart class
  • Updated dashboard layout to display trade rates
  • Added visualization of trade frequency in chart's bottom panel
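
The recording side of this is what makes calculate_trade_rate (shown under Implementation Details) work. Below is a minimal sketch of the bookkeeping assumed to happen in _add_trade_compat; the helper name and pruning window are illustrative:

from datetime import datetime, timedelta

def _record_trade_time(self):
    """Append the current trade timestamp and prune stale entries
    (illustrative helper; the real tracking lives in _add_trade_compat)."""
    if not hasattr(self, 'trade_times'):
        self.trade_times = []
    now = datetime.now()
    self.trade_times.append(now)
    # Drop timestamps older than the longest rate window (one hour)
    cutoff = now - timedelta(hours=1)
    self.trade_times = [t for t in self.trade_times if t > cutoff]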

3. Reward Function Optimization

  • Revise reward function to better balance profit and risk
    • Increase transaction fee penalty for more realistic simulation
    • Implement progressive rewards based on holding time
    • Add penalty for frequent trading (to reduce noise)
    • Scale rewards based on market volatility
    • Implement risk-adjusted returns (Sharpe ratio) in reward calculation

Implementation status:

  • Created improved_reward_function.py with ImprovedRewardCalculator class
  • Implemented Sharpe ratio for risk-adjusted rewards (see the sketch below)
  • Added frequency penalty for excessive trading
  • Added holding time rewards for profitable positions
  • Integrated with EnhancedRLTradingEnvironment class
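
The Sharpe-based risk adjustment is applied inside ImprovedRewardCalculator. A minimal sketch of the idea, assuming a recent_pnls buffer populated by record_pnl (the buffer name and scaling constants are illustrative):

import numpy as np

def _calculate_risk_adjustment(self, reward):
    """Scale the reward by a rolling Sharpe ratio of recent PnL (sketch)."""
    if len(self.recent_pnls) < 2:
        return reward  # not enough history to estimate risk
    pnls = np.array(self.recent_pnls)
    std = pnls.std()
    if std == 0:
        return reward
    sharpe = pnls.mean() / std
    # Dampen rewards when risk-adjusted performance is poor and boost them
    # slightly when it is good; clip to keep the scaling bounded
    factor = float(np.clip(1.0 + 0.5 * sharpe, 0.5, 1.5))
    return reward * factor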

4. Multi-timeframe Price Direction Prediction

  • Extend CNN model to predict price direction for multiple timeframes
    • Modify CNN output to predict short, mid, and long-term price directions (see the sketch below)
    • Create data generation method for back-propagation using historical data
    • Implement real-time example generation for training
    • Feed direction predictions to RL agent as additional state information
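
A minimal sketch of what the multi-horizon output head could look like in PyTorch; the layer sizes and names are illustrative, not the actual CNN:

import torch
import torch.nn as nn

class DirectionHead(nn.Module):
    """Maps a shared CNN feature vector to up/down probabilities for the
    short, mid, and long horizons (illustrative sketch)."""
    def __init__(self, feature_dim=128):
        super().__init__()
        # One logit per horizon: next 5, 10, and 20 candles
        self.fc = nn.Linear(feature_dim, 3)

    def forward(self, features):
        # Sigmoid turns each logit into an "up" probability; these three
        # values can be appended to the RL agent's state vector
        return torch.sigmoid(self.fc(features))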

Medium Priority Tasks

5. Position Sizing Optimization

  • Implement dynamic position sizing based on confidence and volatility
    • Add confidence score to model outputs
    • Scale position size based on prediction confidence
    • Implement Kelly criterion for optimal position sizing
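
As a starting point for the Kelly criterion item, the classic formula is f* = p - (1 - p) / b, where p is the win probability and b the payoff ratio. A minimal sketch with a fractional-Kelly cap, since full Kelly is aggressive in practice (the cap value is an assumption to tune):

def kelly_position_fraction(win_prob, payoff_ratio, cap=0.25):
    """Kelly criterion: f* = p - (1 - p) / b.
    Returns the fraction of capital to risk, floored at 0 and capped."""
    if payoff_ratio <= 0:
        return 0.0
    kelly = win_prob - (1.0 - win_prob) / payoff_ratio
    return max(0.0, min(kelly, cap))

Scaling the result by the model's confidence score would cover the first two bullets as well.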

6. Training Data Augmentation

  • Implement data augmentation for more robust training
    • Simulate different market conditions
    • Add noise to training data (see the sketch below)
    • Generate synthetic data for rare market events
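
For the noise item above, a minimal sketch that jitters candle windows with proportional Gaussian noise; the noise scale and copy count are assumptions to tune:

import numpy as np

def augment_with_noise(windows, noise_scale=0.001, copies=2, seed=None):
    """Return the original candle windows plus noisy copies.
    noise_scale is the noise standard deviation relative to each value."""
    rng = np.random.default_rng(seed)
    augmented = [windows]
    for _ in range(copies):
        noise = rng.normal(0.0, noise_scale, size=windows.shape)
        augmented.append(windows * (1.0 + noise))
    return np.concatenate(augmented, axis=0)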

7. Model Interpretability

  • Add visualization for model decision making
    • Implement feature importance analysis (see the sketch below)
    • Add attention visualization for key price patterns
    • Create explainable AI components
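
Permutation importance is one model-agnostic way to get the feature importance analysis above. A minimal sketch, assuming a 2D feature matrix and a scoring function score_fn(model, X, y), both hypothetical:

import numpy as np

def permutation_importance(model, X, y, score_fn, seed=None):
    """Importance of each feature = score drop when that feature's
    column is shuffled (model-agnostic sketch)."""
    rng = np.random.default_rng(seed)
    baseline = score_fn(model, X, y)
    importances = []
    for j in range(X.shape[1]):
        X_perm = X.copy()
        rng.shuffle(X_perm[:, j])  # break the feature's relationship to y
        importances.append(baseline - score_fn(model, X_perm, y))
    return np.array(importances)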

Implementation Details

Completed: Displaying Trade Rate

The trade rate display implementation has been completed in the RealTimeChart class:

from datetime import datetime, timedelta  # module-level imports used below

def calculate_trade_rate(self):
    """Calculate and return trading rate statistics based on recent trades"""
    if not hasattr(self, 'trade_times') or not self.trade_times:
        return {"per_second": 0, "per_minute": 0, "per_hour": 0}
    
    # Get current time
    now = datetime.now()
    
    # Define the look-back windows
    one_second_ago = now - timedelta(seconds=1)
    one_minute_ago = now - timedelta(minutes=1)
    one_hour_ago = now - timedelta(hours=1)
    
    # Count trades in different time windows
    trades_last_second = sum(1 for t in self.trade_times if t > one_second_ago)
    trades_last_minute = sum(1 for t in self.trade_times if t > one_minute_ago)
    trades_last_hour = sum(1 for t in self.trade_times if t > one_hour_ago)
    
    # Each count doubles as the rate for its window
    return {
        "per_second": trades_last_second,
        "per_minute": trades_last_minute,
        "per_hour": trades_last_hour
    }
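
A dashboard component can format these counts directly; a minimal usage sketch (the surrounding dashboard wiring is omitted):

rates = chart.calculate_trade_rate()
rate_text = (f"Trades/sec: {rates['per_second']}  |  "
             f"Trades/min: {rates['per_minute']}  |  "
             f"Trades/hr: {rates['per_hour']}")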

Completed: Improved Reward Function

The improved reward function has been implemented in improved_reward_function.py:

def calculate_reward(self, action, price_change, position_held_time=0, 
                     volatility=None, is_profitable=False):
    """
    Calculate the improved reward with risk adjustment
    """
    # Calculate trading fee
    fee = self.base_fee_rate
    
    # Calculate frequency penalty
    frequency_penalty = self._calculate_frequency_penalty()
    
    # Base reward calculation
    if action == 0:  # BUY
        # Small penalty for transaction plus frequency penalty
        reward = -fee - frequency_penalty
        
    elif action == 1:  # SELL
        # Calculate profit percentage minus fees (both entry and exit)
        profit_pct = price_change
        net_profit = profit_pct - (fee * 2)
        
        # Scale reward and apply frequency penalty
        reward = net_profit * 10  # Scale reward
        reward -= frequency_penalty
        
        # Record PnL for risk adjustment
        self.record_pnl(net_profit)
        
    else:  # HOLD
        # Small reward for holding a profitable position, small cost otherwise
        if is_profitable:
            reward = self._calculate_holding_reward(position_held_time, price_change)
        else:
            reward = -0.0001  # Very small negative reward
    
    # Apply risk adjustment if enabled
    if self.risk_adjusted:
        reward = self._calculate_risk_adjustment(reward)
        
    # Record this action for future frequency calculations
    self.record_trade(action=action)
    
    return reward
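
The _calculate_frequency_penalty and _calculate_holding_reward helpers referenced above live in ImprovedRewardCalculator. A minimal sketch of the intended behavior; the attribute names and constants here are illustrative:

import math

def _calculate_frequency_penalty(self):
    """Penalize trading more often than a target rate (illustrative sketch)."""
    recent = len(self.recent_trade_times)  # trades inside the look-back window
    excess = max(0, recent - self.target_trades_per_window)
    return excess * 0.0005  # small linear penalty per excess trade

def _calculate_holding_reward(self, position_held_time, price_change):
    """Reward holding profitable positions, with diminishing returns over time."""
    time_factor = math.log1p(position_held_time / 60.0)  # damped minutes held
    return max(0.0, price_change) * 0.1 * time_factor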

Completed: GPU Optimization

Added GPU optimization in train_rl_with_realtime.py:

def setup_gpu():
    """
    Configure GPU usage for PyTorch training
    
    Returns:
        tuple: (success, device, message)
    """
    try:
        if torch.cuda.is_available():
            gpu_count = torch.cuda.device_count()
            device_info = [torch.cuda.get_device_name(i) for i in range(gpu_count)]
            logger.info(f"Found {gpu_count} GPU(s): {', '.join(device_info)}")
            
            device = torch.device("cuda:0")
            
            # Verify CUDA works by allocating a small tensor on the device
            test_tensor = torch.tensor([1.0, 2.0, 3.0], device=device)
            _ = test_tensor * 2  # run a trivial op to confirm kernels launch
            
            # Report BFloat16 support; the training loop's autocast decides the dtype
            if hasattr(torch.cuda, 'amp') and torch.cuda.is_bf16_supported():
                logger.info("BFloat16 is supported - usable for mixed precision training")
            
            return True, device, f"GPU enabled: {', '.join(device_info)}"
        else:
            return False, torch.device("cpu"), "GPU not available, using CPU"
    except Exception as e:
        return False, torch.device("cpu"), f"GPU setup failed: {str(e)}"

CNN Price Direction Prediction (To be implemented)

import numpy as np  # module-level import used below

def generate_direction_examples(self, historical_data, timeframes=['1m', '1h', '1d']):
    """Generate price direction examples from historical data"""
    examples = []
    labels = []
    
    for tf in timeframes:
        df = historical_data[tf]
        # Stop 20 candles before the end so every label horizon is valid
        for i in range(20, len(df) - 20):
            # Use a window of 20 candles as model input
            window = df.iloc[i-20:i]
            
            # Label future price direction (True if price rises over 5/10/20 candles)
            future_5 = df.iloc[i].close < df.iloc[i+5].close
            future_10 = df.iloc[i].close < df.iloc[i+10].close
            future_20 = df.iloc[i].close < df.iloc[i+20].close
            
            examples.append(window.values)
            labels.append([future_5, future_10, future_20])
    
    return np.array(examples), np.array(labels)
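
Once implemented, these examples would drive a supervised pre-training pass before the predictions are fed to the RL agent. A minimal usage sketch, assuming historical_data is a dict of OHLCV DataFrames keyed by timeframe and the method lives on the CNN model wrapper:

X, y = model.generate_direction_examples(historical_data, timeframes=['1m', '1h', '1d'])
# y has shape (n_examples, 3): direction labels for the 5/10/20-candle horizons
print(f"Generated {len(X)} examples with label shape {y.shape}")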

Validation Plan

After implementing these improvements, we should validate the system with:

  1. Backtesting on historical data
  2. Forward testing with small position sizes
  3. A/B testing of different reward functions
  4. Measuring the improvement in profitability and Sharpe ratio

Progress Tracking

  • Implementation started: June 2023
  • GPU utilization fixed: July 2023
  • Trade signal rate display implemented: July 2023
  • Reward function optimized: July 2023
  • CNN direction prediction added: To be completed
  • Full system tested: To be completed