gogo2/DQN_SENSITIVITY_LEARNING_SUMMARY.md
2025-05-27 01:46:15 +03:00

DQN RL-based Sensitivity Learning & 300s Data Preloading Summary

Overview

This document summarizes two features added to the trading system: DQN RL-based sensitivity learning, which adapts entry and exit thresholds from trade outcomes, and 300s data preloading, which shortens startup.

🧠 DQN RL-based Sensitivity Learning

Core Concept

The system now uses a Deep Q-Network (DQN) to learn optimal sensitivity levels for trading decisions based on market conditions and trade outcomes. After each completed trade, the system evaluates the performance and creates a learning case for the DQN agent.

Implementation Details

1. Sensitivity Levels (5 levels: 0-4)

sensitivity_levels = {
    0: {'name': 'very_conservative', 'open_threshold_multiplier': 1.5, 'close_threshold_multiplier': 2.0},
    1: {'name': 'conservative', 'open_threshold_multiplier': 1.2, 'close_threshold_multiplier': 1.5},
    2: {'name': 'medium', 'open_threshold_multiplier': 1.0, 'close_threshold_multiplier': 1.0},
    3: {'name': 'aggressive', 'open_threshold_multiplier': 0.8, 'close_threshold_multiplier': 0.7},
    4: {'name': 'very_aggressive', 'open_threshold_multiplier': 0.6, 'close_threshold_multiplier': 0.5}
}

2. Trade Tracking System

  • Active Trades: Tracks open positions with entry conditions
  • Completed Trades: Records full trade lifecycle with outcomes
  • Learning Queue: Stores DQN training cases from completed trades
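
The three collections above could be organized roughly as follows. This is a minimal sketch, not the code in core/enhanced_orchestrator.py: the `TrackedTrade` and `TradeTracker` names, every field name, and the queue size are assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List, Optional


@dataclass
class TrackedTrade:
    # Hypothetical trade record; field names are assumptions.
    symbol: str
    side: str                      # "LONG" or "SHORT"
    entry_price: float
    entry_time: float              # seconds
    entry_confidence: float
    sensitivity_level: int         # level active when the trade was opened
    exit_price: Optional[float] = None
    exit_time: Optional[float] = None
    exit_confidence: Optional[float] = None

    @property
    def pnl_pct(self) -> float:
        """Signed P&L in percent; only defined once the trade is closed."""
        if self.exit_price is None:
            raise ValueError("trade is still open")
        change = (self.exit_price - self.entry_price) / self.entry_price * 100.0
        return change if self.side == "LONG" else -change


class TradeTracker:
    def __init__(self, max_queue: int = 1000) -> None:
        self.active_trades: Dict[str, TrackedTrade] = {}   # keyed by symbol
        self.completed_trades: List[TrackedTrade] = []
        # Bounded queue so old learning cases are dropped automatically.
        self.learning_queue: Deque[TrackedTrade] = deque(maxlen=max_queue)

    def open_trade(self, trade: TrackedTrade) -> None:
        self.active_trades[trade.symbol] = trade

    def close_trade(self, symbol: str, exit_price: float,
                    exit_time: float, exit_confidence: float) -> TrackedTrade:
        # Move the trade from "active" to "completed" and queue it for learning.
        trade = self.active_trades.pop(symbol)
        trade.exit_price = exit_price
        trade.exit_time = exit_time
        trade.exit_confidence = exit_confidence
        self.completed_trades.append(trade)
        self.learning_queue.append(trade)
        return trade
```

Keeping one active trade per symbol is a simplification; a system that holds several positions per symbol would need a different key.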

3. DQN State Vector (15 features)

  • Market volatility (normalized)
  • Price momentum (5-period)
  • Volume ratio
  • RSI indicator
  • MACD signal
  • Bollinger Band position
  • Recent price changes (5 periods)
  • Current sensitivity level
  • Recent performance metrics (avg P&L, win rate, avg duration)
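
The feature list adds up to 15 values: six single indicators, five recent price changes, the current level, and three performance metrics. The sketch below shows one way to assemble them in that order. The function name and the scaling choices (RSI divided by 100, level divided by 4, duration in hours) are assumptions, not the normalization the project actually uses.

```python
from typing import List, Sequence


def build_state(volatility: float, momentum_5: float, volume_ratio: float,
                rsi: float, macd_signal: float, bb_position: float,
                recent_changes: Sequence[float], sensitivity_level: int,
                avg_pnl: float, win_rate: float,
                avg_duration_s: float) -> List[float]:
    """Assemble the 15-feature state in the order listed above."""
    if len(recent_changes) != 5:
        raise ValueError("expected exactly 5 recent price changes")
    state = [
        volatility,
        momentum_5,
        volume_ratio,
        rsi / 100.0,                 # assumed scaling: RSI from 0..100 to 0..1
        macd_signal,
        bb_position,
        *recent_changes,             # 5 values
        sensitivity_level / 4.0,     # assumed scaling: level 0..4 to 0..1
        avg_pnl,
        win_rate,
        avg_duration_s / 3600.0,     # assumed scaling: seconds to hours
    ]
    if len(state) != 15:
        raise AssertionError("state vector must have 15 features")
    return state
```

The length check guards against the state drifting out of sync with the `state_size` setting when features are added later.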

4. Reward Calculation

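import numpy as np

def _calculate_sensitivity_reward(self, completed_trade):
    # Field names below are assumed; adapt them to the actual trade record.
    pnl_pct = completed_trade['pnl_pct']
    duration = completed_trade['duration']              # seconds
    entry_conf = completed_trade['entry_confidence']
    exit_conf = completed_trade['exit_confidence']
    profitable = pnl_pct > 0

    base_reward = pnl_pct * 10  # Scale P&L percentage

    # Duration factor
    if duration < 300: duration_factor = 0.8      # Too quick (under 5 min)
    elif duration < 1800: duration_factor = 1.2   # Good for scalping (5-30 min)
    elif duration < 3600: duration_factor = 1.0   # Acceptable (30-60 min)
    else: duration_factor = 0.7                   # Too slow (over 1 hour)

    # Confidence factor: average both confidences on a win, exit confidence on a loss
    conf_factor = (entry_conf + exit_conf) / 2 if profitable else exit_conf

    final_reward = base_reward * duration_factor * conf_factor
    return float(np.clip(final_reward, -2.0, 2.0))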
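
To make the arithmetic concrete, here is a standalone version of the same formula with two worked cases. It is a sketch: the function name and argument list are illustrative, `pnl_pct > 0` is assumed to define "profitable", and the clipping uses `min`/`max` instead of NumPy to keep the example dependency-free.

```python
def sensitivity_reward(pnl_pct: float, duration_s: float,
                       entry_conf: float, exit_conf: float) -> float:
    """Reward = scaled P&L * duration factor * confidence factor, clipped to [-2, 2]."""
    base_reward = pnl_pct * 10

    if duration_s < 300:
        duration_factor = 0.8
    elif duration_s < 1800:
        duration_factor = 1.2
    elif duration_s < 3600:
        duration_factor = 1.0
    else:
        duration_factor = 0.7

    profitable = pnl_pct > 0          # assumption: any positive P&L counts as a win
    conf_factor = (entry_conf + exit_conf) / 2 if profitable else exit_conf

    reward = base_reward * duration_factor * conf_factor
    return max(-2.0, min(2.0, reward))


# Winning 10-minute trade: 0.2 * 10 = 2.0, times 1.2, times (0.8 + 0.6) / 2 = 0.7
win = sensitivity_reward(0.2, 600, 0.8, 0.6)      # about 1.68

# Losing trade held over an hour: -0.5 * 10 = -5.0, times 0.7, times 0.9 = -3.15
loss = sensitivity_reward(-0.5, 4000, 0.7, 0.9)   # clipped to -2.0
```

Because of the clipping, any trade whose scaled product exceeds 2 in magnitude yields the same reward, so large wins and losses are not distinguished beyond that point.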

5. Dynamic Threshold Adjustment

  • Opening Positions: Higher thresholds (more conservative)
  • Closing Positions: Lower thresholds (more sensitive to exit signals)
  • Real-time Adaptation: DQN continuously adjusts sensitivity based on market conditions
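
One plausible reading of the multiplier names is that each level scales the base opening and closing thresholds from the configuration. The sketch below assumes exactly that, using the base values 0.5 and 0.25 from the Configuration section; the function name and the returned dictionary layout are illustrative.

```python
# (name, open_threshold_multiplier, close_threshold_multiplier) per level
SENSITIVITY_LEVELS = {
    0: ("very_conservative", 1.5, 2.0),
    1: ("conservative", 1.2, 1.5),
    2: ("medium", 1.0, 1.0),
    3: ("aggressive", 0.8, 0.7),
    4: ("very_aggressive", 0.6, 0.5),
}


def thresholds_for_level(level: int, base_open: float = 0.5,
                         base_close: float = 0.25) -> dict:
    """Scale the base thresholds by the multipliers of the chosen level."""
    name, open_mult, close_mult = SENSITIVITY_LEVELS[level]
    return {
        "level_name": name.upper(),
        "open_threshold": round(base_open * open_mult, 3),
        "close_threshold": round(base_close * close_mult, 3),
    }


medium = thresholds_for_level(2)          # open 0.5, close 0.25
conservative = thresholds_for_level(0)    # open 0.75, close 0.5
aggressive = thresholds_for_level(4)      # open 0.3, close 0.125
```

At every level the closing threshold stays below the opening threshold, which matches the intent that exits react to weaker signals than entries.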

Files Modified

  • core/enhanced_orchestrator.py: Added sensitivity learning methods
  • core/config.py: Added confidence_threshold_close parameter
  • web/scalping_dashboard.py: Added sensitivity info display
  • NN/models/dqn_agent.py: Existing DQN agent used for sensitivity learning

📊 300s Data Preloading

Core Concept

The system now preloads 300 seconds' worth of data for every symbol and timeframe on first load, which speeds up initial chart rendering and reduces latency for the first trading decisions.

Implementation Details

1. Smart Preloading Logic

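def _should_preload_data(self, symbol: str, timeframe: str, limit: int) -> bool:
    # Skip preloading if we already have cached data.
    # `_get_cached_data` is a placeholder name for the provider's cache lookup.
    cached_data = self._get_cached_data(symbol, timeframe)
    if cached_data is not None and len(cached_data) > 0:
        return False

    # Calculate candles needed for 300s (unknown timeframes default to 60s)
    timeframe_seconds = self.timeframe_seconds.get(timeframe, 60)
    candles_in_300s = 300 // timeframe_seconds

    # Preload when the 300s window holds more candles than requested,
    # and always for the 1s and 1m timeframes
    return candles_in_300s > limit or timeframe in ['1s', '1m']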
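
The integer division in this check is worth tracing per timeframe. The standalone sketch below reproduces the decision with an assumed `TIMEFRAME_SECONDS` table and a boolean in place of the cache lookup; both are illustrative stand-ins.

```python
# Assumed mapping; the real provider may support other timeframes.
TIMEFRAME_SECONDS = {"1s": 1, "1m": 60, "5m": 300, "1h": 3600}


def should_preload(timeframe: str, limit: int, has_cached_data: bool = False) -> bool:
    """Mirror of the preload decision, with the cache state passed in explicitly."""
    if has_cached_data:
        return False
    timeframe_seconds = TIMEFRAME_SECONDS.get(timeframe, 60)
    candles_in_300s = 300 // timeframe_seconds   # 1s: 300, 1m: 5, 5m: 1, 1h: 0
    return candles_in_300s > limit or timeframe in ("1s", "1m")


decisions = {tf: should_preload(tf, limit=100) for tf in TIMEFRAME_SECONDS}
```

With `limit=100`, only `1s` passes the candle-count test (300 > 100). `1m` is preloaded solely because it is listed explicitly, and `5m` and `1h` are not preloaded, since a 300-second window covers at most one of their candles.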

2. Timeframe-Specific Limits

  • 1s timeframe: Max 300 candles (5 minutes)
  • 1m timeframe: Max 60 candles (1 hour)
  • Other timeframes: Max 500 candles
  • Minimum: Always at least 100 candles

3. Preloading Process

  1. Check if data already exists (cache or memory)
  2. Calculate optimal number of candles for 300s
  3. Fetch data from Binance API
  4. Add technical indicators
  5. Cache data for future use
  6. Store in memory for immediate access
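
The process above can be sketched with the data source injected as a plain callable, so no particular Binance client is implied. Everything here is hypothetical: the `Preloader` class, the `fetch_candles` signature, and the simple moving average that stands in for "technical indicators". The on-disk cache (step 5) is omitted and only the in-memory store is shown.

```python
from typing import Callable, Dict, List, Tuple

Candle = Dict[str, float]
Fetcher = Callable[[str, str, int], List[Candle]]


def add_sma(candles: List[Candle], period: int = 3) -> None:
    """Attach a simple moving average of `close` to each candle in place."""
    closes = [c["close"] for c in candles]
    for i, candle in enumerate(candles):
        # Early candles use a shorter window until `period` closes are available.
        window = closes[max(0, i - period + 1): i + 1]
        candle["sma"] = sum(window) / len(window)


class Preloader:
    def __init__(self, fetch_candles: Fetcher) -> None:
        self._fetch = fetch_candles
        self._memory: Dict[Tuple[str, str], List[Candle]] = {}
        self.fetch_calls = 0

    def preload(self, symbol: str, timeframe: str, limit: int) -> List[Candle]:
        key = (symbol, timeframe)
        if key in self._memory:                  # step 1: data already present
            return self._memory[key]
        # step 2 (choosing `limit`) is left to the caller in this sketch
        candles = self._fetch(symbol, timeframe, limit)   # step 3: fetch
        self.fetch_calls += 1
        add_sma(candles)                         # step 4: indicators
        self._memory[key] = candles              # step 6: keep in memory
        return candles


def fake_fetch(symbol: str, timeframe: str, limit: int) -> List[Candle]:
    # Stand-in for the exchange call: closes 1.0, 2.0, ..., limit
    return [{"close": float(i)} for i in range(1, limit + 1)]


loader = Preloader(fake_fetch)
first = loader.preload("ETH/USDT", "1s", 5)
second = loader.preload("ETH/USDT", "1s", 5)   # served from memory, no second fetch
```

Injecting the fetcher also makes the preloading path testable without network access.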

4. Performance Benefits

  • Faster Initial Load: Charts populate immediately
  • Reduced API Calls: Bulk loading vs individual requests
  • Better User Experience: No waiting for data on first load
  • Improved Trading Decisions: More historical context available

Files Modified

  • core/data_provider.py: Added preloading methods
  • web/scalping_dashboard.py: Integrated preloading in initialization

🎨 Enhanced Dashboard Features

1. Color-Coded Position Display

  • LONG positions: Green text with [LONG] prefix
  • SHORT positions: Red text with [SHORT] prefix
  • Format: [SIDE] size @ $entry_price | P&L: $unrealized_pnl
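
A small formatter for this display line might look like the sketch below. The function name, the two-decimal price formatting, and returning the color as a plain string are assumptions; the dashboard may apply colors through its own styling mechanism.

```python
from typing import Tuple


def format_position(side: str, size: float, entry_price: float,
                    unrealized_pnl: float) -> Tuple[str, str]:
    """Return (display text, color name) for one open position."""
    side = side.upper()
    if side not in ("LONG", "SHORT"):
        raise ValueError(f"unknown side: {side!r}")
    color = "green" if side == "LONG" else "red"
    text = f"[{side}] {size} @ ${entry_price:.2f} | P&L: ${unrealized_pnl:.2f}"
    return text, color


text, color = format_position("long", 0.1, 2500.0, 12.5)
# text  -> "[LONG] 0.1 @ $2500.00 | P&L: $12.50"
# color -> "green"
```

Note that the color follows the position side, not the sign of the P&L, as the list above specifies.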

2. Enhanced Model Training Status

Now displays three columns:

  • RL Training: Queue size, win rate, actions
  • CNN Training: Perfect moves, confidence, retrospective learning
  • DQN Sensitivity: Current level, completed trades, learning queue, thresholds

3. Sensitivity Learning Info

{
    'level_name': 'MEDIUM',           # Current sensitivity level
    'completed_trades': 15,           # Number of completed trades
    'learning_queue_size': 8,         # DQN training queue size
    'open_threshold': 0.600,          # Current opening threshold
    'close_threshold': 0.250          # Current closing threshold
}

🧪 Testing & Verification

Test Script: test_sensitivity_learning.py

Comprehensive test suite covering:

  1. 300s Data Preloading: Verifies preloading functionality
  2. Sensitivity Learning Initialization: Checks system setup
  3. Trading Scenario Simulation: Tests learning case creation
  4. Threshold Adjustment: Verifies dynamic threshold changes
  5. Dashboard Integration: Tests UI components
  6. DQN Training Simulation: Verifies neural network training

Running Tests

python test_sensitivity_learning.py

Expected output:

🎯 SENSITIVITY LEARNING SYSTEM READY!
Features verified:
  ✅ DQN RL-based sensitivity learning from completed trades
  ✅ 300s data preloading for faster initial performance
  ✅ Dynamic threshold adjustment (lower for closing positions)
  ✅ Color-coded position display ([LONG] green, [SHORT] red)
  ✅ Enhanced model training status with sensitivity info

🚀 Usage Instructions

1. Start the Enhanced Dashboard

python run_enhanced_scalping_dashboard.py

2. Monitor Sensitivity Learning

  • Watch the "DQN Sensitivity" section in the dashboard
  • Observe threshold adjustments as trades complete
  • Monitor learning queue size for training activity

3. Verify Data Preloading

  • Check console logs for preloading status
  • Observe faster initial chart population
  • Monitor reduced API call frequency

📈 Expected Benefits

1. Improved Trading Performance

  • Adaptive Sensitivity: System learns optimal aggressiveness levels
  • Better Exit Timing: Lower thresholds for closing positions
  • Market-Aware Decisions: Sensitivity adjusts to market conditions

2. Enhanced User Experience

  • Faster Startup: 300s preloading reduces initial wait time
  • Visual Clarity: Color-coded positions improve readability
  • Better Monitoring: Enhanced status displays provide more insight

3. System Intelligence

  • Continuous Learning: DQN improves over time
  • Retrospective Analysis: Detects "perfect move" opportunities in hindsight and uses them for training
  • Performance Optimization: Automatic threshold tuning

🔧 Configuration

Key Parameters

orchestrator:
  confidence_threshold: 0.5        # Base opening threshold
  confidence_threshold_close: 0.25 # Base closing threshold (much lower)
  
sensitivity_learning:
  enabled: true
  state_size: 15
  action_space: 5
  learning_rate: 0.001
  gamma: 0.95
  epsilon: 0.3
  batch_size: 32
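
In a typical DQN setup, `epsilon` is the exploration rate and `action_space` is the number of discrete actions, here the five sensitivity levels. Assuming the agent follows the usual epsilon-greedy scheme (the source does not confirm this), action selection would look roughly like the sketch below; the function name is hypothetical.

```python
import random
from typing import Optional, Sequence


def select_sensitivity_level(q_values: Sequence[float], epsilon: float = 0.3,
                             rng: Optional[random.Random] = None) -> int:
    """Epsilon-greedy choice over the sensitivity levels (one Q-value per level)."""
    rng = rng or random.Random()
    if rng.random() < epsilon:
        # Explore: pick a random level with probability epsilon.
        return rng.randrange(len(q_values))
    # Exploit: pick the level with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


q = [0.1, 0.5, 0.2, 0.9, 0.3]          # illustrative Q-values for levels 0..4
greedy = select_sensitivity_level(q, epsilon=0.0)   # always level 3
```

With `epsilon: 0.3`, about three in ten decisions would be random levels, which is a fairly high exploration rate for a system trading live; whether the value decays over time is not stated in the configuration shown.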

📝 Next Steps

  1. Monitor Performance: Track sensitivity learning effectiveness
  2. Tune Parameters: Adjust DQN hyperparameters based on results
  3. Expand Features: Add more market indicators to state vector
  4. Optimize Preloading: Fine-tune preloading amounts per timeframe
  5. Add Persistence: Save/load DQN models between sessions

🎯 Success Metrics

  • Sensitivity Adaptation: DQN successfully adjusts sensitivity levels
  • Improved Win Rate: Better trade outcomes through learned sensitivity
  • Faster Startup: <5 seconds for full data preloading
  • Reduced Latency: Immediate chart updates on dashboard load
  • User Satisfaction: Clear visual feedback and status information

Together, these changes let the system adapt its trade sensitivity from completed trades, start faster through 300s preloading, and show clearer position and training status on the dashboard.