gogo2/.kiro/specs/1.multi-modal-trading-system/tasks.md
Dobromir Popov 0c28a0997c more cleanup
2025-10-13 16:11:06 +03:00


Implementation Plan

Data Provider Backbone Enhancement

Phase 1: Core Data Provider Enhancements

  • 1. Audit and validate existing DataProvider implementation

    • Review core/data_provider.py for completeness and correctness
    • Validate 1500-candle caching is working correctly
    • Verify automatic data maintenance worker is updating properly
    • Test fallback mechanisms between Binance and MEXC
    • Document any gaps or issues found
    • Requirements: 1.1, 1.2, 1.6
  • 1.1. Enhance COB data collection robustness

    • Fix 'NoneType' object has no attribute 'append' errors in _cob_aggregation_worker
    • Add defensive checks before accessing deque structures
    • Implement proper initialization guards to prevent duplicate COB collection starts
    • Add comprehensive error logging for COB data processing failures
    • Test COB collection under various failure scenarios
    • Requirements: 1.3, 1.6
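The defensive checks and initialization guards above could be sketched as follows. This is a minimal illustration, not the project's actual `_cob_aggregation_worker`; the `COBAggregator` class and its method names are hypothetical. The key ideas are a `defaultdict` of deques (so `.append()` can never hit a `None` attribute) and a lock-protected start guard.

```python
from collections import deque, defaultdict
from threading import Lock

class COBAggregator:
    """Hypothetical sketch of defensive deque handling for COB aggregation."""

    def __init__(self, maxlen: int = 1000):
        self._lock = Lock()
        # defaultdict guarantees a deque always exists for a symbol,
        # so .append() can never raise on a None attribute
        self._ticks = defaultdict(lambda: deque(maxlen=maxlen))
        self._started = False

    def start(self) -> bool:
        """Initialization guard: refuse duplicate starts of COB collection."""
        with self._lock:
            if self._started:
                return False
            self._started = True
            return True

    def add_tick(self, symbol: str, tick: dict) -> None:
        with self._lock:
            self._ticks[symbol].append(tick)  # safe even for unseen symbols

agg = COBAggregator(maxlen=2)
assert agg.start() is True
assert agg.start() is False          # duplicate start rejected
agg.add_tick("ETH/USDT", {"price": 2500.0})
agg.add_tick("ETH/USDT", {"price": 2501.0})
agg.add_tick("ETH/USDT", {"price": 2502.0})
assert len(agg._ticks["ETH/USDT"]) == 2   # maxlen evicts the oldest tick
```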
  • 1.2. Implement configurable COB price ranges

    • Replace hardcoded price ranges ($5 ETH, $50 BTC) with configuration
    • Add _get_price_range_for_symbol() configuration support
    • Allow per-symbol price range customization via config.yaml
    • Update COB imbalance calculations to use configurable ranges
    • Document price range selection rationale
    • Requirements: 1.4, 1.1
  • 1.3. Validate and enhance Williams Market Structure pivot calculation

    • Review williams_market_structure.py implementation
    • Verify 5-level pivot detection is working correctly
    • Test monthly 1s data analysis for comprehensive context
    • Add unit tests for pivot point detection accuracy
    • Optimize pivot calculation performance if needed
    • Requirements: 1.5, 2.7
  • 1.4. Implement COB heatmap matrix generation

    • Create get_cob_heatmap_matrix() method in DataProvider
    • Generate time x price matrix for visualization and model input
    • Support configurable time windows (default 300 seconds)
    • Support configurable price bucket radius (default ±10 buckets)
    • Support multiple metrics (imbalance, volume, spread)
    • Cache heatmap data for performance
    • Requirements: 1.4, 1.1
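The time x price matrix could look like the sketch below, assuming per-second snapshots keyed by bucket offset from the mid price. The input layout and function name are assumptions; the real `get_cob_heatmap_matrix()` may organize buckets differently:

```python
import numpy as np

def build_cob_heatmap_matrix(snapshots, mid_prices, bucket_radius=10):
    """Sketch: build a (time x price-bucket) matrix of COB imbalance.

    snapshots: one dict per second mapping bucket offset
    (-bucket_radius..+bucket_radius) -> imbalance. Layout is hypothetical."""
    n_buckets = 2 * bucket_radius + 1
    matrix = np.zeros((len(snapshots), n_buckets), dtype=np.float32)
    for t, snap in enumerate(snapshots):
        for offset, imbalance in snap.items():
            if -bucket_radius <= offset <= bucket_radius:
                matrix[t, offset + bucket_radius] = imbalance  # center at mid
    return matrix, np.asarray(mid_prices, dtype=np.float64)

snaps = [{0: 0.2, -1: -0.1}, {0: 0.3, 10: 0.5}]
m, mids = build_cob_heatmap_matrix(snaps, [2500.0, 2500.5])
assert m.shape == (2, 21)                       # ±10 buckets -> 21 columns
assert abs(float(m[0, 10]) - 0.2) < 1e-6        # mid bucket, t=0
assert float(m[1, 20]) == 0.5                   # +10 bucket, t=1
```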
  • 1.5. Enhance EnhancedCOBWebSocket reliability

    • Review enhanced_cob_websocket.py for stability issues
    • Verify proper order book synchronization with REST snapshots
    • Test reconnection logic with exponential backoff
    • Ensure 24-hour connection limit compliance
    • Add comprehensive error handling for all WebSocket streams
    • Requirements: 1.3, 1.6

Phase 2: StandardizedDataProvider Enhancements

  • 2. Implement comprehensive BaseDataInput validation

    • Enhance validate() method in BaseDataInput dataclass
    • Add minimum frame count validation (100 frames per timeframe)
    • Implement data completeness scoring (0.0 to 1.0)
    • Add COB data validation (non-null, valid buckets)
    • Create detailed validation error messages
    • Prevent model inference on incomplete data (completeness < 0.8)
    • Requirements: 1.1.2, 1.1.6
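The frame-count validation and completeness scoring above might be sketched like this. The names (`validate_frames`, `ValidationResult`) and the scoring formula (per-timeframe fraction of the 300-frame target, averaged) are illustrative assumptions, not the existing `BaseDataInput.validate()`:

```python
from dataclasses import dataclass, field

MIN_FRAMES = 100           # minimum candles per timeframe
MIN_COMPLETENESS = 0.8     # inference is blocked below this score

@dataclass
class ValidationResult:
    ok: bool
    completeness: float
    errors: list = field(default_factory=list)

def validate_frames(frames_by_tf: dict, required_tfs=("1s", "1m", "1h", "1d"),
                    target: int = 300) -> ValidationResult:
    """Sketch: score each timeframe against a 300-frame target (0.0 to 1.0),
    flag any timeframe under the 100-frame minimum with a detailed error."""
    errors, scores = [], []
    for tf in required_tfs:
        n = len(frames_by_tf.get(tf, []))
        if n < MIN_FRAMES:
            errors.append(f"{tf}: {n} frames < required {MIN_FRAMES}")
        scores.append(min(n / target, 1.0))
    completeness = sum(scores) / len(scores)
    return ValidationResult(ok=not errors and completeness >= MIN_COMPLETENESS,
                            completeness=completeness, errors=errors)

full = {tf: list(range(300)) for tf in ("1s", "1m", "1h", "1d")}
assert validate_frames(full).ok
partial = {**full, "1d": list(range(50))}
res = validate_frames(partial)
assert not res.ok and "1d" in res.errors[0]   # inference would be blocked
```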
  • 2.1. Integrate COB heatmap into BaseDataInput

    • Add cob_heatmap_times, cob_heatmap_prices, cob_heatmap_values fields
    • Call get_cob_heatmap_matrix() in get_base_data_input()
    • Handle heatmap generation failures gracefully
    • Store heatmap mid_prices in market_microstructure
    • Document heatmap usage for models
    • Requirements: 1.1.1, 1.4
  • 2.2. Enhance COB moving average calculation

    • Review _calculate_cob_moving_averages() for correctness
    • Fix bucket quantization to match COB snapshot buckets
    • Implement nearest-key matching for historical imbalance lookup
    • Add thread-safe access to cob_imbalance_history
    • Optimize MA calculation performance
    • Requirements: 1.1.3, 1.4
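Nearest-key matching for the historical imbalance lookup could work as below; the history layout (timestamp -> imbalance) and tolerance parameter are assumptions for illustration:

```python
import bisect

def nearest_imbalance(history: dict, ts: float, tolerance: float = 1.0):
    """Sketch: return the imbalance whose timestamp is closest to ts,
    but only if it lies within `tolerance` seconds; otherwise None."""
    keys = sorted(history)
    if not keys:
        return None
    i = bisect.bisect_left(keys, ts)
    candidates = keys[max(i - 1, 0): i + 1]      # neighbors around ts
    best = min(candidates, key=lambda k: abs(k - ts))
    return history[best] if abs(best - ts) <= tolerance else None

hist = {100.0: 0.1, 101.0: 0.2, 105.0: -0.3}
assert nearest_imbalance(hist, 100.9) == 0.2    # 101.0 is nearest
assert nearest_imbalance(hist, 103.0) is None   # nothing within 1s
assert nearest_imbalance(hist, 105.4) == -0.3
```

In a production version the sorted key list would be maintained incrementally (and accessed under a lock, per the thread-safety item above) rather than rebuilt per lookup.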
  • 2.3. Implement data quality scoring system

    • Create data_quality_score() method
    • Score based on: data completeness, freshness, consistency
    • Add quality thresholds for model inference
    • Log quality metrics for monitoring
    • Provide quality breakdown in BaseDataInput
    • Requirements: 1.1.2, 1.1.6
  • 2.4. Enhance live price fetching robustness

    • Review get_live_price_from_api() fallback chain
    • Add retry logic with exponential backoff
    • Implement circuit breaker for repeated API failures
    • Cache prices with configurable TTL (default 500ms)
    • Log price source for debugging
    • Requirements: 1.6, 1.7
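The circuit breaker for repeated API failures might follow this shape (threshold and cooldown values, and the class API, are illustrative assumptions):

```python
import time

class CircuitBreaker:
    """Minimal sketch: open the circuit after `max_failures` consecutive
    failures; reject calls until `cooldown` seconds have elapsed."""

    def __init__(self, max_failures: int = 3, cooldown: float = 30.0):
        self.max_failures, self.cooldown = max_failures, cooldown
        self.failures, self.opened_at = 0, None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at >= self.cooldown:
            self.opened_at, self.failures = None, 0   # half-open: retry once
            return True
        return False

    def record(self, success: bool) -> None:
        if success:
            self.failures = 0
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()

cb = CircuitBreaker(max_failures=2, cooldown=60.0)
assert cb.allow()
cb.record(False); cb.record(False)    # two failures -> circuit opens
assert not cb.allow()                 # API calls are skipped while open
```

A caller in the `get_live_price_from_api()` fallback chain would check `allow()` before each exchange and fall through to the next source when the breaker is open.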

Phase 3: COBY Integration

  • 3. Design unified interface between COBY and core DataProvider

    • Define clear boundaries between COBY and core systems
    • Create adapter layer for accessing COBY data from core
    • Design data flow for multi-exchange aggregation
    • Plan migration path for existing code
    • Document integration architecture
    • Requirements: 1.10, 8.1
  • 3.1. Implement COBY data access adapter

    • Create COBYDataAdapter class in core/
    • Implement methods to query COBY TimescaleDB
    • Add Redis cache integration for performance
    • Support historical data retrieval from COBY
    • Handle COBY unavailability gracefully
    • Requirements: 1.10, 8.1
  • 3.2. Integrate COBY heatmap data

    • Query COBY for multi-exchange heatmap data
    • Merge COBY heatmaps with core COB heatmaps
    • Provide unified heatmap interface to models
    • Support exchange-specific heatmap filtering
    • Cache merged heatmaps for performance
    • Requirements: 1.4, 3.1
  • 3.3. Implement COBY health monitoring

    • Add COBY connection status to DataProvider
    • Monitor COBY API availability
    • Track COBY data freshness
    • Alert on COBY failures
    • Provide COBY status in dashboard
    • Requirements: 1.6, 8.5

Phase 4: Model Output Management

  • 4. Enhance ModelOutputManager functionality

    • Review model_output_manager.py implementation
    • Verify extensible ModelOutput format is working
    • Test cross-model feeding with hidden states
    • Validate historical output storage (1000 entries)
    • Optimize query performance by model_name, symbol, timestamp
    • Requirements: 1.10, 8.2
  • 4.1. Implement model output persistence

    • Add disk-based storage for model outputs
    • Support configurable retention policies
    • Implement efficient serialization (pickle/msgpack)
    • Add compression for storage optimization
    • Support output replay for backtesting
    • Requirements: 1.10, 5.7
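One candidate on-disk format combining serialization and compression, shown here with zlib over JSON (the plan also mentions pickle/msgpack; function names are hypothetical):

```python
import json
import zlib

def serialize_output(output: dict) -> bytes:
    """Sketch: compress a model output record for disk storage."""
    return zlib.compress(json.dumps(output, sort_keys=True).encode())

def deserialize_output(blob: bytes) -> dict:
    """Inverse of serialize_output, e.g. for backtest replay."""
    return json.loads(zlib.decompress(blob))

out = {"model": "enhanced_cnn", "action": "BUY", "confidence": 0.82,
       "hidden": [0.0] * 256}
blob = serialize_output(out)
assert len(blob) < len(json.dumps(out).encode())   # compression pays off
assert deserialize_output(blob) == out             # lossless round trip
```

Note that JSON round-trips floats but not arbitrary Python objects (e.g. tensors); pickle or msgpack would be needed for richer hidden-state payloads.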
  • 4.2. Create model output analytics

    • Track prediction accuracy over time
    • Calculate model agreement/disagreement metrics
    • Identify model performance patterns
    • Generate model comparison reports
    • Visualize model outputs in dashboard
    • Requirements: 5.8, 10.7

Phase 5: Testing and Validation

  • 5. Create comprehensive data provider tests

    • Write unit tests for DataProvider core functionality
    • Test automatic data maintenance worker
    • Test COB aggregation and imbalance calculations
    • Test Williams pivot point detection
    • Test StandardizedDataProvider validation
    • Requirements: 8.1, 8.2
  • 5.1. Implement integration tests

    • Test end-to-end data flow from WebSocket to models
    • Test COBY integration (when implemented)
    • Test model output storage and retrieval
    • Test data provider under load
    • Test failure scenarios and recovery
    • Requirements: 8.2, 8.3
  • 5.2. Create data provider performance benchmarks

    • Measure data collection latency
    • Measure COB aggregation performance
    • Measure BaseDataInput creation time
    • Identify performance bottlenecks
    • Optimize critical paths
    • Requirements: 8.4
  • 5.3. Document data provider architecture

    • Create comprehensive architecture documentation
    • Document data flow diagrams
    • Document configuration options
    • Create troubleshooting guide
    • Add code examples for common use cases
    • Requirements: 8.1, 8.2

Enhanced CNN Model Implementation

  • 6. Enhance the existing CNN model with standardized inputs/outputs

    • Extend the current implementation in NN/models/enhanced_cnn.py
    • Accept standardized COB+OHLCV data frame: 300 frames per timeframe (1s, 1m, 1h, 1d) for ETH plus 300 seconds of 1s BTC data
    • Include COB ±20 buckets and MA (1s,5s,15s,60s) of COB imbalance ±5 buckets
    • Output BUY/SELL trading action with confidence scores
    • Requirements: 2.1, 2.2, 2.8, 1.10
  • 6.1. Implement CNN inference with standardized input format

    • Accept BaseDataInput with standardized COB+OHLCV format
    • Process 300 frames of multi-timeframe data with COB buckets
    • Output BUY/SELL recommendations with confidence scores
    • Make hidden layer states available for cross-model feeding
    • Optimize inference performance for real-time processing
    • Requirements: 2.2, 2.6, 2.8, 4.3
  • 6.2. Enhance CNN training pipeline with checkpoint management

    • Integrate with checkpoint manager for training progress persistence
    • Store top 5-10 best checkpoints based on performance metrics
    • Automatically load best checkpoint at startup
    • Implement training triggers based on orchestrator feedback
    • Store metadata with checkpoints for performance tracking
    • Requirements: 2.4, 2.5, 5.2, 5.3, 5.7
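The top 5-10 checkpoint retention could be driven by a small min-heap ranker like the sketch below; the class and method names are hypothetical, and the actual checkpoint manager presumably also handles file I/O and metadata:

```python
import heapq

class CheckpointRanker:
    """Sketch: keep only the top-N checkpoints by a performance metric."""

    def __init__(self, keep: int = 10):
        self.keep = keep
        self._heap = []          # min-heap of (score, path): worst on top

    def add(self, path: str, score: float) -> list:
        """Register a checkpoint; return paths that should be deleted."""
        heapq.heappush(self._heap, (score, path))
        evicted = []
        while len(self._heap) > self.keep:
            _, worst = heapq.heappop(self._heap)   # drop the lowest score
            evicted.append(worst)
        return evicted

    def best(self) -> str:
        """Path of the highest-scoring checkpoint, e.g. to load at startup."""
        return max(self._heap)[1]

r = CheckpointRanker(keep=2)
assert r.add("ckpt_a.pt", 0.5) == []
assert r.add("ckpt_b.pt", 0.9) == []
assert r.add("ckpt_c.pt", 0.7) == ["ckpt_a.pt"]   # worst is evicted
assert r.best() == "ckpt_b.pt"
```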
  • 6.3. Implement CNN model evaluation and checkpoint optimization

    • Create evaluation methods using standardized input/output format
    • Implement performance metrics for checkpoint ranking
    • Add validation against historical trading outcomes
    • Support automatic checkpoint cleanup (keep only top performers)
    • Track model improvement over time through checkpoint metadata
    • Requirements: 2.5, 5.8, 4.4

Enhanced RL Model Implementation

  • 7. Enhance the existing RL model with standardized inputs/outputs

    • Extend the current implementation in NN/models/dqn_agent.py
    • Accept standardized COB+OHLCV data frame: 300 frames per timeframe (1s, 1m, 1h, 1d) for ETH plus 300 seconds of 1s BTC data
    • Include COB ±20 buckets and MA (1s,5s,15s,60s) of COB imbalance ±5 buckets
    • Output BUY/SELL trading action with confidence scores
    • Requirements: 3.1, 3.2, 3.7, 1.10
  • 7.1. Implement RL inference with standardized input format

    • Accept BaseDataInput with standardized COB+OHLCV format
    • Process CNN hidden states and predictions as part of state input
    • Output BUY/SELL recommendations with confidence scores
    • Include expected rewards and value estimates in output
    • Optimize inference performance for real-time processing
    • Requirements: 3.2, 3.7, 4.3
  • 7.2. Enhance RL training pipeline with checkpoint management

    • Integrate with checkpoint manager for training progress persistence
    • Store top 5-10 best checkpoints based on trading performance metrics
    • Automatically load best checkpoint at startup
    • Implement experience replay with profitability-based prioritization
    • Store metadata with checkpoints for performance tracking
    • Requirements: 3.3, 3.5, 5.4, 5.7, 4.4
  • 7.3. Implement RL model evaluation and checkpoint optimization

    • Create evaluation methods using standardized input/output format
    • Implement trading performance metrics for checkpoint ranking
    • Add validation against historical trading opportunities
    • Support automatic checkpoint cleanup (keep only top performers)
    • Track model improvement over time through checkpoint metadata
    • Requirements: 3.3, 5.8, 4.4

Enhanced Orchestrator Implementation

  • 8. Enhance the existing orchestrator with centralized coordination

    • Extend the current implementation in core/orchestrator.py
    • Implement DataSubscriptionManager for multi-rate data streams
    • Add ModelInferenceCoordinator for cross-model coordination
    • Create ModelOutputStore for extensible model output management
    • Add TrainingPipelineManager for continuous learning coordination
    • Requirements: 4.1, 4.2, 4.5, 8.1
  • 8.1. Implement data subscription and management system

    • Create DataSubscriptionManager class
    • Subscribe to 10Hz COB data, OHLCV, market ticks, and technical indicators
    • Implement intelligent caching for "last updated" data serving
    • Maintain synchronized base dataframe across different refresh rates
    • Add thread-safe access to multi-rate data streams
    • Requirements: 4.1, 1.6, 8.5
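The "last updated" caching across streams with different refresh rates could be as simple as a lock-protected latest-value map; the class below is an illustrative sketch, not the planned `DataSubscriptionManager` API:

```python
from threading import RLock

class DataSubscriptionCache:
    """Sketch: each stream publishes at its own rate (10Hz COB, 1m OHLCV, ...);
    readers always receive the most recent snapshot of every stream."""

    def __init__(self):
        self._lock = RLock()
        self._latest = {}    # stream name -> (timestamp, payload)

    def publish(self, stream: str, ts: float, payload) -> None:
        with self._lock:
            self._latest[stream] = (ts, payload)

    def snapshot(self) -> dict:
        """One synchronized view across all streams, safe to read concurrently."""
        with self._lock:
            return dict(self._latest)

cache = DataSubscriptionCache()
cache.publish("cob_10hz", 1.0, {"imbalance": 0.1})
cache.publish("ohlcv_1m", 60.0, {"close": 2500.0})
cache.publish("cob_10hz", 1.1, {"imbalance": 0.2})   # overwrites older tick
snap = cache.snapshot()
assert snap["cob_10hz"] == (1.1, {"imbalance": 0.2})
assert set(snap) == {"cob_10hz", "ohlcv_1m"}
```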
  • 8.2. Implement model inference coordination

    • Create ModelInferenceCoordinator class
    • Trigger model inference based on data availability and requirements
    • Coordinate parallel inference execution for independent models
    • Handle model dependencies (e.g., RL waiting for CNN hidden states)
    • Assemble appropriate input data for each model type
    • Requirements: 4.2, 3.1, 2.1
  • 8.3. Implement model output storage and cross-feeding

    • Create ModelOutputStore class using standardized ModelOutput format
    • Store CNN predictions, confidence scores, and hidden layer states
    • Store RL action recommendations and value estimates
    • Support extensible storage for LSTM, Transformer, and future models
    • Implement cross-model feeding of hidden states and predictions
    • Include "last predictions" from all models in base data input
    • Requirements: 4.3, 1.10, 8.2
  • 8.4. Implement training pipeline management

    • Create TrainingPipelineManager class
    • Call each model's training pipeline with prediction-result pairs
    • Manage training data collection and labeling
    • Coordinate online learning updates based on real-time performance
    • Track prediction accuracy and trigger retraining when needed
    • Requirements: 4.4, 5.2, 5.4, 5.7
  • 8.5. Implement enhanced decision-making with MoE

    • Create enhanced DecisionMaker class
    • Implement Mixture of Experts approach for model integration
    • Apply confidence-based filtering to avoid uncertain trades
    • Support configurable thresholds for buy/sell decisions
    • Consider market conditions and risk parameters in decisions
    • Requirements: 4.5, 4.8, 6.7
  • 8.6. Implement extensible model integration architecture

    • Create MoEGateway class supporting dynamic model addition
    • Support CNN, RL, LSTM, Transformer model types without architecture changes
    • Implement model versioning and rollback capabilities
    • Handle model failures and fallback mechanisms
    • Provide model performance monitoring and alerting
    • Requirements: 4.6, 8.2, 8.3

Model Inference Data Validation and Storage

  • 9. Implement comprehensive inference data validation system

    • Create InferenceDataValidator class for input validation
    • Validate complete OHLCV dataframes for all required timeframes
    • Check input data dimensions against model requirements
    • Log missing components and prevent prediction on incomplete data
    • Requirements: 9.1, 9.2, 9.3, 9.4
  • 9.1. Implement input data validation for all models

    • Create validation methods for CNN, RL, and future model inputs
    • Validate OHLCV data completeness (300 frames for 1s, 1m, 1h, 1d)
    • Validate COB data structure (±20 buckets, MA calculations)
    • Raise specific validation errors with expected vs actual dimensions
    • Ensure validation occurs before any model inference
    • Requirements: 9.1, 9.4
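The "expected vs actual dimensions" error might look like this sketch (the exception class and helper are hypothetical names):

```python
class InputDimensionError(ValueError):
    """Sketch of the specific validation error the plan calls for."""

    def __init__(self, name: str, expected: tuple, actual: tuple):
        super().__init__(f"{name}: expected shape {expected}, got {actual}")
        self.expected, self.actual = expected, actual

def check_dims(name: str, actual: tuple, expected: tuple) -> None:
    """Run before any model inference; raises rather than silently truncating."""
    if actual != expected:
        raise InputDimensionError(name, expected, actual)

check_dims("ohlcv_1s", (300, 5), (300, 5))         # passes silently
raised = False
try:
    check_dims("cob_buckets", (35,), (41,))         # ±20 buckets -> 41 expected
except InputDimensionError as e:
    raised = "expected shape (41,)" in str(e)
assert raised
```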
  • 9.2. Implement persistent inference history storage

    • Create InferenceHistoryStore class for persistent storage
    • Store complete input data packages with each prediction
    • Include timestamp, symbol, input features, prediction outputs, confidence scores
    • Store model internal states for cross-model feeding
    • Implement compressed storage to minimize footprint
    • Requirements: 9.5, 9.6
  • 9.3. Implement inference history query and retrieval system

    • Create efficient query mechanisms by symbol, timeframe, and date range
    • Implement data retrieval for training pipeline consumption
    • Add data completeness metrics and validation results in storage
    • Handle storage failures gracefully without breaking prediction flow
    • Requirements: 9.7, 11.6

Inference-Training Feedback Loop Implementation

  • 10. Implement prediction outcome evaluation system

    • Create PredictionOutcomeEvaluator class
    • Evaluate prediction accuracy against actual price movements
    • Create training examples using stored inference data and actual outcomes
    • Feed prediction-result pairs back to respective models
    • Requirements: 10.1, 10.2, 10.3
  • 10.1. Implement adaptive learning signal generation

    • Create positive reinforcement signals for accurate predictions
    • Generate corrective training signals for inaccurate predictions
    • Retrieve last inference data for each model for outcome comparison
    • Implement model-specific learning signal formats
    • Requirements: 10.4, 10.5, 10.6
  • 10.2. Implement continuous improvement tracking

    • Track and report accuracy improvements/degradations over time
    • Monitor model learning progress through feedback loop
    • Create performance metrics for inference-training effectiveness
    • Generate alerts for learning regression or stagnation
    • Requirements: 10.7

Inference History Management and Monitoring

  • 11. Implement comprehensive inference logging and monitoring

    • Create InferenceMonitor class for logging and alerting
    • Log inference data storage operations with completeness metrics
    • Log training outcomes and model performance changes
    • Alert administrators on data flow issues with specific error details
    • Requirements: 11.1, 11.2, 11.3
  • 11.1. Implement configurable retention policies

    • Create RetentionPolicyManager class
    • Archive or remove oldest entries when limits are reached
    • Prioritize keeping most recent and valuable training examples
    • Implement storage space monitoring and alerts
    • Requirements: 11.4, 11.7
  • 11.2. Implement efficient historical data management

    • Compress inference data to minimize storage footprint
    • Maintain accessibility for training and analysis
    • Implement efficient query mechanisms for historical analysis
    • Add data archival and restoration capabilities
    • Requirements: 11.5, 11.6

Trading Executor Implementation

  • 12. Design and implement the trading executor

    • Create a TradingExecutor class that accepts trading actions from the orchestrator
    • Implement order execution through brokerage APIs
    • Add order lifecycle management
    • Requirements: 7.1, 7.2, 8.6
  • 12.1. Implement brokerage API integrations

    • Create a BrokerageAPI interface
    • Implement concrete classes for MEXC and Binance
    • Add error handling and retry mechanisms
    • Requirements: 7.1, 7.2, 8.6
  • 12.2. Implement order management

    • Create an OrderManager class
    • Implement methods for creating, updating, and canceling orders
    • Add order tracking and status updates
    • Requirements: 7.1, 7.2, 8.6
  • 12.3. Implement error handling

    • Add comprehensive error handling for API failures
    • Implement circuit breakers for extreme market conditions
    • Add logging and notification mechanisms
    • Requirements: 7.1, 7.2, 8.6

Risk Manager Implementation

  • 13. Design and implement the risk manager

    • Create a RiskManager class
    • Implement risk parameter management
    • Add risk metric calculation
    • Requirements: 7.1, 7.3, 7.4
  • 13.1. Implement stop-loss functionality

    • Create a StopLossManager class
    • Implement methods for creating and managing stop-loss orders
    • Add mechanisms to automatically close positions when stop-loss is triggered
    • Requirements: 7.1, 7.2
  • 13.2. Implement position sizing

    • Create a PositionSizer class
    • Implement methods for calculating position sizes based on risk parameters
    • Add validation to ensure position sizes are within limits
    • Requirements: 7.3, 7.7
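One common sizing scheme that fits the items above is fixed-fractional sizing with a notional cap; the sketch below is illustrative, and all parameter names and defaults are assumptions rather than the project's `PositionSizer` API:

```python
def position_size(equity: float, risk_fraction: float, entry: float,
                  stop: float, max_fraction: float = 0.5) -> float:
    """Sketch: risk at most `risk_fraction` of equity between entry and
    stop-loss, capped at `max_fraction` of equity in notional terms."""
    risk_per_unit = abs(entry - stop)
    if risk_per_unit == 0:
        return 0.0                               # no stop distance -> no trade
    units = (equity * risk_fraction) / risk_per_unit
    cap = (equity * max_fraction) / entry        # position-size limit
    return min(units, cap)

# Risk 1% of $10,000 with a $50 stop distance -> 2 units, under the cap.
size = position_size(10_000, 0.01, entry=2500.0, stop=2450.0)
assert size == 2.0
assert position_size(10_000, 0.01, 2500.0, 2500.0) == 0.0
```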
  • 13.3. Implement risk metrics

    • Add methods to calculate risk metrics (drawdown, VaR, etc.)
    • Implement real-time risk monitoring
    • Add alerts for high-risk situations
    • Requirements: 7.4, 7.5, 7.6, 7.8

Dashboard Implementation

  • 14. Design and implement the dashboard UI

    • Create a Dashboard class
    • Implement the web-based UI using Flask/Dash
    • Add real-time updates using WebSockets
    • Requirements: 6.1, 6.8
  • 14.1. Implement chart management

    • Create a ChartManager class
    • Implement methods for creating and updating charts
    • Add interactive features (zoom, pan, etc.)
    • Requirements: 6.1, 6.2
  • 14.2. Implement control panel

    • Create a ControlPanel class
    • Implement start/stop toggles for system processes
    • Add sliders for adjusting buy/sell thresholds
    • Requirements: 6.6, 6.7
  • 14.3. Implement system status display

    • Add methods to display training progress
    • Implement model performance metrics visualization
    • Add real-time system status updates
    • Requirements: 6.5, 5.6
  • 14.4. Implement server-side processing

    • Ensure all processes run on the server without requiring the dashboard to be open
    • Implement background tasks for model training and inference
    • Add mechanisms to persist system state
    • Requirements: 6.8, 5.5

Integration and Testing

  • 15. Integrate all components

    • Connect the data provider to the CNN and RL models
    • Connect the CNN and RL models to the orchestrator
    • Connect the orchestrator to the trading executor
    • Requirements: 8.1, 8.2, 8.3
  • 15.1. Implement comprehensive unit tests

    • Create unit tests for each component
    • Implement test fixtures and mocks
    • Add test coverage reporting
    • Requirements: 8.1, 8.2, 8.3
  • 15.2. Implement integration tests

    • Create tests for component interactions
    • Implement end-to-end tests
    • Add performance benchmarks
    • Requirements: 8.1, 8.2, 8.3
  • 15.3. Implement backtesting framework

    • Create a backtesting environment
    • Implement methods to replay historical data
    • Add performance metrics calculation
    • Requirements: 5.8, 8.1
  • 15.4. Optimize performance

    • Profile the system to identify bottlenecks
    • Implement optimizations for critical paths
    • Add caching and parallelization where appropriate
    • Requirements: 8.1, 8.2, 8.3