From 2b3c6abdeb15d25767f733bc42ad9b997657c1ac Mon Sep 17 00:00:00 2001
From: Dobromir Popov <dobromir.popov@gateway.one>
Date: Wed, 23 Jul 2025 15:00:08 +0300
Subject: [PATCH] refine design

---
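
Reviewer note (not part of the commit): the design.md hunk below asks for 1s, 5s, 15s and 60s moving averages of the COB imbalance over ±5 buckets, but does not define "imbalance". The sketch below assumes one common definition, (bid - ask) / (bid + ask) summed over the nearest buckets, and assumes one COB snapshot per second. The names `cob_imbalance` and `ImbalanceAverager` are hypothetical and do not exist in the repository.

```python
from collections import deque
from typing import Deque, Dict, Sequence

MA_WINDOWS_S = (1, 5, 15, 60)  # from the design: 1s, 5s, 15s, 60s
IMBALANCE_BUCKETS = 5          # from the design: +-5 buckets around the mid price


def cob_imbalance(bids_usd: Sequence[float], asks_usd: Sequence[float],
                  depth: int = IMBALANCE_BUCKETS) -> float:
    """ASSUMED definition: (bid - ask) / (bid + ask) over the nearest `depth` buckets.

    Both sequences are expected to be ordered from the mid price outwards.
    """
    bid = sum(bids_usd[:depth])
    ask = sum(asks_usd[:depth])
    total = bid + ask
    return 0.0 if total == 0 else (bid - ask) / total


class ImbalanceAverager:
    """Keeps the last 60 one-second imbalance values and averages them per window."""

    def __init__(self, windows: Sequence[int] = MA_WINDOWS_S) -> None:
        self.windows = tuple(windows)
        self.history: Deque[float] = deque(maxlen=max(self.windows))

    def update(self, bids_usd: Sequence[float],
               asks_usd: Sequence[float]) -> Dict[int, float]:
        """Add one 1s snapshot and return {window_seconds: moving average}."""
        self.history.append(cob_imbalance(bids_usd, asks_usd))
        recent = list(self.history)
        # Until a window is full, average over whatever history exists so far.
        return {w: sum(recent[-w:]) / len(recent[-w:]) for w in self.windows}


averager = ImbalanceAverager()
first = averager.update([100.0] * 20, [100.0] * 20)   # balanced book
second = averager.update([300.0] * 20, [100.0] * 20)  # bid-heavy book
```

Averaging over a partially filled window is a design choice for start-up behaviour; returning no value until 60 snapshots exist would be an equally reasonable reading of the spec.
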
 .../multi-modal-trading-system/design.md      |  22 +-
 .../specs/multi-modal-trading-system/tasks.md | 254 ++++++++++--------
 2 files changed, 154 insertions(+), 122 deletions(-)
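
Reviewer note (not part of the commit): the checkpoint rules added to design.md (keep only the best few checkpoints per model, delete the least performant, load the best at startup) can be sketched as below. This is an in-memory illustration only, assuming a single "higher is better" score per checkpoint. `Checkpoint` and `TopKCheckpoints` are hypothetical names and are not the project's checkpoint manager API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Checkpoint:
    path: str
    score: float  # ASSUMPTION: one scalar metric, higher is better
    metadata: Dict[str, float] = field(default_factory=dict)


class TopKCheckpoints:
    """Tracks at most `max_kept` checkpoints per model, ranked by score."""

    def __init__(self, max_kept: int = 5) -> None:
        self.max_kept = max_kept
        self._by_model: Dict[str, List[Checkpoint]] = {}

    def add(self, model: str, ckpt: Checkpoint) -> List[Checkpoint]:
        """Register a checkpoint and return the evicted ones.

        The caller is responsible for deleting the evicted files on disk.
        """
        kept = self._by_model.setdefault(model, [])
        kept.append(ckpt)
        kept.sort(key=lambda c: c.score, reverse=True)
        evicted = kept[self.max_kept:]
        del kept[self.max_kept:]
        return evicted

    def best(self, model: str) -> Optional[Checkpoint]:
        """Checkpoint to load at startup, or None if nothing is stored."""
        kept = self._by_model.get(model)
        return kept[0] if kept else None


manager = TopKCheckpoints(max_kept=2)
manager.add("cnn", Checkpoint("cnn_a.pt", 0.1))
manager.add("cnn", Checkpoint("cnn_b.pt", 0.5))
evicted = manager.add("cnn", Checkpoint("cnn_c.pt", 0.3))
```

Returning the evicted entries instead of deleting files inside `add` keeps the ranking logic separate from filesystem side effects, which makes it easier to unit test.
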

diff --git a/.kiro/specs/multi-modal-trading-system/design.md b/.kiro/specs/multi-modal-trading-system/design.md
index 3498d6c..39b5a91 100644
--- a/.kiro/specs/multi-modal-trading-system/design.md
+++ b/.kiro/specs/multi-modal-trading-system/design.md
@@ -12,9 +12,9 @@ The system follows a modular architecture with clear separation of concerns:
 
 ```mermaid
 graph TD
-    A[Data Provider] --> B[Data Processor]
+    A[Data Provider] --> B["Data Processor (calculates pivot points)"]
     B --> C[CNN Model]
-    B --> D[RL Model]
+    B --> D["RL (DQN) Model"]
     C --> E[Orchestrator]
     D --> E
     E --> F[Trading Executor]
@@ -67,6 +67,13 @@ Based on the existing implementation in `core/data_provider.py`, we'll enhance i
 - Enhance real-time data streaming
 - Implement better error handling and fallback mechanisms
 
+### BASE FOR ALL MODELS ###
+- ***INPUTS***: COB+OHLCV data frame, as follows:
+   - OHLCV: 300 frames of (1s, 1m, 1h, 1d) ETH + 300s of 1s BTC
+   - COB: for each 1s OHLCV frame, ±20 buckets of COB amounts in USD
+   - 1s, 5s, 15s, and 60s MA of the COB imbalance, counting ±5 COB buckets
+- ***OUTPUTS***: suggested trade action (BUY/SELL)
+
 ### 2. CNN Model
 
 The CNN Model is responsible for analyzing patterns in market data and predicting pivot points across multiple timeframes.
@@ -76,6 +83,8 @@ The CNN Model is responsible for analyzing patterns in market data and predictin
 - **CNNModel**: Main class for the CNN model.
 - **PivotPointPredictor**: Interface for predicting pivot points.
 - **CNNTrainer**: Class for training the CNN model.
+- ***INPUTS***: COB+OHLCV+Old Pivots (5 levels of pivots)
+- ***OUTPUTS***: next pivot point for each level as a price-time vector (can be plotted as a trend line) + suggested trade action (BUY/SELL)
 
 #### Implementation Details
 
@@ -111,13 +120,13 @@ The RL Model is responsible for learning optimal trading strategies based on mar
 #### Implementation Details
 
 The RL Model will:
-- Accept market data, CNN predictions, and CNN hidden layer states as input
+- Accept market data, CNN model predictions (output), and CNN hidden layer states as input
 - Output trading action recommendations (buy/sell)
 - Provide confidence scores for each action
 - Learn from past experiences to adapt to the current market environment
 
 Architecture:
-- State representation: Market data, CNN predictions, CNN hidden layer states
+- State representation: Market data, CNN model predictions (output), CNN hidden layer states
 - Action space: Buy, Sell
 - Reward function: PnL, risk-adjusted returns
 - Policy network: Deep neural network
@@ -231,6 +240,11 @@ The Orchestrator coordinates training for all models by managing the prediction-
 - Monitors model performance degradation and triggers retraining
 - Maintains performance metrics for model comparison and selection
 
+**Training progress and checkpoint persistence**
+- Uses the checkpoint manager to store checkpoints of each model over time as training progresses and performance improves
+- The checkpoint manager keeps only the top 5 to 10 best checkpoints for each model and deletes the least performant ones; it stores metadata alongside each checkpoint to rank performance
+- Automatically loads the best checkpoint at startup if any are stored
+
 ##### 5. Decision Making and Trading Actions
 
 Beyond coordination, the Orchestrator makes final trading decisions:
diff --git a/.kiro/specs/multi-modal-trading-system/tasks.md b/.kiro/specs/multi-modal-trading-system/tasks.md
index 2c0c265..16b5700 100644
--- a/.kiro/specs/multi-modal-trading-system/tasks.md
+++ b/.kiro/specs/multi-modal-trading-system/tasks.md
@@ -1,148 +1,166 @@
 # Implementation Plan
 
-## Data Provider and Processing
-
-- [ ] 1. Enhance the existing DataProvider class
-
+## Enhanced Data Provider and COB Integration
 
+- [ ] 1. Enhance the existing DataProvider class with standardized model inputs
   - Extend the current implementation in core/data_provider.py
-  - Ensure it supports all required timeframes (1s, 1m, 1h, 1d)
-  - Implement better error handling and fallback mechanisms
+  - Implement standardized COB+OHLCV data frame for all models
+  - Create unified input format: 300 frames OHLCV (1s, 1m, 1h, 1d) ETH + 300s of 1s BTC
+  - Integrate with existing multi_exchange_cob_provider.py for COB data
   - _Requirements: 1.1, 1.2, 1.3, 1.6_
 
-- [ ] 1.1. Implement Williams Market Structure pivot point calculation
-  - Create a dedicated method for identifying pivot points
-  - Implement the recursive pivot point calculation as described
-  - Add unit tests to verify pivot point detection accuracy
+- [ ] 1.1. Implement standardized COB+OHLCV data frame for all models
+  - Create BaseDataInput class with standardized format for all models
+  - Implement OHLCV: 300 frames of (1s, 1m, 1h, 1d) ETH + 300s of 1s BTC
+  - Add COB: ±20 buckets of COB amounts in USD for each 1s OHLCV
+  - Include 1s, 5s, 15s, and 60s MA of COB imbalance counting ±5 COB buckets
+  - Ensure all models receive identical input format for consistency
+  - _Requirements: 1.2, 1.3, 8.1_
+
+- [ ] 1.2. Implement extensible model output storage
+  - Create standardized ModelOutput data structure
+  - Support CNN, RL, LSTM, Transformer, and future model types
+  - Include model-specific predictions and cross-model hidden states
+  - Add metadata support for extensible model information
+  - _Requirements: 1.10, 8.2_
+
+- [ ] 1.3. Enhance Williams Market Structure pivot point calculation
+  - Extend existing williams_market_structure.py implementation
+  - Improve recursive pivot point calculation accuracy
+  - Add unit tests to verify pivot point detection
+  - Integrate with COB data for enhanced pivot detection
   - _Requirements: 1.5, 2.7_
 
-- [ ] 1.2. Optimize data caching for better performance
-  - Implement efficient caching strategies for different timeframes
-  - Add cache invalidation mechanisms
-  - Ensure thread safety for cache access
-  - _Requirements: 1.6, 8.1_
-
-- [-] 1.3. Enhance real-time data streaming
-
-  - Improve WebSocket connection management
-  - Implement reconnection strategies
-  - Add data validation to ensure data integrity
+- [-] 1.4. Optimize real-time data streaming with COB integration
+  - Enhance existing WebSocket connections in enhanced_cob_websocket.py
+  - Implement 10Hz COB data streaming alongside OHLCV data
+  - Add data synchronization across different refresh rates
+  - Ensure thread-safe access to multi-rate data streams
   - _Requirements: 1.6, 8.5_
 
-- [ ] 1.4. Implement data normalization
-  - Normalize data based on the highest timeframe
-  - Ensure relationships between different timeframes are maintained
-  - Add unit tests to verify normalization correctness
-  - _Requirements: 1.8, 2.1_
+## Enhanced CNN Model Implementation
 
-## CNN Model Implementation
+- [ ] 2. Enhance the existing CNN model with standardized inputs/outputs
+  - Extend the current implementation in NN/models/enhanced_cnn.py
+  - Accept standardized COB+OHLCV data frame: 300 frames (1s,1m,1h,1d) ETH + 300s 1s BTC
+  - Include COB ±20 buckets and MA (1s,5s,15s,60s) of COB imbalance ±5 buckets
+  - Output BUY/SELL trading action with confidence scores
+  - _Requirements: 2.1, 2.2, 2.8, 1.10_
 
-- [ ] 2. Design and implement the CNN model architecture
-  - Create a CNNModel class that accepts multi-timeframe and multi-symbol data
-  - Implement the model using PyTorch or TensorFlow
-  - Design the architecture with convolutional, LSTM/GRU, and attention layers
-  - _Requirements: 2.1, 2.2, 2.8_
+- [ ] 2.1. Implement CNN inference with standardized input format
+  - Accept BaseDataInput with standardized COB+OHLCV format
+  - Process 300 frames of multi-timeframe data with COB buckets
+  - Output BUY/SELL recommendations with confidence scores
+  - Make hidden layer states available for cross-model feeding
+  - Optimize inference performance for real-time processing
+  - _Requirements: 2.2, 2.6, 2.8, 4.3_
 
-- [ ] 2.1. Implement pivot point prediction
-  - Create a PivotPointPredictor class
-  - Implement methods to predict pivot points for each timeframe
-  - Add confidence score calculation for predictions
-  - _Requirements: 2.2, 2.3, 2.6_
-
-- [x] 2.2. Implement CNN training pipeline with comprehensive data storage
-
-
-
-  - Create a CNNTrainer class with training data persistence
-  - Implement methods for training the model on historical data
-  - Add mechanisms to trigger training when new pivot points are detected
-  - Store all training inputs, outputs, gradients, and loss values for replay
-  - Implement training episode storage with profitability metrics
-  - Add capability to replay and retrain on most profitable pivot predictions
+- [x] 2.2. Enhance CNN training pipeline with checkpoint management
+  - Integrate with checkpoint manager for training progress persistence
+  - Store top 5-10 best checkpoints based on performance metrics
+  - Automatically load best checkpoint at startup
+  - Implement training triggers based on orchestrator feedback
+  - Store metadata with checkpoints for performance tracking
   - _Requirements: 2.4, 2.5, 5.2, 5.3, 5.7_
 
-- [ ] 2.3. Implement CNN inference pipeline
-  - Create methods for real-time inference
-  - Ensure hidden layer states are accessible for the RL model
-  - Optimize for performance to minimize latency
-  - _Requirements: 2.2, 2.6, 2.8_
+- [ ] 2.3. Implement CNN model evaluation and checkpoint optimization
+  - Create evaluation methods using standardized input/output format
+  - Implement performance metrics for checkpoint ranking
+  - Add validation against historical trading outcomes
+  - Support automatic checkpoint cleanup (keep only top performers)
+  - Track model improvement over time through checkpoint metadata
+  - _Requirements: 2.5, 5.8, 4.4_
 
-- [ ] 2.4. Implement model evaluation and validation
-  - Create methods to evaluate model performance
-  - Implement metrics for prediction accuracy
-  - Add validation against historical pivot points
-  - _Requirements: 2.5, 5.8_
+## Enhanced RL Model Implementation
 
-## RL Model Implementation
+- [ ] 3. Enhance the existing RL model with standardized inputs/outputs
+  - Extend the current implementation in NN/models/dqn_agent.py
+  - Accept standardized COB+OHLCV data frame: 300 frames (1s,1m,1h,1d) ETH + 300s 1s BTC
+  - Include COB ±20 buckets and MA (1s,5s,15s,60s) of COB imbalance ±5 buckets
+  - Output BUY/SELL trading action with confidence scores
+  - _Requirements: 3.1, 3.2, 3.7, 1.10_
 
-- [ ] 3. Design and implement the RL model architecture
-  - Create an RLModel class that accepts market data and CNN outputs
-  - Implement the model using PyTorch or TensorFlow
-  - Design the architecture with state representation, action space, and reward function
-  - _Requirements: 3.1, 3.2, 3.7_
+- [ ] 3.1. Implement RL inference with standardized input format
+  - Accept BaseDataInput with standardized COB+OHLCV format
+  - Process CNN hidden states and predictions as part of state input
+  - Output BUY/SELL recommendations with confidence scores
+  - Include expected rewards and value estimates in output
+  - Optimize inference performance for real-time processing
+  - _Requirements: 3.2, 3.7, 4.3_
 
-- [ ] 3.1. Implement trading action generation
-  - Create a TradingActionGenerator class
-  - Implement methods to generate buy/sell recommendations
-  - Add confidence score calculation for actions
+- [ ] 3.2. Enhance RL training pipeline with checkpoint management
+  - Integrate with checkpoint manager for training progress persistence
+  - Store top 5-10 best checkpoints based on trading performance metrics
+  - Automatically load best checkpoint at startup
+  - Implement experience replay with profitability-based prioritization
+  - Store metadata with checkpoints for performance tracking
+  - _Requirements: 3.3, 3.5, 5.4, 5.7, 4.4_
 
-
-
-  - _Requirements: 3.2, 3.7_
-
-- [ ] 3.2. Implement RL training pipeline with comprehensive experience storage
-  - Create an RLTrainer class with advanced experience replay
-  - Implement methods for training the model on historical data
-  - Store all training episodes with state-action-reward-next_state tuples
-  - Implement profitability-based experience prioritization
-  - Add capability to replay and retrain on most profitable trading sequences
-  - Store gradient information and model checkpoints for each profitable episode
-  - Implement experience buffer with profit-weighted sampling
-  - _Requirements: 3.3, 3.5, 5.4, 5.7_
-
-- [ ] 3.3. Implement RL inference pipeline
-  - Create methods for real-time inference
-  - Optimize for performance to minimize latency
-  - Ensure proper handling of CNN inputs
-  - _Requirements: 3.1, 3.2, 3.4_
-
-- [ ] 3.4. Implement model evaluation and validation
-  - Create methods to evaluate model performance
-  - Implement metrics for trading performance
+- [ ] 3.3. Implement RL model evaluation and checkpoint optimization
+  - Create evaluation methods using standardized input/output format
+  - Implement trading performance metrics for checkpoint ranking
   - Add validation against historical trading opportunities
-  - _Requirements: 3.3, 5.8_
+  - Support automatic checkpoint cleanup (keep only top performers)
+  - Track model improvement over time through checkpoint metadata
+  - _Requirements: 3.3, 5.8, 4.4_
 
-## Orchestrator Implementation
+## Enhanced Orchestrator Implementation
 
-- [ ] 4. Design and implement the orchestrator architecture
-  - Create an Orchestrator class that accepts inputs from CNN and RL models
-  - Implement the Mixture of Experts (MoE) approach
-  - Design the architecture with gating network and decision network
-  - _Requirements: 4.1, 4.2, 4.5_
+- [ ] 4. Enhance the existing orchestrator with centralized coordination
+  - Extend the current implementation in core/orchestrator.py
+  - Implement DataSubscriptionManager for multi-rate data streams
+  - Add ModelInferenceCoordinator for cross-model coordination
+  - Create ModelOutputStore for extensible model output management
+  - Add TrainingPipelineManager for continuous learning coordination
+  - _Requirements: 4.1, 4.2, 4.5, 8.1_
 
-- [ ] 4.1. Implement decision-making logic
-  - Create a DecisionMaker class
-  - Implement methods to make final trading decisions
-  - Add confidence-based filtering
-  - _Requirements: 4.2, 4.3, 4.4_
+- [ ] 4.1. Implement data subscription and management system
+  - Create DataSubscriptionManager class
+  - Subscribe to 10Hz COB data, OHLCV, market ticks, and technical indicators
+  - Implement intelligent caching for "last updated" data serving
+  - Maintain synchronized base dataframe across different refresh rates
+  - Add thread-safe access to multi-rate data streams
+  - _Requirements: 4.1, 1.6, 8.5_
 
-- [ ] 4.2. Implement MoE gateway
-  - Create a MoEGateway class
-  - Implement methods to determine which expert to trust
-  - Add mechanisms for future model integration
-  - _Requirements: 4.5, 8.2_
+- [ ] 4.2. Implement model inference coordination
+  - Create ModelInferenceCoordinator class
+  - Trigger model inference based on data availability and requirements
+  - Coordinate parallel inference execution for independent models
+  - Handle model dependencies (e.g., RL waiting for CNN hidden states)
+  - Assemble appropriate input data for each model type
+  - _Requirements: 4.2, 3.1, 2.1_
 
-- [ ] 4.3. Implement configurable thresholds
-  - Add parameters for entering and exiting positions
-  - Implement methods to adjust thresholds dynamically
-  - Add validation to ensure thresholds are within reasonable ranges
-  - _Requirements: 4.8, 6.7_
+- [ ] 4.3. Implement model output storage and cross-feeding
+  - Create ModelOutputStore class using standardized ModelOutput format
+  - Store CNN predictions, confidence scores, and hidden layer states
+  - Store RL action recommendations and value estimates
+  - Support extensible storage for LSTM, Transformer, and future models
+  - Implement cross-model feeding of hidden states and predictions
+  - Include "last predictions" from all models in base data input
+  - _Requirements: 4.3, 1.10, 8.2_
 
-- [ ] 4.4. Implement model evaluation and validation
-  - Create methods to evaluate orchestrator performance
-  - Implement metrics for decision quality
-  - Add validation against historical trading decisions
-  - _Requirements: 4.6, 5.8_
+- [ ] 4.4. Implement training pipeline management
+  - Create TrainingPipelineManager class
+  - Call each model's training pipeline with prediction-result pairs
+  - Manage training data collection and labeling
+  - Coordinate online learning updates based on real-time performance
+  - Track prediction accuracy and trigger retraining when needed
+  - _Requirements: 4.4, 5.2, 5.4, 5.7_
+
+- [ ] 4.5. Implement enhanced decision-making with MoE
+  - Create enhanced DecisionMaker class
+  - Implement Mixture of Experts approach for model integration
+  - Apply confidence-based filtering to avoid uncertain trades
+  - Support configurable thresholds for buy/sell decisions
+  - Consider market conditions and risk parameters in decisions
+  - _Requirements: 4.5, 4.8, 6.7_
+
+- [ ] 4.6. Implement extensible model integration architecture
+  - Create MoEGateway class supporting dynamic model addition
+  - Support CNN, RL, LSTM, Transformer model types without architecture changes
+  - Implement model versioning and rollback capabilities
+  - Handle model failures and fallback mechanisms
+  - Provide model performance monitoring and alerting
+  - _Requirements: 4.6, 8.2, 8.3_
 
 ## Trading Executor Implementation