From f759eac04b4f1efa14b724d5233615df72893764 Mon Sep 17 00:00:00 2001 From: Dobromir Popov Date: Wed, 23 Jul 2025 13:39:50 +0300 Subject: [PATCH] updated design --- .../multi-modal-trading-system/design.md | 144 +++++++++++++++--- 1 file changed, 127 insertions(+), 17 deletions(-) diff --git a/.kiro/specs/multi-modal-trading-system/design.md b/.kiro/specs/multi-modal-trading-system/design.md index 4089d4d..66fc386 100644 --- a/.kiro/specs/multi-modal-trading-system/design.md +++ b/.kiro/specs/multi-modal-trading-system/design.md @@ -58,9 +58,11 @@ The DataProvider class will: - Identify pivot points - Normalize data - Distribute data to subscribers +- Calculate any other algoritmic manipulations/calculations on the data +- Cache up to 3x the model inputs (300 ticks OHLCV, etc) data so we can do a proper backtesting in up to 2x time in the future Based on the existing implementation in `core/data_provider.py`, we'll enhance it to: -- Improve pivot point calculation using Williams Market Structure +- Improve pivot point calculation using reccursive Williams Market Structure - Optimize data caching for better performance - Enhance real-time data streaming - Implement better error handling and fallback mechanisms @@ -129,33 +131,141 @@ Training: ### 4. Orchestrator -The Orchestrator is responsible for making final trading decisions based on inputs from both CNN and RL models. +The Orchestrator serves as the central coordination hub of the multi-modal trading system, responsible for data subscription management, model inference coordination, output storage, and training pipeline orchestration. #### Key Classes and Interfaces - **Orchestrator**: Main class for the orchestrator. +- **DataSubscriptionManager**: Manages subscriptions to multiple data streams with different refresh rates. +- **ModelInferenceCoordinator**: Coordinates inference across all models. +- **ModelOutputStore**: Stores and manages model outputs for cross-model feeding. +- **TrainingPipelineManager**: Manages training pipelines for all models. - **DecisionMaker**: Interface for making trading decisions. - **MoEGateway**: Mixture of Experts gateway for model integration. +#### Core Responsibilities + +##### 1. Data Subscription and Management + +The Orchestrator subscribes to the Data Provider and manages multiple data streams with varying refresh rates: + +- **10Hz COB (Cumulative Order Book) Data**: High-frequency order book updates for real-time market depth analysis +- **OHLCV Data**: Traditional candlestick data at multiple timeframes (1s, 1m, 1h, 1d) +- **Market Tick Data**: Individual trade executions and price movements +- **Technical Indicators**: Calculated indicators that update at different frequencies +- **Pivot Points**: Market structure analysis data + +**Data Stream Management**: +- Maintains separate buffers for each data type with appropriate retention policies +- Ensures thread-safe access to data streams from multiple models +- Implements intelligent caching to serve "last updated" data efficiently +- Maintains full base dataframe that stays current for any model requesting data +- Handles data synchronization across different refresh rates + +##### 2. Model Inference Coordination + +The Orchestrator coordinates inference across all models in the system: + +**Inference Pipeline**: +- Triggers model inference when relevant data updates occur +- Manages inference scheduling based on data availability and model requirements +- Coordinates parallel inference execution for independent models +- Handles model dependencies (e.g., RL model waiting for CNN hidden states) + +**Model Input Management**: +- Assembles appropriate input data for each model based on their requirements +- Ensures models receive the most current data available at inference time +- Manages feature engineering and data preprocessing for each model +- Handles different input formats and requirements across models + +##### 3. Model Output Storage and Cross-Feeding + +The Orchestrator maintains a centralized store for all model outputs and manages cross-model data feeding: + +**Output Storage**: +- Stores CNN predictions, confidence scores, and hidden layer states +- Stores RL action recommendations and value estimates +- Maintains historical output sequences for temporal analysis +- Implements efficient retrieval mechanisms for real-time access + +**Cross-Model Feeding**: +- Feeds CNN hidden layer states into RL model inputs +- Provides CNN predictions as context for RL decision-making +- Stores model outputs that become inputs for subsequent inference cycles +- Manages circular dependencies and feedback loops between models + +##### 4. Training Pipeline Management + +The Orchestrator coordinates training for all models by managing the prediction-result feedback loop: + +**Training Coordination**: +- Calls each model's training pipeline when new inference results are available +- Provides previous predictions alongside new results for supervised learning +- Manages training data collection and labeling +- Coordinates online learning updates based on real-time performance + +**Training Data Management**: +- Maintains training datasets with prediction-result pairs +- Implements data quality checks and filtering +- Manages training data retention and archival policies +- Provides training data statistics and monitoring + +**Performance Tracking**: +- Tracks prediction accuracy for each model over time +- Monitors model performance degradation and triggers retraining +- Maintains performance metrics for model comparison and selection + +##### 5. Decision Making and Trading Actions + +Beyond coordination, the Orchestrator makes final trading decisions: + +**Decision Integration**: +- Combines outputs from CNN and RL models using Mixture of Experts approach +- Applies confidence-based filtering to avoid uncertain trades +- Implements configurable thresholds for buy/sell decisions +- Considers market conditions and risk parameters + #### Implementation Details -The Orchestrator will: -- Accept inputs from both CNN and RL models -- Output final trading actions (buy/sell) -- Consider confidence levels of both models -- Learn to avoid entering positions when uncertain -- Allow for configurable thresholds for entering and exiting positions +**Architecture**: +```python +class Orchestrator: + def __init__(self): + self.data_subscription_manager = DataSubscriptionManager() + self.model_inference_coordinator = ModelInferenceCoordinator() + self.model_output_store = ModelOutputStore() + self.training_pipeline_manager = TrainingPipelineManager() + self.decision_maker = DecisionMaker() + self.moe_gateway = MoEGateway() + + async def run(self): + # Subscribe to data streams + await self.data_subscription_manager.subscribe_to_data_provider() + + # Start inference coordination loop + await self.model_inference_coordinator.start() + + # Start training pipeline management + await self.training_pipeline_manager.start() +``` -Architecture: -- Mixture of Experts (MoE) approach -- Gating network: Determine which expert to trust -- Expert models: CNN, RL, and potentially others -- Decision network: Combine expert outputs +**Data Flow Management**: +- Implements event-driven architecture for data updates +- Uses async/await patterns for non-blocking operations +- Maintains data freshness timestamps for each stream +- Implements backpressure handling for high-frequency data -Training: -- Train on historical data -- Update model based on trading outcomes -- Use reinforcement learning to optimize decision-making +**Model Coordination**: +- Manages model lifecycle (loading, inference, training, updating) +- Implements model versioning and rollback capabilities +- Handles model failures and fallback mechanisms +- Provides model performance monitoring and alerting + +**Training Integration**: +- Implements incremental learning strategies +- Manages training batch composition and scheduling +- Provides training progress monitoring and control +- Handles training failures and recovery ### 5. Trading Executor