wip training
@ -140,7 +140,7 @@ Training:

### 4. Orchestrator

The Orchestrator serves as the central coordination hub of the multi-modal trading system, responsible for data subscription management, model inference coordination, output storage, training pipeline orchestration, and inference-training feedback loop management.

#### Key Classes and Interfaces

@ -245,6 +245,47 @@ The Orchestrator coordinates training for all models by managing the prediction-

- The checkpoint manager ensures only the top 5-10 checkpoints are stored for each model, deleting the least performant ones. It stores metadata alongside each checkpoint to rank performance.
- At startup, the best stored checkpoint for each model is loaded automatically, if any exist.

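The retention and auto-load behaviour above can be sketched as follows. The class and method names (`CheckpointManager`, `register`, `best`) are illustrative assumptions, not the system's actual API:

```python
from __future__ import annotations

import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class CheckpointMeta:
    score: float                      # stored performance metric used for ranking
    path: str = field(compare=False)  # checkpoint file location


class CheckpointManager:
    """Keeps only the top-N checkpoints per model, ranked by stored metadata."""

    def __init__(self, max_checkpoints: int = 5) -> None:
        self.max_checkpoints = max_checkpoints
        self._best: dict[str, list[CheckpointMeta]] = {}  # model -> min-heap

    def register(self, model: str, path: str, score: float) -> list[str]:
        """Record a new checkpoint; return the paths that should be deleted."""
        heap = self._best.setdefault(model, [])
        heapq.heappush(heap, CheckpointMeta(score, path))
        evicted = []
        while len(heap) > self.max_checkpoints:
            evicted.append(heapq.heappop(heap).path)  # least performant first
        return evicted

    def best(self, model: str) -> str | None:
        """Best stored checkpoint for a model, e.g. to auto-load at startup."""
        heap = self._best.get(model)
        return max(heap).path if heap else None
```

A min-heap keeps eviction cheap; the metadata stored alongside each checkpoint is reduced here to a single `score` for brevity.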
##### 5. Inference Data Validation and Storage

The Orchestrator implements comprehensive inference data validation and persistent storage:

**Input Data Validation**:

- Validates complete OHLCV dataframes for all required timeframes before inference
- Checks input data dimensions against model requirements
- Logs missing components and prevents prediction on incomplete data
- Raises validation errors with specific details about expected vs actual dimensions

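A minimal validator along these lines, assuming the four required timeframes and the 300-frame depth named elsewhere in the spec; the function and error names are illustrative:

```python
import pandas as pd

REQUIRED_TIMEFRAMES = ("1s", "1m", "1h", "1d")            # per the spec
REQUIRED_COLUMNS = ("open", "high", "low", "close", "volume")
REQUIRED_FRAMES = 300                                      # frames per timeframe


class ValidationError(ValueError):
    """Raised when inference input is incomplete or mis-sized."""


def validate_inference_input(data: dict) -> None:
    """Raise ValidationError naming every missing or mis-sized component."""
    problems = []
    for tf in REQUIRED_TIMEFRAMES:
        df = data.get(tf)
        if df is None:
            problems.append(f"missing timeframe {tf}")
            continue
        missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
        if missing:
            problems.append(f"{tf}: missing columns {missing}")
        if len(df) != REQUIRED_FRAMES:
            problems.append(f"{tf}: expected {REQUIRED_FRAMES} rows, got {len(df)}")
    if problems:
        # Collect every problem before raising, so the log shows the full picture
        raise ValidationError("; ".join(problems))
```

Collecting all problems before raising (rather than failing on the first) matches the requirement to log every missing component.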
**Inference History Storage**:

- Stores complete input data packages with each prediction in persistent storage
- Includes timestamp, symbol, input features, prediction outputs, confidence scores, and model internal states
- Maintains compressed storage to minimize footprint while preserving accessibility
- Implements efficient query mechanisms by symbol, timeframe, and date range

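One way to realise compressed, queryable inference history: a hypothetical `InferenceHistoryStore` that writes one gzip-compressed JSON record per prediction and filters on the stored timestamp when queried. The file layout is an assumption for illustration:

```python
import gzip
import json
import time
from pathlib import Path


class InferenceHistoryStore:
    """Gzip-compressed, append-only inference records, queryable by symbol/time."""

    def __init__(self, root) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def store(self, symbol: str, record: dict) -> Path:
        """Persist one inference record (inputs, outputs, confidence, metadata)."""
        record = {"timestamp": record.get("timestamp", time.time()), **record}
        path = self.root / f"{symbol}_{record['timestamp']:.6f}.json.gz"
        with gzip.open(path, "wt", encoding="utf-8") as f:
            json.dump(record, f)
        return path

    def query(self, symbol: str, start: float = 0.0, end: float = float("inf")) -> list:
        """Return records for a symbol within [start, end], oldest first."""
        out = []
        for path in sorted(self.root.glob(f"{symbol}_*.json.gz")):
            with gzip.open(path, "rt", encoding="utf-8") as f:
                rec = json.load(f)
            if start <= rec["timestamp"] <= end:
                out.append(rec)
        out.sort(key=lambda r: r["timestamp"])
        return out
```

One-file-per-record keeps writes atomic and failures isolated; a real deployment might batch records or use a columnar store instead.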
**Storage Management**:

- Applies configurable retention policies to manage storage limits
- Archives or removes oldest entries when limits are reached
- Prioritizes keeping most recent and valuable training examples during storage pressure
- Provides data completeness metrics and validation results in logs

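The prioritisation rule under storage pressure might look like the sketch below. The split of the budget between "most recent" and "most valuable" records, and the `value` field itself, are assumptions:

```python
def apply_retention(records, max_entries, recent_fraction=0.5):
    """Split records into (kept, archived) under a configurable entry limit.

    Keeps the newest records for part of the budget, then fills the rest with
    the highest-value remaining examples (value score is an assumed field).
    """
    if len(records) <= max_entries:
        return list(records), []
    by_time = sorted(records, key=lambda r: r["timestamp"], reverse=True)
    n_recent = int(max_entries * recent_fraction)
    recent = by_time[:n_recent]                      # newest records always kept
    rest = sorted(by_time[n_recent:],
                  key=lambda r: r.get("value", 0.0), reverse=True)
    kept = recent + rest[:max_entries - n_recent]    # fill budget by value
    archived = rest[max_entries - n_recent:]         # candidates for archival
    return kept, archived
```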
##### 6. Inference-Training Feedback Loop

The Orchestrator manages the continuous learning cycle through inference-training feedback:

**Prediction Outcome Evaluation**:

- Evaluates prediction accuracy against actual price movements after sufficient time has passed
- Creates training examples using stored inference data paired with actual market outcomes
- Feeds prediction-result pairs back to respective models for learning

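A sketch of outcome evaluation, assuming a simple up/down/flat direction label and a reward proportional to the realised move; the `TrainingExample` shape and the flat threshold are illustrative, not the system's actual types:

```python
from dataclasses import dataclass


@dataclass
class TrainingExample:
    features: dict    # stored inference inputs
    predicted: str    # model's direction call at inference time
    actual: str       # realised direction after the horizon elapsed
    reward: float     # positive if correct, negative otherwise


def evaluate_prediction(inference: dict, entry_price: float, exit_price: float,
                        flat_threshold: float = 0.0005) -> TrainingExample:
    """Pair a stored inference record with the realised price move."""
    change = (exit_price - entry_price) / entry_price
    if change > flat_threshold:
        actual = "up"
    elif change < -flat_threshold:
        actual = "down"
    else:
        actual = "flat"
    correct = inference["prediction"] == actual
    # Reward magnitude scales with the size of the realised move (assumption)
    reward = abs(change) if correct else -abs(change)
    return TrainingExample(inference["features"], inference["prediction"],
                           actual, reward)
```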
**Adaptive Learning Signals**:

- Provides positive reinforcement signals for accurate predictions
- Delivers corrective training signals for inaccurate predictions to help models learn from mistakes
- Retrieves last inference data for each model to compare predictions against actual outcomes

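One possible shape for these signals: a per-example training target plus a sample weight, where misses train harder than hits. The weighting scheme is an assumption, not the system's actual loss design:

```python
def learning_signal(predicted: str, actual: str, magnitude: float,
                    base_weight: float = 1.0, miss_boost: float = 2.0) -> dict:
    """Build a reinforcing or corrective training signal for one outcome.

    Accurate predictions are reinforced with a normal-weight example on the
    predicted class; inaccurate ones get a heavier-weighted corrective example
    on the realised class, scaled by the size of the move (all assumptions).
    """
    correct = predicted == actual
    weight = base_weight * (1.0 + magnitude)  # bigger moves train harder
    if not correct:
        weight *= miss_boost                  # corrective signals weigh more
    return {"target": actual, "weight": weight, "reinforce": correct}
```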
**Continuous Improvement Tracking**:

- Tracks and reports accuracy improvements or degradations over time
- Monitors model learning progress through the feedback loop
- Alerts administrators when data flow issues are detected, with specific error details and remediation suggestions

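A rolling-window tracker is one way to implement the degradation reporting; the window size, baseline rule, and alert threshold below are illustrative assumptions:

```python
from __future__ import annotations

from collections import deque


class AccuracyTracker:
    """Rolling accuracy over recent evaluations, with a degradation alert."""

    def __init__(self, window: int = 100, alert_drop: float = 0.10) -> None:
        self.window = window
        self.alert_drop = alert_drop
        self._recent = deque(maxlen=window)
        self._baseline = None  # accuracy of the first full window (assumption)

    def record(self, correct: bool) -> str | None:
        """Record one outcome; return an alert message if accuracy degraded."""
        self._recent.append(bool(correct))
        if len(self._recent) < self.window:
            return None  # not enough data yet
        acc = sum(self._recent) / len(self._recent)
        if self._baseline is None:
            self._baseline = acc
            return None
        if acc < self._baseline - self.alert_drop:
            return f"accuracy degraded: {acc:.2f} vs baseline {self._baseline:.2f}"
        return None
```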
##### 7. Decision Making and Trading Actions

Beyond coordination, the Orchestrator makes final trading decisions:

@ -130,4 +130,46 @@ The Multi-Modal Trading System is an advanced algorithmic trading platform that

5. WHEN implementing the system architecture THEN the system SHALL use a unified interface for all data providers.
6. WHEN implementing the system architecture THEN the system SHALL use a unified interface for all trading executors.
7. WHEN implementing the system architecture THEN the system SHALL use a unified interface for all risk management components.
8. WHEN implementing the system architecture THEN the system SHALL use a unified interface for all dashboard components.

### Requirement 9: Model Inference Data Validation and Storage

**User Story:** As a trading system developer, I want to ensure that all model predictions include complete input data validation and persistent storage, so that I can verify models receive correct inputs and track their performance over time.

#### Acceptance Criteria

1. WHEN a model makes a prediction THEN the system SHALL validate that the input data contains complete OHLCV dataframes for all required timeframes
2. WHEN input data is incomplete THEN the system SHALL log the missing components and SHALL NOT proceed with prediction
3. WHEN input validation passes THEN the system SHALL store the complete input data package with the prediction in persistent storage
4. IF input data dimensions are incorrect THEN the system SHALL raise a validation error with specific details about expected vs actual dimensions
5. WHEN a model completes inference THEN the system SHALL store the complete input data, model outputs, confidence scores, and metadata in a persistent inference history
6. WHEN storing inference data THEN the system SHALL include timestamp, symbol, input features, prediction outputs, and model internal states
7. IF inference history storage fails THEN the system SHALL log the error and continue operation without breaking the prediction flow

### Requirement 10: Inference-Training Feedback Loop

**User Story:** As a machine learning engineer, I want the system to automatically train models using their previous inference data compared to actual market outcomes, so that models continuously improve their accuracy through real-world feedback.

#### Acceptance Criteria

1. WHEN sufficient time has passed after a prediction THEN the system SHALL evaluate the prediction accuracy against actual price movements
2. WHEN a prediction outcome is determined THEN the system SHALL create a training example using the stored inference data and actual outcome
3. WHEN training examples are created THEN the system SHALL feed them back to the respective models for learning
4. IF the prediction was accurate THEN the system SHALL reinforce the model's decision pathway through positive training signals
5. IF the prediction was inaccurate THEN the system SHALL provide corrective training signals to help the model learn from mistakes
6. WHEN the system needs training data THEN it SHALL retrieve the last inference data for each model to compare predictions against actual market outcomes
7. WHEN models are trained on inference feedback THEN the system SHALL track and report accuracy improvements or degradations over time

### Requirement 11: Inference History Management and Monitoring

**User Story:** As a system administrator, I want comprehensive logging and monitoring of the inference-training feedback loop with configurable retention policies, so that I can track model learning progress and manage storage efficiently.

#### Acceptance Criteria

1. WHEN inference data is stored THEN the system SHALL log the storage operation with data completeness metrics and validation results
2. WHEN training occurs based on previous inference THEN the system SHALL log the training outcome and model performance changes
3. WHEN the system detects data flow issues THEN it SHALL alert administrators with specific error details and suggested remediation
4. WHEN inference history reaches configured limits THEN the system SHALL archive or remove oldest entries based on retention policy
5. WHEN storing inference data THEN the system SHALL compress data to minimize storage footprint while maintaining accessibility
6. WHEN retrieving historical inference data THEN the system SHALL provide efficient query mechanisms by symbol, timeframe, and date range
7. IF storage space is critically low THEN the system SHALL prioritize keeping the most recent and most valuable training examples

@ -135,6 +135,9 @@
- Add thread-safe access to multi-rate data streams
- _Requirements: 4.1, 1.6, 8.5_

- [ ] 4.2. Implement model inference coordination
- Create ModelInferenceCoordinator class
- Trigger model inference based on data availability and requirements
@ -176,6 +179,84 @@
- Provide model performance monitoring and alerting
- _Requirements: 4.6, 8.2, 8.3_

## Model Inference Data Validation and Storage

- [ ] 5. Implement comprehensive inference data validation system
- Create InferenceDataValidator class for input validation
- Validate complete OHLCV dataframes for all required timeframes
- Check input data dimensions against model requirements
- Log missing components and prevent prediction on incomplete data
- _Requirements: 9.1, 9.2, 9.3, 9.4_

- [ ] 5.1. Implement input data validation for all models
- Create validation methods for CNN, RL, and future model inputs
- Validate OHLCV data completeness (300 frames for 1s, 1m, 1h, 1d)
- Validate COB data structure (±20 buckets, MA calculations)
- Raise specific validation errors with expected vs actual dimensions
- Ensure validation occurs before any model inference
- _Requirements: 9.1, 9.4_

- [ ] 5.2. Implement persistent inference history storage
- Create InferenceHistoryStore class for persistent storage
- Store complete input data packages with each prediction
- Include timestamp, symbol, input features, prediction outputs, confidence scores
- Store model internal states for cross-model feeding
- Implement compressed storage to minimize footprint
- _Requirements: 9.5, 9.6_

- [ ] 5.3. Implement inference history query and retrieval system
- Create efficient query mechanisms by symbol, timeframe, and date range
- Implement data retrieval for training pipeline consumption
- Add data completeness metrics and validation results in storage
- Handle storage failures gracefully without breaking prediction flow
- _Requirements: 9.7, 11.6_

## Inference-Training Feedback Loop Implementation
- [ ] 6. Implement prediction outcome evaluation system
- Create PredictionOutcomeEvaluator class
- Evaluate prediction accuracy against actual price movements
- Create training examples using stored inference data and actual outcomes
- Feed prediction-result pairs back to respective models
- _Requirements: 10.1, 10.2, 10.3_

- [ ] 6.1. Implement adaptive learning signal generation
- Create positive reinforcement signals for accurate predictions
- Generate corrective training signals for inaccurate predictions
- Retrieve last inference data for each model for outcome comparison
- Implement model-specific learning signal formats
- _Requirements: 10.4, 10.5, 10.6_

- [ ] 6.2. Implement continuous improvement tracking
- Track and report accuracy improvements/degradations over time
- Monitor model learning progress through feedback loop
- Create performance metrics for inference-training effectiveness
- Generate alerts for learning regression or stagnation
- _Requirements: 10.7_

## Inference History Management and Monitoring
- [ ] 7. Implement comprehensive inference logging and monitoring
- Create InferenceMonitor class for logging and alerting
- Log inference data storage operations with completeness metrics
- Log training outcomes and model performance changes
- Alert administrators on data flow issues with specific error details
- _Requirements: 11.1, 11.2, 11.3_

- [ ] 7.1. Implement configurable retention policies
- Create RetentionPolicyManager class
- Archive or remove oldest entries when limits are reached
- Prioritize keeping most recent and valuable training examples
- Implement storage space monitoring and alerts
- _Requirements: 11.4, 11.7_

- [ ] 7.2. Implement efficient historical data management
- Compress inference data to minimize storage footprint
- Maintain accessibility for training and analysis
- Implement efficient query mechanisms for historical analysis
- Add data archival and restoration capabilities
- _Requirements: 11.5, 11.6_

## Trading Executor Implementation
- [ ] 8. Design and implement the trading executor