148 lines
5.1 KiB
Markdown
148 lines
5.1 KiB
Markdown
# Event-Driven Inference Training System - Implementation Summary
|
|
|
|
## Architecture Decisions
|
|
|
|
### Where Components Fit
|
|
|
|
1. **InferenceTrainingCoordinator** → **TradingOrchestrator**
|
|
- **Rationale**: Orchestrator already manages models, training, and predictions
|
|
- **Benefits**:
|
|
- Reduces duplication (orchestrator has model access)
|
|
- Centralizes coordination logic
|
|
- Reuses existing prediction storage
|
|
- **Location**: `core/orchestrator.py` - initialized in `__init__`
|
|
|
|
2. **DataProvider Subscription Methods** → **DataProvider**
|
|
- **Rationale**: Data layer responsibility - emits events when data changes
|
|
- **Methods Added**:
|
|
- `subscribe_candle_completion()` - Subscribe to candle completion events
|
|
- `subscribe_pivot_events()` - Subscribe to pivot events
|
|
- `_emit_candle_completion()` - Emit event when candle closes
|
|
- `_emit_pivot_event()` - Emit event when pivot detected
|
|
- **Location**: `core/data_provider.py`
|
|
|
|
3. **TrainingEventSubscriber Interface** → **RealTrainingAdapter**
|
|
- **Rationale**: Training layer implements subscriber interface
|
|
- **Methods Implemented**:
|
|
- `on_candle_completion()` - Train on candle completion
|
|
- `on_pivot_event()` - Train on pivot detection
|
|
- **Location**: `ANNOTATE/core/real_training_adapter.py`
|
|
|
|
## Code Duplication Reduction
|
|
|
|
### Before (Duplicated Logic)
|
|
|
|
1. **Data Retrieval**:
|
|
- `_get_realtime_market_data()` in RealTrainingAdapter
|
|
- Similar logic in orchestrator
|
|
- Similar logic in data_provider
|
|
|
|
2. **Prediction Storage**:
|
|
- `store_transformer_prediction()` in orchestrator
|
|
- `inference_input_cache` in RealTrainingAdapter session
|
|
- `prediction_cache` in app.py
|
|
|
|
3. **Training Coordination**:
|
|
- Training logic in RealTrainingAdapter
|
|
- Training logic in orchestrator
|
|
- Training logic in enhanced_realtime_training
|
|
|
|
### After (Centralized)
|
|
|
|
1. **Data Retrieval**:
|
|
- Single source: `data_provider.get_historical_data()` queries DuckDB
|
|
- Coordinator retrieves data on-demand using references
|
|
- No copying - just timestamp ranges
|
|
|
|
2. **Prediction Storage**:
|
|
- Orchestrator's `inference_training_coordinator` manages references
|
|
- References stored in coordinator (not copied)
|
|
- Data retrieved from DuckDB when needed
|
|
|
|
3. **Training Coordination**:
|
|
- Orchestrator's coordinator handles event distribution
|
|
- RealTrainingAdapter implements subscriber interface
|
|
- Single training lock in RealTrainingAdapter
|
|
|
|
## Implementation Status
|
|
|
|
### ✅ Completed
|
|
|
|
1. **InferenceTrainingCoordinator** (`inference_training_system.py`)
|
|
- Reference-based storage
|
|
- Event subscription system
|
|
- Data retrieval from DuckDB
|
|
|
|
2. **DataProvider Extensions** (`data_provider.py`)
|
|
- `subscribe_candle_completion()` method
|
|
- `subscribe_pivot_events()` method
|
|
- `_emit_candle_completion()` method
|
|
- `_emit_pivot_event()` method
|
|
- Event emission in `_update_candle()`
|
|
|
|
3. **Orchestrator Integration** (`orchestrator.py`)
|
|
- Coordinator initialized in `__init__`
|
|
- Accessible via `orchestrator.inference_training_coordinator`
|
|
|
|
4. **RealTrainingAdapter Integration** (`real_training_adapter.py`)
|
|
- Uses orchestrator's coordinator
|
|
- Implements `TrainingEventSubscriber` interface
|
|
- `on_candle_completion()` method
|
|
- `on_pivot_event()` method
|
|
- `_register_inference_frame()` method
|
|
- Helper methods for batch creation
|
|
|
|
### ⚠️ Needs Completion
|
|
|
|
1. **Pivot Event Emission**
|
|
- DataProvider needs to detect pivots and emit events
|
|
- Currently pivots are calculated but not emitted as events
|
|
- Need to integrate with WilliamsMarketStructure pivot detection
|
|
|
|
2. **Norm Params Storage**
|
|
- Currently norm_params are calculated on retrieval
|
|
- Could be stored in reference during registration for efficiency
|
|
- Need to pass norm_params from `_get_realtime_market_data()` to `_register_inference_frame()`
|
|
|
|
3. **Device Handling**
|
|
- Ensure tensors are on correct device when retrieved from DuckDB
|
|
- May need to store device info in reference
|
|
|
|
4. **Testing**
|
|
- Test candle completion events
|
|
- Test pivot events
|
|
- Test data retrieval from DuckDB
|
|
- Test training on inference frames
|
|
|
|
## Key Benefits
|
|
|
|
1. **Memory Efficient**: No copying 600 candles every second
|
|
2. **Event-Driven**: Clean separation of concerns
|
|
3. **Flexible**: Supports time-based (candles) and event-based (pivots)
|
|
4. **Centralized**: Coordinator in orchestrator reduces duplication
|
|
5. **Extensible**: Easy to add new training methods or event types
|
|
|
|
## Next Steps
|
|
|
|
1. **Complete Pivot Event Emission**
|
|
- Add pivot detection in DataProvider
|
|
- Emit events when L2L, L2H, etc. detected
|
|
|
|
2. **Store Norm Params During Registration**
|
|
- Pass norm_params from prediction to registration
|
|
- Store in reference for faster retrieval
|
|
|
|
3. **Add Device Info to References**
|
|
- Store device in InferenceFrameReference
|
|
- Use when creating tensors
|
|
|
|
4. **Remove Old Caching Code**
|
|
- Remove `inference_input_cache` from session
|
|
- Remove `_make_realtime_prediction_with_cache()` (deprecated)
|
|
- Clean up duplicate code
|
|
|
|
5. **Extend DuckDB Schema**
|
|
- Add MA indicators to ohlcv_data
|
|
- Create pivot_points table
|
|
- Store technical indicators
|