# Event-Driven Inference Training System

## Overview
This system provides a flexible, efficient, and robust training pipeline that:
- Stores inference frames by reference (not copying 600 candles every second)
- Uses DuckDB for efficient data storage and retrieval
- Subscribes to events (candle completion, pivot points) for training triggers
- Supports multiple training methods (backprop for Transformer, others for different models)
## Architecture

### Components
1. **InferenceTrainingCoordinator** (`inference_training_system.py`)
   - Manages inference frame references
   - Matches inference frames to actual results
   - Distributes training events to subscribers

2. **TrainingEventSubscriber** (interface)
   - Implemented by training adapters
   - Receives callbacks for candle completion and pivot events

3. **DataProvider Extensions**
   - `subscribe_candle_completion()` - Subscribe to candle completion events
   - `subscribe_pivot_events()` - Subscribe to pivot events (L2L, L2H, etc.)

4. **DuckDB Storage**
   - Stores OHLCV data, MA indicators, pivots
   - Efficient queries by timestamp range
   - No data copying - just references
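The subscriber contract can be sketched as an abstract base class. This is illustrative: the event and reference parameter types (`event`, `inference_ref`) are left untyped here because their definitions live elsewhere in the codebase.

```python
from abc import ABC, abstractmethod


class TrainingEventSubscriber(ABC):
    """Interface implemented by training adapters (e.g. RealTrainingAdapter)."""

    @abstractmethod
    def on_candle_completion(self, event, inference_ref) -> None:
        """Called by the coordinator when a subscribed candle closes."""

    @abstractmethod
    def on_pivot_event(self, event, inference_refs) -> None:
        """Called by the coordinator when a subscribed pivot is detected."""
```

Using an ABC means a misconfigured adapter fails loudly at construction time rather than silently missing training callbacks.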
### Data Flow

#### 1. Inference Phase

```
Model Inference
    ↓
Create InferenceFrameReference
    ↓
Store reference (timestamp range, norm_params, prediction metadata)
    ↓
Register with InferenceTrainingCoordinator
```

No copying - just store:
- `data_range_start` / `data_range_end` (timestamp range for the 600 candles)
- `norm_params` (small dict)
- `predicted_action`, `predicted_candle`, `confidence`
- `target_timestamp` (for candles - when the result will be available)
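The stored fields map naturally onto a small dataclass. This is a sketch assembled from the fields named in this document; default values and exact types are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class InferenceFrameReference:
    """Lightweight reference to an inference frame -- no candle data is copied."""
    inference_id: str
    symbol: str
    timeframe: str
    prediction_timestamp: datetime
    target_timestamp: Optional[datetime]   # when the actual result becomes available
    data_range_start: datetime             # first of the ~600 input candles
    data_range_end: datetime               # last input candle (inference time)
    norm_params: Dict[str, float]          # small dict of normalization parameters
    predicted_action: str
    predicted_candle: Optional[List[float]] = None
    confidence: float = 0.0
```

A reference like this is a few hundred bytes, versus several hundred kilobytes for a copied 600-candle window.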
#### 2. Training Trigger Phase

**Time-Based (Candle Completion)**

```
Candle Closes
    ↓
DataProvider emits CandleCompletionEvent
    ↓
InferenceTrainingCoordinator matches inference frames
    ↓
Calls subscriber.on_candle_completion(event, inference_ref)
    ↓
Training adapter retrieves data from DuckDB using reference
    ↓
Train model with actual candle result
```

**Event-Based (Pivot Points)**

```
Pivot Detected (L2L, L2H, etc.)
    ↓
DataProvider emits PivotEvent
    ↓
InferenceTrainingCoordinator finds matching inference frames
    ↓
Calls subscriber.on_pivot_event(event, inference_refs)
    ↓
Training adapter retrieves data from DuckDB
    ↓
Train model with pivot result
```
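The matching step in both flows can be sketched as a minimal coordinator. The matching rules used here are assumptions: exact `target_timestamp` equality for candle completion, and data-range overlap for pivots; frames are represented as plain dicts for brevity.

```python
from datetime import datetime
from typing import List


class InferenceTrainingCoordinator:
    """Minimal sketch: stores frame references and matches them to events."""

    def __init__(self):
        self._frames: List[dict] = []  # registered InferenceFrameReference-like dicts

    def register_inference_frame(self, ref: dict) -> None:
        self._frames.append(ref)

    def on_candle_closed(self, symbol: str, timeframe: str, ts: datetime) -> List[dict]:
        """Match frames whose target_timestamp equals the closed candle's timestamp."""
        matched = [f for f in self._frames
                   if f["symbol"] == symbol and f["timeframe"] == timeframe
                   and f["target_timestamp"] == ts]
        for f in matched:
            self._frames.remove(f)  # each frame is trained on at most once
        return matched

    def on_pivot_detected(self, symbol: str, ts: datetime) -> List[dict]:
        """Match frames whose input data range covers the pivot timestamp."""
        return [f for f in self._frames
                if f["symbol"] == symbol
                and f["data_range_start"] <= ts <= f["data_range_end"]]
```

The real coordinator would also dispatch to registered subscribers and expire stale frames whose target timestamp passed without a match.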
## Implementation Steps

### Step 1: Extend DataProvider

Add subscription methods to `core/data_provider.py`:
```python
def subscribe_candle_completion(self, callback: Callable, symbol: str, timeframe: str):
    """Subscribe to candle completion events."""
    # Register callback; emit event when the candle closes
    ...

def subscribe_pivot_events(self, callback: Callable, symbol: str, timeframe: str, pivot_types: List[str]):
    """Subscribe to pivot events (L2L, L2H, etc.)."""
    # Register callback; emit event when a pivot is detected
    ...
```
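One way the registry and emit step might look, shown as a standalone mixin (illustrative; the real DataProvider internals may differ):

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple


class CandleSubscriptionMixin:
    """Illustrative callback registry for candle-completion events."""

    def __init__(self):
        # (symbol, timeframe) -> list of subscriber callbacks
        self._candle_callbacks: Dict[Tuple[str, str], List[Callable]] = defaultdict(list)

    def subscribe_candle_completion(self, callback: Callable, symbol: str, timeframe: str) -> None:
        self._candle_callbacks[(symbol, timeframe)].append(callback)

    def _emit_candle_completion(self, symbol: str, timeframe: str, candle: dict) -> None:
        """Called internally when a candle closes. One failing callback
        must not prevent the others from firing."""
        for cb in self._candle_callbacks[(symbol, timeframe)]:
            try:
                cb(symbol, timeframe, candle)
            except Exception:
                pass  # log the error in real code
```

Isolating each callback in its own try/except is what makes the "robust error handling" claim concrete: a buggy training adapter cannot stall the data feed.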
### Step 2: Update RealTrainingAdapter

Make `RealTrainingAdapter` implement `TrainingEventSubscriber`:
```python
class RealTrainingAdapter(TrainingEventSubscriber):
    def __init__(self, ...):
        # Initialize InferenceTrainingCoordinator
        self.training_coordinator = InferenceTrainingCoordinator(
            data_provider=self.data_provider,
            duckdb_storage=self.data_provider.duckdb_storage
        )

        # Subscribe to events
        self.training_coordinator.subscribe_to_candle_completion(
            self, symbol='ETH/USDT', timeframe='1m'
        )
        self.training_coordinator.subscribe_to_pivot_events(
            self, symbol='ETH/USDT', timeframe='1m',
            pivot_types=['L2L', 'L2H', 'L3L', 'L3H']
        )

    def on_candle_completion(self, event: CandleCompletionEvent, inference_ref: Optional[InferenceFrameReference]):
        """Called when a candle completes."""
        if not inference_ref:
            return  # No matching inference frame

        # Retrieve inference data from DuckDB
        model_inputs = self.training_coordinator.get_inference_data(inference_ref)
        if not model_inputs:
            return

        # Create training batch with the actual candle
        batch = self._create_training_batch(model_inputs, event.ohlcv, inference_ref)

        # Train model (backprop for Transformer, other methods for other models)
        self._train_on_batch(batch, inference_ref)

    def on_pivot_event(self, event: PivotEvent, inference_refs: List[InferenceFrameReference]):
        """Called when a pivot is detected."""
        for inference_ref in inference_refs:
            # Retrieve inference data
            model_inputs = self.training_coordinator.get_inference_data(inference_ref)
            if not model_inputs:
                continue

            # Create training batch with the pivot result
            batch = self._create_pivot_training_batch(model_inputs, event, inference_ref)

            # Train model
            self._train_on_batch(batch, inference_ref)
```
### Step 3: Update Inference Loop

In `_realtime_inference_loop()`, register inference frames:
```python
# After making a prediction
prediction = self._make_realtime_prediction(...)

# Create inference frame reference
inference_ref = InferenceFrameReference(
    inference_id=str(uuid.uuid4()),
    symbol=symbol,
    timeframe=timeframe,
    prediction_timestamp=datetime.now(timezone.utc),
    target_timestamp=next_candle_time,  # For candles
    data_range_start=start_time,        # 600 candles before
    data_range_end=current_time,
    norm_params=norm_params,
    predicted_action=prediction['action'],
    predicted_candle=prediction['predicted_candle'],
    confidence=prediction['confidence']
)

# Register with coordinator (no copying!)
self.training_coordinator.register_inference_frame(inference_ref)
```
## Benefits

- **Memory Efficient**: No copying of 600 candles every second
- **Flexible**: Supports time-based (candle) and event-based (pivot) training
- **Robust**: Event-driven architecture with proper error handling
- **Simple**: Clear separation of concerns
- **Scalable**: DuckDB handles queries efficiently
- **Extensible**: Easy to add new training methods or event types
## DuckDB Schema Extensions

Ensure DuckDB stores:
- OHLCV data (already exists)
- MA indicators (add to `ohlcv_data` or a separate table)
- Pivot points (add a `pivot_points` table)
```sql
-- Add technical indicators to ohlcv_data
ALTER TABLE ohlcv_data ADD COLUMN sma_10 DOUBLE;
ALTER TABLE ohlcv_data ADD COLUMN sma_20 DOUBLE;
ALTER TABLE ohlcv_data ADD COLUMN ema_12 DOUBLE;
-- ... etc

-- Create pivot points table
CREATE TABLE IF NOT EXISTS pivot_points (
    id INTEGER PRIMARY KEY,
    symbol VARCHAR NOT NULL,
    timeframe VARCHAR NOT NULL,
    timestamp BIGINT NOT NULL,
    price DOUBLE NOT NULL,
    pivot_type VARCHAR NOT NULL,  -- 'L2L', 'L2H', etc.
    level INTEGER NOT NULL,
    strength DOUBLE NOT NULL,
    UNIQUE(symbol, timeframe, timestamp, pivot_type)
);
```
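With that table in place, a training adapter can fetch the pivots inside a frame's data range with a parameterized query along these lines (illustrative, using the frame's `data_range_start` / `data_range_end` as bind parameters):

```sql
SELECT timestamp, price, pivot_type, level, strength
FROM pivot_points
WHERE symbol = 'ETH/USDT'
  AND timeframe = '1m'
  AND timestamp BETWEEN ? AND ?   -- frame's data_range_start / data_range_end
ORDER BY timestamp;
```

Because only timestamps are stored in the reference, this query is the point where the actual data is materialized, once, at training time.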
## Next Steps

1. Implement the DataProvider subscription methods
2. Update RealTrainingAdapter to use InferenceTrainingCoordinator
3. Extend the DuckDB schema for indicators and pivots
4. Test with live inference
5. Add support for other model types (not just Transformer)