Event-Driven Inference Training System - Architecture & Refactoring

Overview

This refactoring implements a complete event-driven, reference-based inference training system that eliminates code duplication and provides a flexible, robust training pipeline.

Architecture Decisions

Component Placement

1. InferenceTrainingCoordinator → TradingOrchestrator

Rationale:

  • Orchestrator already manages models, training, and predictions
  • Centralizes coordination logic
  • Reduces duplication (orchestrator has model access)
  • Natural fit for inference-training coordination

Location: core/orchestrator.py (line ~514)

self.inference_training_coordinator = InferenceTrainingCoordinator(
    data_provider=self.data_provider,
    duckdb_storage=self.data_provider.duckdb_storage
)

Benefits:

  • Single source of truth for inference frame management
  • Reuses orchestrator's model access
  • Eliminates duplicate prediction storage logic

2. Event Subscription Methods → DataProvider

Rationale:

  • Data layer responsibility - emits events when data changes
  • Natural place for candle completion and pivot detection

Location: core/data_provider.py

  • subscribe_candle_completion() - Subscribe to candle completion events
  • subscribe_pivot_events() - Subscribe to pivot events
  • _emit_candle_completion() - Emit when candle closes
  • _emit_pivot_event() - Emit when pivot detected
  • _check_and_emit_pivot_events() - Check for new pivots
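
A minimal sketch of the subscription mechanism behind these methods, assuming a simple in-process callback registry. The event payload fields, the lock, and the mixin class name are illustrative assumptions; only the method names come from the actual DataProvider.

import threading
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List


@dataclass
class CandleCompletionEvent:
    symbol: str
    timeframe: str
    timestamp: datetime      # close time of the completed candle
    candle: dict             # OHLCV values of the completed candle


class DataProviderEventsSketch:
    def __init__(self):
        self._candle_subscribers: List[Callable[[CandleCompletionEvent], None]] = []
        self._subscriber_lock = threading.Lock()

    def subscribe_candle_completion(self, callback: Callable[[CandleCompletionEvent], None]) -> None:
        """Register a callback that fires whenever a candle closes."""
        with self._subscriber_lock:
            self._candle_subscribers.append(callback)

    def _emit_candle_completion(self, event: CandleCompletionEvent) -> None:
        """Notify all subscribers; a failing subscriber must not block the data path."""
        with self._subscriber_lock:
            subscribers = list(self._candle_subscribers)
        for callback in subscribers:
            try:
                callback(event)
            except Exception as exc:
                print(f"candle completion subscriber failed: {exc}")

subscribe_pivot_events() / _emit_pivot_event() follow the same register-then-fan-out pattern for pivot events.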

Benefits:

  • Clean separation of concerns
  • Event-driven architecture
  • Easy to extend with new event types

3. TrainingEventSubscriber Interface → RealTrainingAdapter

Rationale:

  • Training layer implements subscriber interface
  • Receives callbacks for training events

Location: ANNOTATE/core/real_training_adapter.py

  • Implements TrainingEventSubscriber
  • on_candle_completion() - Train on candle completion
  • on_pivot_event() - Train on pivot detection
  • Uses orchestrator's coordinator (no duplication)
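
An illustrative sketch of how RealTrainingAdapter might implement the subscriber interface. Only on_candle_completion(), on_pivot_event(), the reuse of the orchestrator's coordinator, and the single training lock are taken from the design above; the build_training_batch() helper and attribute names are hypothetical stand-ins.

import threading


class TrainingEventSubscriber:
    """Interface implemented by training adapters."""

    def on_candle_completion(self, event, inference_ref):
        raise NotImplementedError

    def on_pivot_event(self, event, inference_refs):
        raise NotImplementedError


class RealTrainingAdapterSketch(TrainingEventSubscriber):
    def __init__(self, orchestrator):
        # Reuse the coordinator owned by the orchestrator (no duplication)
        self.coordinator = orchestrator.inference_training_coordinator
        self._training_lock = threading.Lock()

    def on_candle_completion(self, event, inference_ref):
        # Train on the now-known candle result for one stored inference frame
        with self._training_lock:
            batch = self.coordinator.build_training_batch(inference_ref, event)  # hypothetical helper
            self._train_one_step(batch)

    def on_pivot_event(self, event, inference_refs):
        # Train on every inference frame matched to the detected pivot
        with self._training_lock:
            for ref in inference_refs:
                batch = self.coordinator.build_training_batch(ref, event)  # hypothetical helper
                self._train_one_step(batch)

    def _train_one_step(self, batch):
        ...  # delegate to the model-specific training routine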

Benefits:

  • Clear interface for training adapters
  • Supports multiple training methods
  • Easy to add new adapters

Code Duplication Reduction

Before (Duplicated Logic)

  1. Data Retrieval:

    • _get_realtime_market_data() in RealTrainingAdapter
    • Similar logic in orchestrator
    • Similar logic in data_provider
  2. Prediction Storage:

    • store_transformer_prediction() in orchestrator
    • inference_input_cache in RealTrainingAdapter session (copying 600 candles!)
    • prediction_cache in app.py
  3. Training Coordination:

    • Training logic scattered across multiple files
    • No centralized coordination

After (Centralized)

  1. Data Retrieval:

    • Single source: data_provider.get_historical_data() queries DuckDB
    • Coordinator retrieves data on-demand using references
    • No copying - just timestamp ranges
  2. Prediction Storage:

    • Orchestrator's inference_training_coordinator manages references
    • References stored (not copied) - just timestamp ranges + norm_params
    • Data retrieved from DuckDB when needed
  3. Training Coordination:

    • Orchestrator's coordinator handles event distribution
    • RealTrainingAdapter implements subscriber interface
    • Single training lock in RealTrainingAdapter

Implementation Details

Reference-Based Storage

InferenceFrameReference stores:

  • data_range_start / data_range_end (timestamp range for 600 candles)
  • norm_params (small dict - can be stored)
  • predicted_action, predicted_candle, confidence
  • target_timestamp (for candles - when result will be available)
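
A sketch of the reference structure using the field names listed above. The types, and the symbol/timeframe fields, are assumptions; the point is that only timestamps and small normalization params are stored, never the 600-candle window itself.

from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional


@dataclass
class InferenceFrameReference:
    symbol: str
    timeframe: str
    data_range_start: datetime            # first candle of the 600-candle window
    data_range_end: datetime              # last candle of the 600-candle window
    norm_params: Dict[str, float]         # small dict, cheap to keep in memory
    predicted_action: str
    predicted_candle: Optional[dict] = None
    confidence: float = 0.0
    target_timestamp: Optional[datetime] = None  # when the actual result becomes available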

No copying is needed - when training is triggered:

  1. Coordinator retrieves data from DuckDB using reference
  2. Normalizes using stored params
  3. Creates training batch
  4. Trains model
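
A hedged sketch of these four steps. get_historical_data() is named in this document, but the keyword arguments shown, the assumption that it returns a list of candle dicts (it may return a DataFrame), and the _apply_norm()/train_step() helpers are illustrative stand-ins for the adapter's real training code.

def _apply_norm(value: float, params: dict) -> float:
    # Illustrative normalization only; the real scheme lives with the model code
    return (value - params.get("mean", 0.0)) / max(params.get("std", 1.0), 1e-9)


def train_on_reference(data_provider, model_trainer, ref, actual_candle):
    # 1. Retrieve the 600-candle window from DuckDB using the stored reference
    candles = data_provider.get_historical_data(
        symbol=ref.symbol,
        timeframe=ref.timeframe,
        start_time=ref.data_range_start,
        end_time=ref.data_range_end,
    )

    # 2. Normalize with the params captured at inference time
    normalized = [
        {key: _apply_norm(value, ref.norm_params) if isinstance(value, (int, float)) else value
         for key, value in candle.items()}
        for candle in candles
    ]

    # 3. Pair the inputs with the now-known outcome and the original prediction
    batch = {
        "inputs": normalized,
        "target": actual_candle,
        "predicted": ref.predicted_candle,
        "action": ref.predicted_action,
    }

    # 4. Train the model on the single batch
    model_trainer.train_step(batch)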

Event-Driven Training

Time-Based (Candle Completion)

Candle Closes
    ↓
DataProvider._update_candle() detects new candle
    ↓
_emit_candle_completion() called
    ↓
InferenceTrainingCoordinator._handle_candle_completion()
    ↓
Matches inference frames by target_timestamp
    ↓
Calls subscriber.on_candle_completion(event, inference_ref)
    ↓
RealTrainingAdapter retrieves data from DuckDB
    ↓
Trains model with actual candle result
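
A sketch of the coordinator's candle-completion handling. Matching on target_timestamp follows the flow above; the in-memory storage keyed by (symbol, timeframe), the subscriber list, and the 1-second tolerance are assumptions.

from datetime import timedelta


class CoordinatorCandleHandlingSketch:
    def __init__(self):
        self._frames = {}           # (symbol, timeframe) -> list[InferenceFrameReference]
        self._subscribers = []      # TrainingEventSubscriber instances

    def _handle_candle_completion(self, event):
        key = (event.symbol, event.timeframe)
        tolerance = timedelta(seconds=1)
        remaining = []
        for ref in self._frames.get(key, []):
            if ref.target_timestamp and abs(ref.target_timestamp - event.timestamp) <= tolerance:
                # The predicted candle's actual result is now known: hand it to training
                for subscriber in self._subscribers:
                    subscriber.on_candle_completion(event, ref)
            else:
                remaining.append(ref)
        self._frames[key] = remaining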

Event-Based (Pivot Points)

Pivot Detected (L2L, L2H, etc.)
    ↓
DataProvider.get_williams_pivot_levels() calculates pivots
    ↓
_check_and_emit_pivot_events() finds new pivots
    ↓
_emit_pivot_event() called
    ↓
InferenceTrainingCoordinator._handle_pivot_event()
    ↓
Finds matching inference frames (within a 5-minute window)
    ↓
Calls subscriber.on_pivot_event(event, inference_refs)
    ↓
RealTrainingAdapter retrieves data from DuckDB
    ↓
Trains model with pivot result
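
A sketch of pivot-event matching against stored inference frames, using the 5-minute window mentioned above. The frame storage mirrors the candle-handling sketch, and comparing the pivot timestamp against each frame's data_range_end is an assumption about how "matching" is defined.

from datetime import timedelta


class CoordinatorPivotHandlingSketch:
    def __init__(self):
        self._frames = {}       # (symbol, timeframe) -> list[InferenceFrameReference]
        self._subscribers = []  # TrainingEventSubscriber instances

    def _handle_pivot_event(self, event):
        window = timedelta(minutes=5)
        matched = [
            ref
            for refs in self._frames.values()
            for ref in refs
            if ref.symbol == event.symbol
            and abs(ref.data_range_end - event.timestamp) <= window
        ]
        if matched:
            for subscriber in self._subscribers:
                subscriber.on_pivot_event(event, matched)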

Key Benefits

  1. Memory Efficient: No copying of 600 candles every second
  2. Event-Driven: Clean separation of concerns
  3. Flexible: Supports time-based (candles) and event-based (pivots)
  4. Centralized: Coordinator in orchestrator reduces duplication
  5. Extensible: Easy to add new training methods or event types
  6. Robust: Proper error handling and thread safety

Files Modified

  1. ANNOTATE/core/inference_training_system.py (NEW)

    • Core system with coordinator and events
  2. core/data_provider.py

    • Added subscription methods
    • Added event emission
    • Added pivot event checking
  3. core/orchestrator.py

    • Integrated InferenceTrainingCoordinator
  4. ANNOTATE/core/real_training_adapter.py

    • Implements TrainingEventSubscriber
    • Uses orchestrator's coordinator
    • Removed old caching code (reference-based now)

Next Steps

  1. Test the System

    • Test candle completion events
    • Test pivot events
    • Test data retrieval from DuckDB
    • Test training on inference frames
  2. Optimize Pivot Detection

    • Add periodic pivot checking (background thread)
    • Cache pivot calculations
    • Emit events more efficiently
  3. Extend DuckDB Schema

    • Add MA indicators to ohlcv_data
    • Create pivot_points table
    • Store technical indicators
  4. Remove Old Code

    • Remove inference_input_cache from session
    • Remove _make_realtime_prediction_with_cache() (deprecated)
    • Clean up duplicate code

Summary

The system is now:

  • Memory efficient - No copying of 600-candle windows
  • Event-driven - Clean architecture
  • Centralized - Coordinator in orchestrator
  • Flexible - Supports multiple training methods
  • Robust - Proper error handling

The refactoring successfully reduces code duplication by:

  1. Centralizing coordination in orchestrator
  2. Using reference-based storage instead of copying
  3. Implementing event-driven architecture
  4. Reusing existing data provider and orchestrator infrastructure