# Architecture Refactoring Plan

## Current Issues

### 1. Duplicate Core Implementations
- **ANNOTATE/core/data_loader.py** vs **core/data_provider.py** - overlapping data loading
- **ANNOTATE/core/inference_training_system.py** vs **core/orchestrator.py** - overlapping training coordination
- **ANNOTATE/core/real_training_adapter.py** - should be in main core
- Multiple data models scattered across both cores

### 2. Import Dependencies
- Main core imports from ANNOTATE/core (wrong direction)
- Circular dependencies between systems
- Inconsistent data flow

### 3. Responsibilities Overlap
- Both orchestrator and InferenceTrainingCoordinator handle training
- Both data_provider and data_loader handle data fetching
- Duplicate model management

## Refactoring Strategy

### Phase 1: Move Core Classes to Main Core

#### 1.1 Move InferenceFrameReference to core/data_models.py
```python
# Move from: ANNOTATE/core/inference_training_system.py
# To: core/data_models.py
from dataclasses import dataclass

@dataclass
class InferenceFrameReference:
    # ... existing implementation
    ...
```

#### 1.2 Integrate InferenceTrainingCoordinator into Orchestrator
```python
# In core/orchestrator.py - merge functionality instead of importing
class TradingOrchestrator:
    def __init__(self):
        # Integrate training coordination directly
        self.training_event_subscribers = []
        self.inference_frames = {}
        # ... merge InferenceTrainingCoordinator methods
```

#### 1.3 Move RealTrainingAdapter to Main Core
```python
# Move from: ANNOTATE/core/real_training_adapter.py
# To: core/enhanced_rl_training_adapter.py (extend existing)
```

### Phase 2: Eliminate ANNOTATE/core/data_loader.py

#### 2.1 Extend Main DataProvider
```python
# In core/data_provider.py - add methods from HistoricalDataLoader
class DataProvider:
    def get_data_for_annotation(self, symbol, timeframe, start_time=None, end_time=None,
                                limit=2500, direction='latest'):
        """Method specifically for annotation UI needs"""
        # Implement annotation-specific data loading

    def get_multi_timeframe_data(self, symbol, timeframes, start_time=None, end_time=None,
                                 limit=2500):
        """Multi-timeframe data for annotation UI"""
        # Implement multi-timeframe loading
```

#### 2.2 Update ANNOTATE App
```python
# In ANNOTATE/web/app.py
from core.data_provider import DataProvider  # Use main data provider directly

class AnnotationDashboard:
    def __init__(self):
        # Use main data provider instead of wrapper
        # (`config` is assumed to be the app's existing configuration object)
        self.data_provider = DataProvider(config)
```

### Phase 3: Consolidate Training Systems

#### 3.1 Merge Training Responsibilities
```python
# In core/orchestrator.py
class TradingOrchestrator:
    def subscribe_training_events(self, callback, event_types):
        """Unified training event subscription"""

    def store_inference_frame(self, symbol, timeframe, prediction_data):
        """Store inference frames for training"""

    def trigger_training_on_event(self, event_type, event_data):
        """Unified training trigger system"""
```

#### 3.2 Remove Duplicate Classes
- Delete ANNOTATE/core/inference_training_system.py
- Delete ANNOTATE/core/data_loader.py
- Move useful methods to main core classes

### Phase 4: Clean Architecture

#### 4.1 Single Data Flow
```
Exchange APIs → DataProvider → Orchestrator → Models
                     ↓              ↓
                ANNOTATE UI ← Training System
```

#### 4.2 Clear Responsibilities
- **core/data_provider.py**: All data fetching, caching, real-time integration
- **core/orchestrator.py**: All model coordination, training events, inference
- **core/data_models.py**: All shared data structures
- **ANNOTATE/**: UI only, no core logic

## Implementation Steps

### Step 1: Move InferenceFrameReference
1. Copy class to core/data_models.py
2. Update imports in orchestrator
3. Remove from ANNOTATE/core/

### Step 2: Integrate Training Coordination
1. Move InferenceTrainingCoordinator methods into orchestrator
2. Update ANNOTATE app to use orchestrator directly
3. Remove duplicate training system

### Step 3: Extend DataProvider
1. Add annotation-specific methods to main DataProvider
2. Update ANNOTATE app to use main DataProvider
3. Remove ANNOTATE/core/data_loader.py

### Step 4: Clean Up
1. Remove ANNOTATE/core/ directory entirely
2. Update all imports
3. Test live data flow

## Expected Benefits

### 1. Single Source of Truth
- One DataProvider handling all data
- One Orchestrator handling all training
- One set of data models

### 2. Proper Live Data Flow
- WebSocket → DataProvider → API → Charts
- No duplicate caching or stale data

### 3. Cleaner Architecture
- ANNOTATE becomes pure UI
- Core contains all business logic
- Clear dependency direction

### 4. Easier Maintenance
- No duplicate code to maintain
- Single place to fix issues
- Consistent behavior across apps

## Files to Modify

### Move/Merge:
- ANNOTATE/core/inference_training_system.py → core/orchestrator.py
- ANNOTATE/core/real_training_adapter.py → core/enhanced_rl_training_adapter.py
- InferenceFrameReference → core/data_models.py

### Update:
- ANNOTATE/web/app.py (use main core classes)
- core/orchestrator.py (integrate training coordination)
- core/data_provider.py (add annotation methods)

### Delete:
- ANNOTATE/core/data_loader.py
- ANNOTATE/core/inference_training_system.py (after merge)
- Entire ANNOTATE/core/ directory (eventually)
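
## Example Sketch: Unified Training Events

The three orchestrator methods proposed in Phase 3 (`subscribe_training_events`, `store_inference_frame`, `trigger_training_on_event`) are only stubs in this plan. The following is a minimal, self-contained sketch of how they might fit together. It is not the existing implementation. The fields on `InferenceFrameReference`, the dict-based subscriber registry, the callback signature `(event_type, event_data)`, and the event name `"candle_completed"` are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable, Dict, Iterable, List, Tuple


@dataclass
class InferenceFrameReference:
    """Hypothetical stand-in; the real class keeps its existing fields."""
    symbol: str
    timeframe: str
    prediction_data: Dict[str, Any]
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


TrainingCallback = Callable[[str, Dict[str, Any]], None]


class TradingOrchestrator:
    """Sketch of the merged training-coordination surface only."""

    def __init__(self) -> None:
        # Assumption: subscribers are grouped by event type.
        self.training_event_subscribers: Dict[str, List[TrainingCallback]] = defaultdict(list)
        # Assumption: frames are grouped by (symbol, timeframe).
        self.inference_frames: Dict[Tuple[str, str], List[InferenceFrameReference]] = defaultdict(list)

    def subscribe_training_events(self, callback: TrainingCallback,
                                  event_types: Iterable[str]) -> None:
        """Register one callback for each listed event type."""
        for event_type in event_types:
            self.training_event_subscribers[event_type].append(callback)

    def store_inference_frame(self, symbol: str, timeframe: str,
                              prediction_data: Dict[str, Any]) -> InferenceFrameReference:
        """Keep a reference to an inference result for later training."""
        frame = InferenceFrameReference(symbol, timeframe, prediction_data)
        self.inference_frames[(symbol, timeframe)].append(frame)
        return frame

    def trigger_training_on_event(self, event_type: str,
                                  event_data: Dict[str, Any]) -> int:
        """Notify subscribers of event_type; return how many were called."""
        # .get() avoids creating an empty entry for unknown event types.
        callbacks = list(self.training_event_subscribers.get(event_type, []))
        for callback in callbacks:
            callback(event_type, event_data)
        return len(callbacks)


# Usage: the ANNOTATE app talks to the orchestrator directly.
orchestrator = TradingOrchestrator()
received = []
orchestrator.subscribe_training_events(
    lambda event_type, data: received.append((event_type, data)),
    ["candle_completed"],  # hypothetical event name
)
orchestrator.store_inference_frame("ETH/USDT", "1m", {"action": "BUY"})
notified = orchestrator.trigger_training_on_event(
    "candle_completed", {"symbol": "ETH/USDT"}
)
```

Keeping the subscriber registry and frame store on the orchestrator means ANNOTATE needs no coordinator class of its own, which matches the "UI only, no core logic" goal. A real merge would also need to decide on thread safety and frame retention, which this sketch does not address.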