# ANNOTATE - Model Training & Inference Guide

## 🎯 Overview

This guide covers how to use the ANNOTATE system for:

1. **Generating Training Data** - From manual annotations
2. **Training Models** - Using annotated test cases
3. **Real-Time Inference** - Live model predictions with streaming data

---

## 📦 Test Case Generation

### Automatic Generation

When you save an annotation, a test case is **automatically generated** and saved to disk.

**Location**: `ANNOTATE/data/test_cases/annotation_.json`

### What's Included

Each test case contains:

- **Market State** - OHLCV data for all 4 timeframes (100 candles each)
- **Entry/Exit Prices** - Exact prices from the annotation
- **Expected Outcome** - Direction (LONG/SHORT) and P&L percentage
- **Timestamp** - When the trade occurred
- **Action** - BUY or SELL signal

### Test Case Format

```json
{
  "test_case_id": "annotation_uuid",
  "symbol": "ETH/USDT",
  "timestamp": "2024-01-15T10:30:00Z",
  "action": "BUY",
  "market_state": {
    "ohlcv_1s": {
      "timestamps": [...],  // 100 candles
      "open": [...],
      "high": [...],
      "low": [...],
      "close": [...],
      "volume": [...]
    },
    "ohlcv_1m": {...},  // 100 candles
    "ohlcv_1h": {...},  // 100 candles
    "ohlcv_1d": {...}   // 100 candles
  },
  "expected_outcome": {
    "direction": "LONG",
    "profit_loss_pct": 2.5,
    "entry_price": 2400.50,
    "exit_price": 2460.75,
    "holding_period_seconds": 300
  }
}
```

---

## 🎓 Model Training

### Available Models

The system integrates with your existing models:

- **StandardizedCNN** - CNN model for pattern recognition
- **DQN** - Deep Q-Network for reinforcement learning
- **Transformer** - Transformer model for sequence analysis
- **COB** - Order book-based RL model

### Training Process

#### Step 1: Create Annotations

1. Mark profitable trades on historical data
2. Test cases are auto-generated and saved
3. Verify that the test cases exist in `ANNOTATE/data/test_cases/`

#### Step 2: Select Model

1. Open the training panel (right sidebar)
2. Select a model from the dropdown
3. Available models are loaded from the orchestrator

#### Step 3: Start Training

1. Click the **"Train Model"** button
2. The system loads all test cases from disk
3. Training starts in a background thread
4. Progress is displayed in real time

#### Step 4: Monitor Progress

- **Current Epoch** - Shows training progress
- **Loss** - Training loss value
- **Status** - Running/Completed/Failed

### Training Details

**What Happens During Training:**

1. The system loads all test cases from `ANNOTATE/data/test_cases/`
2. It prepares the training data (market state → expected outcome)
3. It calls the model's training method
4. Model weights are updated based on the annotations
5. The updated model checkpoint is saved

**Training Parameters:**

- **Epochs**: 10 (configurable)
- **Batch Size**: Depends on model
- **Learning Rate**: Model-specific
- **Data**: All available test cases

---

## Real-Time Inference

### Overview

Real-time inference mode runs your trained model on **live streaming data** from the DataProvider and generates predictions as new data arrives.

### Starting Real-Time Inference

#### Step 1: Select Model

Choose the model you want to run inference with.

#### Step 2: Start Inference

1. Click the **"Start Live Inference"** button
2. The system loads the model from the orchestrator
3. It connects to the DataProvider's live data stream
4. It begins generating a prediction every second

#### Step 3: Monitor Signals

- **Latest Signal** - BUY/SELL/HOLD
- **Confidence** - Model confidence (0-100%)
- **Price** - Current market price
- **Timestamp** - When the signal was generated

### How It Works

```
DataProvider (Live Data)
    ↓
Latest Market State (4 timeframes)
    ↓
Model Inference
    ↓
Prediction (Action + Confidence)
    ↓
Display on UI + Chart Markers
```

### Signal Display

- Signals appear in the training panel
- The latest 50 signals are stored
- Display on charts is planned (future feature)
- Updates every second

### Stopping Inference

1. Click the **"Stop Inference"** button
2. The inference loop terminates
3. Final signals remain visible

---

## 🔧 Integration with Orchestrator

### Model Loading

Models are loaded directly from the orchestrator:

```python
# CNN Model
model = orchestrator.cnn_model

# DQN Agent
model = orchestrator.rl_agent

# Transformer
model = orchestrator.primary_transformer

# COB RL
model = orchestrator.cob_rl_agent
```

### Data Consistency

- Uses the **same DataProvider** as the main system
- Same cached data
- Same data structure
- Training and inference therefore see the same data as the main system

---

## 📊 Training Workflow Example

### Scenario: Train CNN on Breakout Patterns

**Step 1: Annotate Trades**

```
1. Find 10 clear breakout patterns
2. Mark entry/exit for each
3. Test cases auto-generated
4. Result: 10 test cases in ANNOTATE/data/test_cases/
```

**Step 2: Train Model**

```
1. Select "StandardizedCNN" from dropdown
2. Click "Train Model"
3. System loads 10 test cases
4. Training runs for 10 epochs
5. Model learns breakout patterns
```

**Step 3: Test with Real-Time Inference**

```
1. Click "Start Live Inference"
2. Model analyzes live data
3. Generates BUY signals on breakouts
4. Monitor confidence levels
5. Verify model learned correctly
```

---

## 🎯 Best Practices

### For Training

**1. Quality Over Quantity**

- Start with 10-20 high-quality annotations
- Focus on clear, obvious patterns
- Verify that each annotation is correct

**2. Diverse Scenarios**

- Include different market conditions
- Mix LONG and SHORT trades
- Cover various timeframes and volatility levels

**3. Incremental Training**

- Train with small batches first
- Verify that the model learns correctly
- Add more annotations gradually

**4. Test After Training**

- Use real-time inference to verify
- Check whether the model recognizes the patterns
- Adjust annotations if needed

### For Real-Time Inference

**1. Monitor Confidence**

- High confidence (>70%) = Strong signal
- Medium confidence (50-70%) = Moderate signal
- Low confidence (<50%) = Weak signal

**2. Verify Against Charts**

- Check whether the signals make sense
- Compare with your own analysis
- Look for false positives

**3. Track Performance**

- Note which signals were correct
- Identify patterns in the errors
- Use these insights to improve annotations

---

## 🔍 Troubleshooting

### Training Issues

**Issue**: "No test cases found"
- **Solution**: Create annotations first; test cases are generated automatically when an annotation is saved.

**Issue**: Training fails immediately
- **Solution**: Check that the model is loaded in the orchestrator and verify the test case format.

**Issue**: Loss not decreasing
- **Solution**: The model may need more or better-quality annotations; check the data quality.

### Inference Issues

**Issue**: No signals generated
- **Solution**: Verify that the DataProvider has live data and check that the model is loaded.

**Issue**: All signals are HOLD
- **Solution**: The model may need more training; check the confidence levels.

**Issue**: Signals don't match expectations
- **Solution**: Review the training data; different annotations may be needed.

---

## 📈 Performance Metrics

### Training Metrics

- **Loss** - Lower is better (target: <0.1)
- **Accuracy** - Higher is better (target: >80%)
- **Epochs** - Number of passes over the test cases; more epochs mean more training, but can overfit a small annotation set
- **Duration** - Training time in seconds

### Inference Metrics

- **Latency** - Time to generate a prediction (~1s)
- **Confidence** - Model certainty (0-100%)
- **Signal Rate** - Predictions per minute
- **Accuracy** - Correct predictions vs total

---

## Advanced Usage

### Custom Training Parameters

Edit `ANNOTATE/core/training_simulator.py`:

```python
'total_epochs': 10,  # Increase for more training
```

### Model-Specific Training

Each model type has its own training method:

- `_train_cnn()` - For CNN models
- `_train_dqn()` - For DQN agents
- `_train_transformer()` - For Transformers
- `_train_cob()` - For COB models

### Batch Training

Training on a selected subset of annotations is planned but not yet available:

```python
# In future: Select specific annotations for training
annotation_ids = ['id1', 'id2', 'id3']
```

---

## 📝 File Locations

### Test Cases

```
ANNOTATE/data/test_cases/annotation_.json
```

### Training Results

```
ANNOTATE/data/training_results/
```

### Model Checkpoints

```
models/checkpoints/ (main system)
```

---

## 🎊 Summary

The ANNOTATE system provides:

- **Automatic Test Case Generation** - From annotations
- **Production-Ready Training** - Integrates with orchestrator
- **Real-Time Inference** - Live predictions on streaming data
- **Data Consistency** - Same data as main system
- **Easy Monitoring** - Real-time progress and signals

**You can now:**

1. Mark profitable trades
2. Generate training data automatically
3. Train models with your annotations
4. Test models with real-time inference
5. Monitor model performance live

---

**Happy Training!**
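
As a minimal sketch of working with the test case format described under "Test Case Format", the snippet below loads every JSON file from a test case directory and checks that the stored `profit_loss_pct` roughly agrees with the entry and exit prices. The function names (`load_test_cases`, `recomputed_pnl_pct`, `pnl_is_consistent`), the P&L formula for SHORT trades, and the tolerance are illustrative assumptions, not part of ANNOTATE's own code.

```python
import json
from pathlib import Path


def load_test_cases(directory):
    """Load every *.json file in `directory` into a list of dicts."""
    cases = []
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            cases.append(json.load(fh))
    return cases


def recomputed_pnl_pct(outcome):
    """Recompute the P&L percentage from entry/exit prices (assumed formula)."""
    entry = outcome["entry_price"]
    exit_price = outcome["exit_price"]
    change = (exit_price - entry) / entry * 100.0
    # Assumption: a SHORT trade profits when the price falls.
    return change if outcome["direction"] == "LONG" else -change


def pnl_is_consistent(case, tolerance_pct=0.1):
    """True when the stored P&L is within `tolerance_pct` of the recomputed value."""
    outcome = case["expected_outcome"]
    difference = abs(recomputed_pnl_pct(outcome) - outcome["profit_loss_pct"])
    return difference <= tolerance_pct
```

Running `pnl_is_consistent` over the result of `load_test_cases("ANNOTATE/data/test_cases")` before training is one way to spot annotations whose direction or prices were recorded incorrectly.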
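
The signal handling described under "Signal Display" and "Monitor Confidence" (only the latest 50 signals are kept; above 70% is strong, 50-70% moderate, below 50% weak) could be sketched as follows. The `Signal` and `SignalHistory` names are hypothetical, and treating exactly 70% as moderate is an assumption, since the guide does not say which band the boundary belongs to.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Signal:
    action: str        # "BUY", "SELL", or "HOLD"
    confidence: float  # model confidence in percent, 0-100
    price: float       # market price when the signal was generated
    timestamp: str     # when the signal was generated


def confidence_label(confidence):
    """Map a confidence percentage to the guide's three bands."""
    if confidence > 70:
        return "strong"
    if confidence >= 50:
        return "moderate"
    return "weak"


class SignalHistory:
    """Keeps only the most recent `maxlen` signals (hypothetical helper)."""

    def __init__(self, maxlen=50):
        self._signals = deque(maxlen=maxlen)

    def add(self, signal):
        # When the deque is full, appending drops the oldest signal.
        self._signals.append(signal)

    def latest(self):
        return self._signals[-1] if self._signals else None

    def __len__(self):
        return len(self._signals)
```

A bounded `deque` is used here so that an inference loop producing one signal per second never grows the history beyond the 50 entries the guide mentions.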