save/load data annotations

2025-10-18 23:44:02 +03:00
parent 7646137f11
commit 002d0f7858
7 changed files with 1563 additions and 69 deletions
--- a/ANNOTATE/TRAINING_GUIDE.md
+++ b/ANNOTATE/TRAINING_GUIDE.md
@@ -0,0 +1,363 @@
+# ANNOTATE - Model Training & Inference Guide
+
+## 🎯 Overview
+
+This guide covers how to use the ANNOTATE system for:
+1. **Generating Training Data** - From manual annotations
+2. **Training Models** - Using annotated test cases
+3. **Real-Time Inference** - Live model predictions with streaming data
+
+---
+
+## 📦 Test Case Generation
+
+### Automatic Generation
+When you save an annotation, a test case is **automatically generated** and saved to disk.
+
+**Location**: `ANNOTATE/data/test_cases/annotation_<id>.json`
+
+### What's Included
+Each test case contains:
+- ✅ **Market State** - OHLCV data for all 4 timeframes (1s, 1m, 1h, 1d; 100 candles each)
+- ✅ **Entry/Exit Prices** - Exact prices from annotation
+- ✅ **Expected Outcome** - Direction (LONG/SHORT) and P&L percentage
+- ✅ **Timestamp** - When the trade occurred
+- ✅ **Action** - BUY or SELL signal
+
+### Test Case Format
+```json
+{
+  "test_case_id": "annotation_uuid",
+  "symbol": "ETH/USDT",
+  "timestamp": "2024-01-15T10:30:00Z",
+  "action": "BUY",
+  "market_state": {
+    "ohlcv_1s": {
+      "timestamps": [...],  // 100 candles
+      "open": [...],
+      "high": [...],
+      "low": [...],
+      "close": [...],
+      "volume": [...]
+    },
+    "ohlcv_1m": {...},  // 100 candles
+    "ohlcv_1h": {...},  // 100 candles
+    "ohlcv_1d": {...}   // 100 candles
+  },
+  "expected_outcome": {
+    "direction": "LONG",
+    "profit_loss_pct": 2.5,
+    "entry_price": 2400.50,
+    "exit_price": 2460.75,
+    "holding_period_seconds": 300
+  }
+}
+```
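+
+As a quick consistency check on the `expected_outcome` block, the sketch below recomputes `profit_loss_pct` from the entry and exit prices. The formula (simple percentage return, sign flipped for SHORT, no fees or slippage) and the helper name `pnl_pct` are assumptions for illustration; the calculation ANNOTATE actually uses may differ.
+
+```python
+def pnl_pct(direction: str, entry_price: float, exit_price: float) -> float:
+    """Simple percentage return (assumed formula; ignores fees and slippage)."""
+    if entry_price <= 0:
+        raise ValueError("entry_price must be positive")
+    change = (exit_price - entry_price) / entry_price * 100.0
+    if direction == "LONG":
+        return change
+    if direction == "SHORT":
+        # A short position profits when the price falls
+        return -change
+    raise ValueError(f"unknown direction: {direction!r}")
+
+
+# Prices taken from the example test case above
+print(round(pnl_pct("LONG", 2400.50, 2460.75), 2))
+```
+
+With the example prices this comes out at roughly 2.5%, in line with the `profit_loss_pct` shown in the format above.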
+
+---
+
+## 🎓 Model Training
+
+### Available Models
+The system integrates with your existing models:
+- **StandardizedCNN** - CNN model for pattern recognition
+- **DQN** - Deep Q-Network for reinforcement learning
+- **Transformer** - Transformer model for sequence analysis
+- **COB** - Order book-based RL model
+
+### Training Process
+
+#### Step 1: Create Annotations
+1. Mark profitable trades on historical data
+2. Test cases are auto-generated and saved
+3. Verify test cases exist in `ANNOTATE/data/test_cases/`
+
+#### Step 2: Select Model
+1. Open training panel (right sidebar)
+2. Select model from dropdown
+3. Available models are loaded from orchestrator
+
+#### Step 3: Start Training
+1. Click **"Train Model"** button
+2. System loads all test cases from disk
+3. Training starts in background thread
+4. Progress displayed in real-time
+
+#### Step 4: Monitor Progress
+- **Current Epoch** - Shows training progress
+- **Loss** - Training loss value
+- **Status** - Running/Completed/Failed
+
+### Training Details
+
+**What Happens During Training:**
+1. System loads all test cases from `ANNOTATE/data/test_cases/`
+2. Prepares training data (market state → expected outcome)
+3. Calls model's training method
+4. Updates model weights based on annotations
+5. Saves updated model checkpoint
+
+**Training Parameters:**
+- **Epochs**: 10 (configurable)
+- **Batch Size**: Depends on model
+- **Learning Rate**: Model-specific
+- **Data**: All available test cases
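+
+The first two steps above (load test cases, map market state to expected outcome) can be sketched as follows. This is a minimal illustration, not the code in `training_simulator.py`: the function names, the choice of close prices as the only input feature, and the 1/0 label encoding are all assumptions.
+
+```python
+import json
+from pathlib import Path
+
+TIMEFRAMES = ("ohlcv_1s", "ohlcv_1m", "ohlcv_1h", "ohlcv_1d")
+LABELS = {"LONG": 1, "SHORT": 0}  # hypothetical label encoding
+
+
+def load_test_cases(directory):
+    """Read every annotation_*.json file in `directory`, sorted by file name."""
+    paths = sorted(Path(directory).glob("annotation_*.json"))
+    return [json.loads(p.read_text(encoding="utf-8")) for p in paths]
+
+
+def to_training_pair(test_case):
+    """Map one test case to (inputs, target).
+
+    inputs: close-price series for each timeframe present in the test case
+    target: integer label looked up from the expected direction
+    """
+    state = test_case["market_state"]
+    inputs = {tf: state[tf]["close"] for tf in TIMEFRAMES if tf in state}
+    target = LABELS[test_case["expected_outcome"]["direction"]]
+    return inputs, target
+```
+
+Calling `[to_training_pair(tc) for tc in load_test_cases("ANNOTATE/data/test_cases")]` would then yield one pair per saved annotation.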
+
+---
+
+## 🚀 Real-Time Inference
+
+### Overview
+Real-time inference mode runs your trained model on **live streaming data** from the DataProvider and generates a new prediction every second.
+
+### Starting Real-Time Inference
+
+#### Step 1: Select Model
+Choose the model you want to run inference with.
+
+#### Step 2: Start Inference
+1. Click **"Start Live Inference"** button
+2. System loads model from orchestrator
+3. Connects to DataProvider's live data stream
+4. Begins generating predictions every second
+
+#### Step 3: Monitor Signals
+- **Latest Signal** - BUY/SELL/HOLD
+- **Confidence** - Model confidence (0-100%)
+- **Price** - Current market price
+- **Timestamp** - When signal was generated
+
+### How It Works
+
+```
+DataProvider (Live Data)
+    ↓
+Latest Market State (4 timeframes)
+    ↓
+Model Inference
+    ↓
+Prediction (Action + Confidence)
+    ↓
+Display in UI (chart markers are a planned feature)
+```
+
+### Signal Display
+- Signals appear in training panel
+- Latest 50 signals stored
+- Can be displayed on charts (future feature)
+- Updates every second
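+
+Keeping only the latest 50 signals maps naturally onto a bounded queue. The sketch below uses `collections.deque` with `maxlen`; the class name `SignalBuffer` and the signal fields are hypothetical and only mirror the fields listed under "Monitor Signals".
+
+```python
+from collections import deque
+from datetime import datetime, timezone
+
+MAX_SIGNALS = 50  # the guide states that the latest 50 signals are stored
+VALID_ACTIONS = ("BUY", "SELL", "HOLD")
+
+
+class SignalBuffer:
+    """Holds the most recent signals; the oldest entry is dropped when full."""
+
+    def __init__(self, maxlen=MAX_SIGNALS):
+        self._signals = deque(maxlen=maxlen)
+
+    def add(self, action, confidence, price):
+        if action not in VALID_ACTIONS:
+            raise ValueError(f"unknown action: {action!r}")
+        signal = {
+            "action": action,
+            "confidence": confidence,  # percentage, 0-100
+            "price": price,
+            "timestamp": datetime.now(timezone.utc).isoformat(),
+        }
+        self._signals.append(signal)
+        return signal
+
+    def latest(self):
+        """Return the newest signal, or None if nothing has been recorded."""
+        return self._signals[-1] if self._signals else None
+
+    def __len__(self):
+        return len(self._signals)
+```
+
+A deque with `maxlen` discards from the opposite end automatically, so the inference loop never has to trim the list itself.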
+
+### Stopping Inference
+1. Click **"Stop Inference"** button
+2. Inference loop terminates
+3. Final signals remain visible
+
+---
+
+## 🔧 Integration with Orchestrator
+
+### Model Loading
+Models are loaded directly from the orchestrator:
+
+```python
+# CNN Model
+model = orchestrator.cnn_model
+
+# DQN Agent
+model = orchestrator.rl_agent
+
+# Transformer
+model = orchestrator.primary_transformer
+
+# COB RL
+model = orchestrator.cob_rl_agent
+```
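+
+A dropdown selection can be resolved to one of these attributes with a small lookup table. The attribute names come from the snippet above; the dictionary keys, the `get_model` helper, and the error handling are assumptions for illustration.
+
+```python
+from types import SimpleNamespace
+
+# Display name (assumed dropdown value) -> orchestrator attribute
+MODEL_ATTRIBUTES = {
+    "StandardizedCNN": "cnn_model",
+    "DQN": "rl_agent",
+    "Transformer": "primary_transformer",
+    "COB": "cob_rl_agent",
+}
+
+
+def get_model(orchestrator, model_name):
+    """Return the model object for `model_name`, or raise if it is unavailable."""
+    try:
+        attr = MODEL_ATTRIBUTES[model_name]
+    except KeyError:
+        raise ValueError(f"unknown model: {model_name!r}") from None
+    model = getattr(orchestrator, attr, None)
+    if model is None:
+        raise RuntimeError(f"{model_name} is not loaded in the orchestrator")
+    return model
+
+
+# Stand-in orchestrator, for demonstration only
+demo = SimpleNamespace(cnn_model="cnn-instance", rl_agent=None)
+print(get_model(demo, "StandardizedCNN"))
+```
+
+Raising a distinct error for "known but not loaded" makes the "Training fails immediately" case easier to diagnose.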
+
+### Data Consistency
+- Uses **same DataProvider** as main system
+- Same cached data
+- Same data structure
+- Result: training and inference read exactly the same data as the main system
+
+---
+
+## 📊 Training Workflow Example
+
+### Scenario: Train CNN on Breakout Patterns
+
+**Step 1: Annotate Trades**
+```
+1. Find 10 clear breakout patterns
+2. Mark entry/exit for each
+3. Test cases auto-generated
+4. Result: 10 test cases in ANNOTATE/data/test_cases/
+```
+
+**Step 2: Train Model**
+```
+1. Select "StandardizedCNN" from dropdown
+2. Click "Train Model"
+3. System loads 10 test cases
+4. Training runs for 10 epochs
+5. Model learns breakout patterns
+```
+
+**Step 3: Test with Real-Time Inference**
+```
+1. Click "Start Live Inference"
+2. Model analyzes live data
+3. Generates BUY signals on breakouts
+4. Monitor confidence levels
+5. Verify model learned correctly
+```
+
+---
+
+## 🎯 Best Practices
+
+### For Training
+
+**1. Quality Over Quantity**
+- Start with 10-20 high-quality annotations
+- Focus on clear, obvious patterns
+- Verify each annotation is correct
+
+**2. Diverse Scenarios**
+- Include different market conditions
+- Mix LONG and SHORT trades
+- Various timeframes and volatility levels
+
+**3. Incremental Training**
+- Train with small batches first
+- Verify model learns correctly
+- Add more annotations gradually
+
+**4. Test After Training**
+- Use real-time inference to verify
+- Check if model recognizes patterns
+- Adjust annotations if needed
+
+### For Real-Time Inference
+
+**1. Monitor Confidence**
+- High confidence (>70%) = Strong signal
+- Medium confidence (50-70%) = Moderate signal
+- Low confidence (<50%) = Weak signal
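+
+The three bands above translate directly into a small helper. The function name `signal_strength` is hypothetical, and treating exactly 70% as "moderate" is one reading of the ranges given.
+
+```python
+def signal_strength(confidence_pct: float) -> str:
+    """Classify a confidence percentage (0-100) into the guide's three bands."""
+    if not 0.0 <= confidence_pct <= 100.0:
+        raise ValueError("confidence_pct must be between 0 and 100")
+    if confidence_pct > 70.0:
+        return "strong"
+    if confidence_pct >= 50.0:
+        return "moderate"
+    return "weak"
+
+
+for value in (85.0, 60.0, 30.0):
+    print(value, signal_strength(value))
+```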
+
+**2. Verify Against Charts**
+- Check if signals make sense
+- Compare with your own analysis
+- Look for false positives
+
+**3. Track Performance**
+- Note which signals were correct
+- Identify patterns in errors
+- Use insights to improve annotations
+
+---
+
+## 🔍 Troubleshooting
+
+### Training Issues
+
+**Issue**: "No test cases found"
+- **Solution**: Create annotations first; test cases are generated automatically when an annotation is saved
+
+**Issue**: Training fails immediately
+- **Solution**: Check that the model is loaded in the orchestrator and verify the test case format
+
+**Issue**: Loss not decreasing
+- **Solution**: Add more or higher-quality annotations and check the underlying data quality
+
+### Inference Issues
+
+**Issue**: No signals generated
+- **Solution**: Verify that the DataProvider is receiving live data and check that the model is loaded
+
+**Issue**: All signals are HOLD
+- **Solution**: The model may need more training; also check the confidence levels of the signals
+
+**Issue**: Signals don't match expectations
+- **Solution**: Review the training data; the annotations may need to be revised or replaced
+
+---
+
+## 📈 Performance Metrics
+
+### Training Metrics
+- **Loss** - Lower is better (target: <0.1)
+- **Accuracy** - Higher is better (target: >80%)
+- **Epochs** - More epochs give the model more passes over the data, but too many can overfit a small annotation set
+- **Duration** - Training time in seconds
+
+### Inference Metrics
+- **Latency** - Time to generate prediction (~1s)
+- **Confidence** - Model certainty (0-100%)
+- **Signal Rate** - Predictions per minute
+- **Accuracy** - Correct predictions vs total
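+
+Signal rate and accuracy follow directly from simple counts. The helpers below are a sketch with hypothetical names; how ANNOTATE decides that a prediction was "correct" is not specified in this guide and is left to the caller.
+
+```python
+def signal_rate_per_minute(signal_count: int, elapsed_seconds: float) -> float:
+    """Predictions per minute over an observation window."""
+    if elapsed_seconds <= 0:
+        raise ValueError("elapsed_seconds must be positive")
+    return signal_count * 60.0 / elapsed_seconds
+
+
+def accuracy(correct: int, total: int) -> float:
+    """Fraction of correct predictions; 0.0 when nothing has been evaluated."""
+    return correct / total if total else 0.0
+```
+
+At one prediction per second, `signal_rate_per_minute(30, 30.0)` gives 60 predictions per minute.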
+
+---
+
+## 🚀 Advanced Usage
+
+### Custom Training Parameters
+Edit `ANNOTATE/core/training_simulator.py`:
+```python
+'total_epochs': 10,  # Increase for more training
+```
+
+### Model-Specific Training
+Each model type has its own training method:
+- `_train_cnn()` - For CNN models
+- `_train_dqn()` - For DQN agents
+- `_train_transformer()` - For Transformers
+- `_train_cob()` - For COB models
+
+### Batch Training
+Train on specific annotations:
+```python
+# In future: Select specific annotations for training
+annotation_ids = ['id1', 'id2', 'id3']
+```
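+
+Until that feature exists, a selection of annotation IDs can at least be resolved to test case files using the `annotation_<id>.json` naming scheme. The helper `select_test_case_paths` is hypothetical and is not part of ANNOTATE.
+
+```python
+from pathlib import Path
+
+
+def select_test_case_paths(annotation_ids, directory="ANNOTATE/data/test_cases"):
+    """Return (existing_paths, missing_file_names) for the given annotation IDs."""
+    base = Path(directory)
+    candidates = [base / f"annotation_{aid}.json" for aid in annotation_ids]
+    found = [p for p in candidates if p.exists()]
+    missing = [p.name for p in candidates if not p.exists()]
+    return found, missing
+```
+
+Returning the missing file names separately lets the caller warn about IDs that have no test case instead of silently skipping them.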
+
+---
+
+## 📝 File Locations
+
+### Test Cases
+```
+ANNOTATE/data/test_cases/annotation_<id>.json
+```
+
+### Training Results
+```
+ANNOTATE/data/training_results/
+```
+
+### Model Checkpoints
+```
+models/checkpoints/  (main system)
+```
+
+---
+
+## 🎊 Summary
+
+The ANNOTATE system provides:
+
+✅ **Automatic Test Case Generation** - From annotations  
+✅ **Production-Ready Training** - Integrates with orchestrator  
+✅ **Real-Time Inference** - Live predictions on streaming data  
+✅ **Data Consistency** - Same data as main system  
+✅ **Easy Monitoring** - Real-time progress and signals  
+
+**You can now:**
+1. Mark profitable trades
+2. Generate training data automatically
+3. Train models with your annotations
+4. Test models with real-time inference
+5. Monitor model performance live
+
+---
+
+**Happy Training!** 🚀