save/load data anotations
ANNOTATE/TRAINING_GUIDE.md
# ANNOTATE - Model Training & Inference Guide

## 🎯 Overview

This guide covers how to use the ANNOTATE system for:

1. **Generating Training Data** - From manual annotations
2. **Training Models** - Using annotated test cases
3. **Real-Time Inference** - Live model predictions with streaming data

---

## 📦 Test Case Generation

### Automatic Generation
When you save an annotation, a test case is **automatically generated** and saved to disk.

**Location**: `ANNOTATE/data/test_cases/annotation_<id>.json`

### What's Included
Each test case contains:
- ✅ **Market State** - OHLCV data for all 4 timeframes (1s, 1m, 1h, 1d; 100 candles each)
- ✅ **Entry/Exit Prices** - Exact prices from annotation
- ✅ **Expected Outcome** - Direction (LONG/SHORT) and P&L percentage
- ✅ **Timestamp** - When the trade occurred
- ✅ **Action** - BUY or SELL signal

### Test Case Format
```json
{
  "test_case_id": "annotation_uuid",
  "symbol": "ETH/USDT",
  "timestamp": "2024-01-15T10:30:00Z",
  "action": "BUY",
  "market_state": {
    "ohlcv_1s": {
      "timestamps": [...],  // 100 candles
      "open": [...],
      "high": [...],
      "low": [...],
      "close": [...],
      "volume": [...]
    },
    "ohlcv_1m": {...},  // 100 candles
    "ohlcv_1h": {...},  // 100 candles
    "ohlcv_1d": {...}   // 100 candles
  },
  "expected_outcome": {
    "direction": "LONG",
    "profit_loss_pct": 2.5,
    "entry_price": 2400.50,
    "exit_price": 2460.75,
    "holding_period_seconds": 300
  }
}
```
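
The `profit_loss_pct` field can be cross-checked against the entry and exit prices. The sketch below assumes the percentage is measured relative to the entry price and that SHORT trades profit when the price falls; the helper name `profit_loss_pct` is hypothetical and not part of ANNOTATE.

```python
def profit_loss_pct(entry_price: float, exit_price: float, direction: str) -> float:
    """Percentage P&L relative to the entry price (assumed convention)."""
    change = (exit_price - entry_price) / entry_price * 100.0
    if direction == "LONG":
        return change
    if direction == "SHORT":
        return -change  # a falling price is a gain for a short position
    raise ValueError(f"unknown direction: {direction!r}")


# Values taken from the sample test case above.
outcome = {"direction": "LONG", "entry_price": 2400.50, "exit_price": 2460.75}
pnl = profit_loss_pct(outcome["entry_price"], outcome["exit_price"], outcome["direction"])
print(f"{pnl:.2f}%")
```

For the sample values this gives roughly 2.51%, so the `2.5` in the sample appears to be rounded.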

---

## 🎓 Model Training

### Available Models
The system integrates with your existing models:
- **StandardizedCNN** - CNN model for pattern recognition
- **DQN** - Deep Q-Network for reinforcement learning
- **Transformer** - Transformer model for sequence analysis
- **COB** - Order book-based RL model

### Training Process

#### Step 1: Create Annotations
1. Mark profitable trades on historical data
2. Test cases are auto-generated and saved
3. Verify test cases exist in `ANNOTATE/data/test_cases/`

#### Step 2: Select Model
1. Open training panel (right sidebar)
2. Select model from dropdown
3. Available models are loaded from orchestrator

#### Step 3: Start Training
1. Click **"Train Model"** button
2. System loads all test cases from disk
3. Training starts in background thread
4. Progress displayed in real-time

#### Step 4: Monitor Progress
- **Current Epoch** - Shows training progress
- **Loss** - Training loss value
- **Status** - Running/Completed/Failed
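
To illustrate how a background training thread can feed the progress fields listed above, here is a minimal sketch. The `run_training` function, the shape of the `status` dictionary, and the stand-in loss function are assumptions made for illustration; they are not taken from the ANNOTATE code.

```python
import threading


def run_training(status: dict, total_epochs: int, train_one_epoch) -> None:
    """Run epochs in order, recording progress in the shared status dict."""
    status.update(status="Running", current_epoch=0, loss=None)
    try:
        for epoch in range(1, total_epochs + 1):
            status["loss"] = train_one_epoch(epoch)
            status["current_epoch"] = epoch
        status["status"] = "Completed"
    except Exception as exc:  # report the failure instead of dying silently
        status["status"] = "Failed"
        status["error"] = str(exc)


status: dict = {}
# Stand-in for a real epoch: pretend the loss halves every epoch.
worker = threading.Thread(
    target=run_training,
    args=(status, 10, lambda epoch: 1.0 / 2**epoch),
    daemon=True,
)
worker.start()
worker.join()  # a UI would poll `status` periodically instead of blocking
print(status)
```

Running the work in a separate thread keeps the UI responsive, and the three keys map directly onto the Current Epoch, Loss, and Status fields.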

### Training Details

**What Happens During Training:**
1. System loads all test cases from `ANNOTATE/data/test_cases/`
2. Prepares training data (market state → expected outcome)
3. Calls model's training method
4. Updates model weights based on annotations
5. Saves updated model checkpoint
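
Steps 1 and 2 above can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: `load_test_cases` and `to_training_pairs` are hypothetical names, and the demo writes to a temporary directory instead of `ANNOTATE/data/test_cases/`.

```python
import json
import tempfile
from pathlib import Path


def load_test_cases(directory):
    """Read every annotation_<id>.json file in the directory, sorted by file name."""
    cases = []
    for path in sorted(Path(directory).glob("annotation_*.json")):
        with path.open(encoding="utf-8") as fh:
            cases.append(json.load(fh))
    return cases


def to_training_pairs(cases):
    """Pair each market state (model input) with its expected outcome (target)."""
    return [(case["market_state"], case["expected_outcome"]) for case in cases]


# Demonstration against a throwaway directory rather than real annotation data.
with tempfile.TemporaryDirectory() as tmp:
    sample = {
        "market_state": {"ohlcv_1m": {"close": [1.0, 2.0]}},
        "expected_outcome": {"direction": "LONG", "profit_loss_pct": 2.5},
    }
    (Path(tmp) / "annotation_demo.json").write_text(json.dumps(sample), encoding="utf-8")
    pairs = to_training_pairs(load_test_cases(tmp))

print(len(pairs), pairs[0][1]["direction"])
```

How each pair is turned into tensors depends on the model, so that part is left out here.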

**Training Parameters:**
- **Epochs**: 10 (configurable)
- **Batch Size**: Depends on model
- **Learning Rate**: Model-specific
- **Data**: All available test cases

---

## 🚀 Real-Time Inference

### Overview
Real-time inference mode runs your trained model on **live streaming data** from the DataProvider, generating a new prediction every second.

### Starting Real-Time Inference

#### Step 1: Select Model
Choose the model you want to run inference with.

#### Step 2: Start Inference
1. Click **"Start Live Inference"** button
2. System loads model from orchestrator
3. Connects to DataProvider's live data stream
4. Begins generating predictions every second

#### Step 3: Monitor Signals
- **Latest Signal** - BUY/SELL/HOLD
- **Confidence** - Model confidence (0-100%)
- **Price** - Current market price
- **Timestamp** - When signal was generated

### How It Works

```
DataProvider (Live Data)
    ↓
Latest Market State (4 timeframes)
    ↓
Model Inference
    ↓
Prediction (Action + Confidence)
    ↓
Display on UI + Chart Markers
```
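
The "Prediction (Action + Confidence)" step can be illustrated with a small sketch. It assumes the model emits one raw score per action in the order BUY, SELL, HOLD and that confidence is the softmax probability of the winning action. Both are assumptions made for illustration; the real models may expose predictions differently, and `to_prediction` is a hypothetical name.

```python
import math

ACTIONS = ("BUY", "SELL", "HOLD")  # assumed output order


def to_prediction(scores):
    """Convert raw per-action scores into an action and a 0-100% confidence."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # subtract the peak for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(ACTIONS)), key=probs.__getitem__)
    return {"action": ACTIONS[best], "confidence": 100.0 * probs[best]}


prediction = to_prediction([2.0, 0.5, 0.1])
print(prediction["action"], round(prediction["confidence"], 1))
```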

### Signal Display
- Signals appear in training panel
- Latest 50 signals stored
- Can be displayed on charts (future feature)
- Updates every second
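
One simple way to keep only the latest 50 signals is a bounded `collections.deque`. This is a sketch of the idea and not necessarily how ANNOTATE stores its signals; the signal dictionary fields are illustrative.

```python
from collections import deque

signals = deque(maxlen=50)  # the oldest entry is discarded once the buffer is full

for seq in range(60):
    signals.append({"action": "HOLD", "confidence": 50.0, "seq": seq})

# Only the 50 most recent signals remain (seq 10 through 59).
print(len(signals), signals[0]["seq"], signals[-1]["seq"])
```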

### Stopping Inference
1. Click **"Stop Inference"** button
2. Inference loop terminates
3. Final signals remain visible
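
A stoppable once-per-interval loop is commonly built on `threading.Event`. The sketch below shows that pattern with stand-in `predict` and `on_signal` callables; the function name `inference_loop` and its parameters are assumptions, not ANNOTATE's actual loop.

```python
import threading
import time


def inference_loop(stop, predict, on_signal, interval=1.0):
    """Call predict() every `interval` seconds until `stop` is set."""
    while not stop.is_set():
        on_signal(predict())
        stop.wait(interval)  # sleeps, but returns immediately once stop is set


stop = threading.Event()
received = []  # collected signals stay available after the loop ends
worker = threading.Thread(
    target=inference_loop,
    args=(stop, lambda: "HOLD", received.append, 0.01),
    daemon=True,
)
worker.start()
time.sleep(0.05)  # let a few predictions through
stop.set()        # what a "Stop Inference" handler would do
worker.join(timeout=1.0)
print(len(received) >= 1, worker.is_alive())
```

Using `stop.wait(interval)` instead of `time.sleep(interval)` means the loop reacts to the stop request without waiting out the rest of the interval.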

---

## 🔧 Integration with Orchestrator

### Model Loading
Models are loaded directly from the orchestrator:

```python
# CNN Model
model = orchestrator.cnn_model

# DQN Agent
model = orchestrator.rl_agent

# Transformer
model = orchestrator.primary_transformer

# COB RL
model = orchestrator.cob_rl_agent
```
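
A lookup table is one way to turn a dropdown selection into the matching orchestrator attribute. In this sketch the attribute names come from the snippet above, while the dropdown labels other than "StandardizedCNN", the `get_model` helper, and the stand-in orchestrator are assumptions.

```python
from types import SimpleNamespace

# Dropdown label -> orchestrator attribute name.
MODEL_ATTRIBUTES = {
    "StandardizedCNN": "cnn_model",
    "DQN": "rl_agent",
    "Transformer": "primary_transformer",
    "COB": "cob_rl_agent",
}


def get_model(orchestrator, name):
    """Return the named model, or raise a clear error if unknown or not loaded."""
    if name not in MODEL_ATTRIBUTES:
        raise KeyError(f"unknown model: {name!r}")
    model = getattr(orchestrator, MODEL_ATTRIBUTES[name], None)
    if model is None:
        raise RuntimeError(f"{name} is not loaded in the orchestrator")
    return model


# Stand-in orchestrator for demonstration only: CNN loaded, DQN not loaded.
fake = SimpleNamespace(cnn_model=object(), rl_agent=None)
print(get_model(fake, "StandardizedCNN") is fake.cnn_model)
```

Failing loudly when a model is missing makes the "Training fails immediately" case easier to diagnose.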

### Data Consistency
- Uses **same DataProvider** as main system
- Same cached data
- Same data structure
- Training and inference therefore see identically structured inputs

---

## 📊 Training Workflow Example

### Scenario: Train CNN on Breakout Patterns

**Step 1: Annotate Trades**
```
1. Find 10 clear breakout patterns
2. Mark entry/exit for each
3. Test cases auto-generated
4. Result: 10 test cases in ANNOTATE/data/test_cases/
```

**Step 2: Train Model**
```
1. Select "StandardizedCNN" from dropdown
2. Click "Train Model"
3. System loads 10 test cases
4. Training runs for 10 epochs
5. Model learns breakout patterns
```

**Step 3: Test with Real-Time Inference**
```
1. Click "Start Live Inference"
2. Model analyzes live data
3. Generates BUY signals on breakouts
4. Monitor confidence levels
5. Verify model learned correctly
```

---

## 🎯 Best Practices

### For Training

**1. Quality Over Quantity**
- Start with 10-20 high-quality annotations
- Focus on clear, obvious patterns
- Verify each annotation is correct

**2. Diverse Scenarios**
- Include different market conditions
- Mix LONG and SHORT trades
- Various timeframes and volatility levels

**3. Incremental Training**
- Train with small batches first
- Verify model learns correctly
- Add more annotations gradually

**4. Test After Training**
- Use real-time inference to verify
- Check if model recognizes patterns
- Adjust annotations if needed

### For Real-Time Inference

**1. Monitor Confidence**
- High confidence (>70%) = Strong signal
- Medium confidence (50-70%) = Moderate signal
- Low confidence (<50%) = Weak signal
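
The confidence bands above translate directly into a small helper. The function name `signal_strength` and the returned labels are illustrative; the thresholds are the ones listed above, with exactly 70% falling in the moderate band.

```python
def signal_strength(confidence_pct: float) -> str:
    """Classify a 0-100% confidence value into the bands listed above."""
    if not 0.0 <= confidence_pct <= 100.0:
        raise ValueError(f"confidence out of range: {confidence_pct}")
    if confidence_pct > 70.0:
        return "strong"
    if confidence_pct >= 50.0:
        return "moderate"
    return "weak"


for value in (85.0, 70.0, 49.9):
    print(value, signal_strength(value))
```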

**2. Verify Against Charts**
- Check if signals make sense
- Compare with your own analysis
- Look for false positives

**3. Track Performance**
- Note which signals were correct
- Identify patterns in errors
- Use insights to improve annotations

---

## 🔍 Troubleshooting

### Training Issues

**Issue**: "No test cases found"
- **Solution**: Create annotations first; test cases are auto-generated when an annotation is saved

**Issue**: Training fails immediately
- **Solution**: Check that the model is loaded in the orchestrator and verify the test case format

**Issue**: Loss not decreasing
- **Solution**: Add more or higher-quality annotations and check the data quality
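
To check a test case's format, a small validator can compare it against the documented fields. This is a sketch: the required keys are taken from the Test Case Format section, but `validate_test_case` is a hypothetical helper and the real loader may enforce different rules.

```python
REQUIRED_KEYS = {
    "test_case_id", "symbol", "timestamp", "action", "market_state", "expected_outcome",
}
OHLCV_FIELDS = ("timestamps", "open", "high", "low", "close", "volume")


def validate_test_case(case: dict) -> list:
    """Return a list of human-readable problems; an empty list means the case looks valid."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(case))]
    for name, frame in case.get("market_state", {}).items():
        missing = [field for field in OHLCV_FIELDS if field not in frame]
        if missing:
            problems.append(f"{name}: missing {', '.join(missing)}")
        elif len({len(frame[field]) for field in OHLCV_FIELDS}) != 1:
            problems.append(f"{name}: OHLCV arrays differ in length")
    return problems


broken = {"symbol": "ETH/USDT", "market_state": {"ohlcv_1s": {"open": [1, 2], "close": [1]}}}
for problem in validate_test_case(broken):
    print(problem)
```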

### Inference Issues

**Issue**: No signals generated
- **Solution**: Verify that the DataProvider has live data and check that the model is loaded

**Issue**: All signals are HOLD
- **Solution**: The model may need more training; check confidence levels

**Issue**: Signals don't match expectations
- **Solution**: Review the training data; different annotations may be needed

---

## 📈 Performance Metrics

### Training Metrics
- **Loss** - Lower is better (target: <0.1)
- **Accuracy** - Higher is better (target: >80%)
- **Epochs** - More epochs mean more passes over the test cases; too many can overfit a small annotation set
- **Duration** - Training time in seconds

### Inference Metrics
- **Latency** - Time to generate prediction (~1s)
- **Confidence** - Model certainty (0-100%)
- **Signal Rate** - Predictions per minute
- **Accuracy** - Correct predictions vs total
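
Accuracy and signal rate can be computed from a simple log of prediction outcomes. The sketch below assumes you record one boolean per prediction (whether it turned out correct) and the elapsed wall-clock time; `inference_metrics` is a hypothetical helper, not part of ANNOTATE.

```python
def inference_metrics(outcomes, elapsed_seconds):
    """outcomes: one bool per prediction, True if the prediction was correct."""
    total = len(outcomes)
    accuracy = sum(outcomes) / total if total else 0.0
    rate = total / elapsed_seconds * 60.0 if elapsed_seconds > 0 else 0.0
    return {"accuracy": accuracy, "signal_rate_per_min": rate}


# Four predictions in four seconds, three of them correct.
metrics = inference_metrics([True, True, False, True], elapsed_seconds=4.0)
print(metrics)
```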

---

## 🚀 Advanced Usage

### Custom Training Parameters
Edit `ANNOTATE/core/training_simulator.py`:
```python
'total_epochs': 10,  # Increase for more training
```

### Model-Specific Training
Each model type has its own training method:
- `_train_cnn()` - For CNN models
- `_train_dqn()` - For DQN agents
- `_train_transformer()` - For Transformers
- `_train_cob()` - For COB models
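
One common way to route a request to the right per-model method is a dispatch table. The method names below are the ones listed above, but the `TrainingDispatcher` class, the model-type keys, and the placeholder method bodies are assumptions for illustration; the real methods in `training_simulator.py` may have different signatures.

```python
class TrainingDispatcher:
    """Routes a training request to the method for the given model type."""

    def __init__(self):
        self._trainers = {
            "cnn": self._train_cnn,
            "dqn": self._train_dqn,
            "transformer": self._train_transformer,
            "cob": self._train_cob,
        }

    def train(self, model_type, test_cases):
        try:
            trainer = self._trainers[model_type]
        except KeyError:
            raise ValueError(f"no training method for model type {model_type!r}") from None
        return trainer(test_cases)

    # Placeholder bodies: real implementations would update model weights.
    def _train_cnn(self, test_cases):
        return f"cnn: {len(test_cases)} test cases"

    def _train_dqn(self, test_cases):
        return f"dqn: {len(test_cases)} test cases"

    def _train_transformer(self, test_cases):
        return f"transformer: {len(test_cases)} test cases"

    def _train_cob(self, test_cases):
        return f"cob: {len(test_cases)} test cases"


print(TrainingDispatcher().train("dqn", [{}, {}]))
```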

### Batch Training
Training on specific annotations is planned but not yet available:
```python
# In future: Select specific annotations for training
annotation_ids = ['id1', 'id2', 'id3']
```
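
Until that feature exists, a selection step could be approximated by filtering test case files by annotation ID. This sketch relies only on the documented `annotation_<id>.json` naming; `select_test_case_paths` is a hypothetical helper and the demo uses a temporary directory.

```python
import tempfile
from pathlib import Path


def select_test_case_paths(directory, annotation_ids):
    """Return paths of existing test case files for the requested annotation IDs."""
    root = Path(directory)
    candidates = (root / f"annotation_{annotation_id}.json" for annotation_id in annotation_ids)
    return [path for path in candidates if path.is_file()]


with tempfile.TemporaryDirectory() as tmp:
    for annotation_id in ("id1", "id2"):
        (Path(tmp) / f"annotation_{annotation_id}.json").write_text("{}", encoding="utf-8")
    # "id3" has no file on disk, so it is skipped silently.
    selected = [path.name for path in select_test_case_paths(tmp, ["id1", "id3"])]

print(selected)
```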

---

## 📝 File Locations

### Test Cases
```
ANNOTATE/data/test_cases/annotation_<id>.json
```

### Training Results
```
ANNOTATE/data/training_results/
```

### Model Checkpoints
```
models/checkpoints/ (main system)
```

---

## 🎊 Summary

The ANNOTATE system provides:

- ✅ **Automatic Test Case Generation** - From annotations
- ✅ **Production-Ready Training** - Integrates with orchestrator
- ✅ **Real-Time Inference** - Live predictions on streaming data
- ✅ **Data Consistency** - Same data as main system
- ✅ **Easy Monitoring** - Real-time progress and signals

**You can now:**
1. Mark profitable trades
2. Generate training data automatically
3. Train models with your annotations
4. Test models with real-time inference
5. Monitor model performance live

---

**Happy Training!** 🚀