gogo2/ANNOTATE/TRAINING_GUIDE.md
2025-10-25 16:35:08 +03:00

ANNOTATE - Model Training & Inference Guide

🎯 Overview

This guide covers how to use the ANNOTATE system for:

  1. Generating Training Data - From manual annotations
  2. Training Models - Using annotated test cases
  3. Real-Time Inference - Live model predictions with streaming data

📦 Test Case Generation

Automatic Generation

When you save an annotation, a test case is automatically generated and saved to disk.

Location: ANNOTATE/data/test_cases/annotation_<id>.json

What's Included

Each test case contains:

  • Market State - OHLCV data for all 4 timeframes (1s, 1m, 1h, 1d; 100 candles each)
  • Entry/Exit Prices - Exact prices from annotation
  • Expected Outcome - Direction (LONG/SHORT) and P&L percentage
  • Timestamp - When the trade occurred
  • Action - BUY or SELL signal

Test Case Format

{
  "test_case_id": "annotation_uuid",
  "symbol": "ETH/USDT",
  "timestamp": "2024-01-15T10:30:00Z",
  "action": "BUY",
  "market_state": {
    "ohlcv_1s": {
      "timestamps": [...],  // 100 candles
      "open": [...],
      "high": [...],
      "low": [...],
      "close": [...],
      "volume": [...]
    },
    "ohlcv_1m": {...},  // 100 candles
    "ohlcv_1h": {...},  // 100 candles
    "ohlcv_1d": {...}   // 100 candles
  },
  "expected_outcome": {
    "direction": "LONG",
    "profit_loss_pct": 2.5,
    "entry_price": 2400.50,
    "exit_price": 2460.75,
    "holding_period_seconds": 300
  }
}
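
Before training on a generated file, it can help to confirm that it has the shape shown above. The following is a minimal sketch, not part of ANNOTATE itself: the function name `validate_test_case` and the 0.1 percentage-point tolerance are assumptions, and it presumes the file has already been parsed into a dictionary. It checks that all four timeframes are present with equally sized OHLCV arrays and that `profit_loss_pct` agrees with the entry and exit prices.

```python
from typing import List

REQUIRED_TIMEFRAMES = ("ohlcv_1s", "ohlcv_1m", "ohlcv_1h", "ohlcv_1d")
OHLCV_FIELDS = ("timestamps", "open", "high", "low", "close", "volume")


def validate_test_case(case: dict) -> List[str]:
    """Return a list of problems; an empty list means the case looks well formed."""
    problems = []

    market_state = case.get("market_state", {})
    for timeframe in REQUIRED_TIMEFRAMES:
        candles = market_state.get(timeframe)
        if candles is None:
            problems.append(f"missing timeframe {timeframe}")
            continue
        # Every OHLCV array should be present and have the same length.
        lengths = {len(candles.get(field, [])) for field in OHLCV_FIELDS}
        if len(lengths) != 1 or 0 in lengths:
            problems.append(f"{timeframe}: OHLCV arrays are missing, empty, or differ in length")

    outcome = case.get("expected_outcome", {})
    entry = outcome.get("entry_price")
    exit_price = outcome.get("exit_price")
    reported = outcome.get("profit_loss_pct")
    if not entry or exit_price is None or reported is None:
        problems.append("expected_outcome is missing prices or profit_loss_pct")
    else:
        change = (exit_price - entry) / entry * 100.0
        if outcome.get("direction") == "SHORT":
            change = -change  # a falling price is a profit for a short trade
        # Loose tolerance (assumed) because the stored value may be rounded.
        if abs(change - reported) > 0.1:
            problems.append("profit_loss_pct does not match entry/exit prices")

    return problems
```

With the sample values above, (2460.75 - 2400.50) / 2400.50 is about 2.51%, which is within the tolerance of the stored 2.5.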

🎓 Model Training

Available Models

The system integrates with your existing models:

  • StandardizedCNN - CNN model for pattern recognition
  • DQN - Deep Q-Network for reinforcement learning
  • Transformer - Transformer model for sequence analysis
  • COB - Order book-based RL model

Training Process

Step 1: Create Annotations

  1. Mark profitable trades on historical data
  2. Test cases are auto-generated and saved
  3. Verify test cases exist in ANNOTATE/data/test_cases/

Step 2: Select Model

  1. Open training panel (right sidebar)
  2. Select model from dropdown
  3. Available models are loaded from orchestrator

Step 3: Start Training

  1. Click "Train Model" button
  2. System loads all test cases from disk
  3. Training starts in background thread
  4. Progress displayed in real-time

Step 4: Monitor Progress

  • Current Epoch - Shows training progress
  • Loss - Training loss value
  • Status - Running/Completed/Failed
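
The background thread and the three progress fields above can be pictured with the sketch below. It is an illustration only, not the actual ANNOTATE code: `start_training`, the `train_one_epoch` callback, and the shape of the `progress` dictionary are all hypothetical names chosen for this example.

```python
import threading
from typing import Callable, Dict, Tuple


def start_training(
    train_one_epoch: Callable[[int], float],
    total_epochs: int = 10,
) -> Tuple[threading.Thread, Dict[str, object]]:
    """Run training in a background thread and expose progress for polling."""
    progress: Dict[str, object] = {"status": "running", "epoch": 0, "loss": None, "error": None}
    lock = threading.Lock()

    def worker() -> None:
        try:
            for epoch in range(1, total_epochs + 1):
                loss = train_one_epoch(epoch)  # hypothetical per-epoch callback
                with lock:
                    progress["epoch"] = epoch
                    progress["loss"] = loss
            with lock:
                progress["status"] = "completed"
        except Exception as exc:  # any failure is reported as a status, not raised
            with lock:
                progress["status"] = "failed"
                progress["error"] = str(exc)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread, progress
```

A UI could poll `progress` on a timer to render the current epoch, loss, and status while the thread keeps running.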

Training Details

What Happens During Training:

  1. System loads all test cases from ANNOTATE/data/test_cases/
  2. Prepares training data (market state → expected outcome)
  3. Calls model's training method
  4. Updates model weights based on annotations
  5. Saves updated model checkpoint
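
Steps 1 and 2 can be sketched as follows. This is a simplified illustration under stated assumptions: the helper names are hypothetical, the files on disk are assumed to be valid JSON, and the choice of closing prices as features with a 1/0 label for BUY/SELL is made up for the example. Each model's real training method decides how it encodes the market state.

```python
import json
from pathlib import Path
from typing import Dict, List, Tuple

TIMEFRAMES = ("ohlcv_1s", "ohlcv_1m", "ohlcv_1h", "ohlcv_1d")
ACTION_LABELS = {"BUY": 1, "SELL": 0}  # assumed label encoding


def load_test_cases(directory: str) -> List[dict]:
    """Read every annotation_<id>.json file in the directory, in name order."""
    paths = sorted(Path(directory).glob("annotation_*.json"))
    return [json.loads(path.read_text(encoding="utf-8")) for path in paths]


def to_training_pair(case: dict) -> Tuple[Dict[str, List[float]], int]:
    """Map one test case to (features, label): market state -> expected action."""
    features = {tf: case["market_state"][tf]["close"] for tf in TIMEFRAMES}
    action = case["action"]
    if action not in ACTION_LABELS:
        raise ValueError(f"Unexpected action: {action!r}")
    return features, ACTION_LABELS[action]
```

Calling `load_test_cases("ANNOTATE/data/test_cases")` and mapping `to_training_pair` over the result yields the list of pairs a training method would consume.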

Training Parameters:

  • Epochs: 10 (configurable)
  • Batch Size: Depends on model
  • Learning Rate: Model-specific
  • Data: All available test cases

Real-Time Inference

Overview

Real-time inference mode runs your trained model on the DataProvider's live data stream and generates a new prediction every second.

Starting Real-Time Inference

Step 1: Select Model

Choose the model you want to run inference with.

Step 2: Start Inference

  1. Click "Start Live Inference" button
  2. System loads model from orchestrator
  3. Connects to DataProvider's live data stream
  4. Begins generating predictions every second

Step 3: Monitor Signals

  • Latest Signal - BUY/SELL/HOLD
  • Confidence - Model confidence (0-100%)
  • Price - Current market price
  • Timestamp - When signal was generated

How It Works

DataProvider (Live Data)
    ↓
Latest Market State (4 timeframes)
    ↓
Model Inference
    ↓
Prediction (Action + Confidence)
    ↓
Display on UI + Chart Markers

Signal Display

  • Signals appear in training panel
  • Latest 50 signals stored
  • Chart display of signals is planned (future feature)
  • Updates every second
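
The loop behind these signals might look like the sketch below. It is not the actual implementation: `run_inference_loop`, `get_market_state`, and `predict` are hypothetical stand-ins for the DataProvider and model calls, and the state is assumed to carry a `price` field. A `deque` with `maxlen=50` keeps only the latest 50 signals, and a `threading.Event` lets a Stop button end the loop.

```python
import threading
from collections import deque
from datetime import datetime, timezone
from typing import Callable, Deque, Dict, Tuple

MAX_SIGNALS = 50  # the guide states that the latest 50 signals are stored


def run_inference_loop(
    get_market_state: Callable[[], Dict[str, float]],
    predict: Callable[[Dict[str, float]], Tuple[str, float]],
    signals: Deque[dict],
    stop_event: threading.Event,
    interval_seconds: float = 1.0,
) -> None:
    """Append one signal per interval until stop_event is set."""
    while not stop_event.is_set():
        state = get_market_state()           # hypothetical DataProvider call
        action, confidence = predict(state)  # hypothetical model call
        signals.append({
            "action": action,                # BUY / SELL / HOLD
            "confidence": confidence,        # 0-100 (%)
            "price": state["price"],
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })
        # wait() returns early if stop_event is set, so stopping is prompt.
        stop_event.wait(interval_seconds)


signals: Deque[dict] = deque(maxlen=MAX_SIGNALS)
```

Because the deque is bounded, older signals fall off automatically and the signals collected so far remain readable after the loop stops.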

Stopping Inference

  1. Click "Stop Inference" button
  2. Inference loop terminates
  3. Final signals remain visible

🔧 Integration with Orchestrator

Model Loading

Models are loaded directly from the orchestrator:

# CNN Model
model = orchestrator.cnn_model

# DQN Agent
model = orchestrator.rl_agent

# Transformer
model = orchestrator.primary_transformer

# COB RL
model = orchestrator.cob_rl_agent
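
A name-based lookup is one way to tie the dropdown selection to these attributes. The sketch below is an assumption-laden illustration: the attribute names come from the snippet above, but the mapping from display names to attributes and the `get_model` helper are hypothetical.

```python
MODEL_ATTRIBUTES = {
    # display name -> orchestrator attribute (mapping assumed for illustration)
    "StandardizedCNN": "cnn_model",
    "DQN": "rl_agent",
    "Transformer": "primary_transformer",
    "COB": "cob_rl_agent",
}


def get_model(orchestrator, model_name: str):
    """Return the model object for a display name, or raise a clear error."""
    try:
        attribute = MODEL_ATTRIBUTES[model_name]
    except KeyError:
        raise ValueError(f"Unknown model: {model_name!r}") from None
    model = getattr(orchestrator, attribute, None)
    if model is None:
        # The attribute is absent or unset, so the model was never loaded.
        raise RuntimeError(f"{model_name} is not loaded in the orchestrator")
    return model
```

Failing early with a named error makes "model not loaded" problems easier to tell apart from training failures.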

Data Consistency

  • Uses the same DataProvider as the main system
  • Reads from the same cached data
  • Uses the same data structures
  • As a result, training and inference see the same data as the main system

📊 Training Workflow Example

Scenario: Train CNN on Breakout Patterns

Step 1: Annotate Trades

1. Find 10 clear breakout patterns
2. Mark entry/exit for each
3. Test cases auto-generated
4. Result: 10 test cases in ANNOTATE/data/test_cases/

Step 2: Train Model

1. Select "StandardizedCNN" from dropdown
2. Click "Train Model"
3. System loads 10 test cases
4. Training runs for 10 epochs
5. Model learns breakout patterns

Step 3: Test with Real-Time Inference

1. Click "Start Live Inference"
2. Model analyzes live data
3. Generates BUY signals on breakouts
4. Monitor confidence levels
5. Verify model learned correctly

🎯 Best Practices

For Training

1. Quality Over Quantity

  • Start with 10-20 high-quality annotations
  • Focus on clear, obvious patterns
  • Verify each annotation is correct

2. Diverse Scenarios

  • Include different market conditions
  • Mix LONG and SHORT trades
  • Various timeframes and volatility levels

3. Incremental Training

  • Train with small batches first
  • Verify model learns correctly
  • Add more annotations gradually

4. Test After Training

  • Use real-time inference to verify
  • Check if model recognizes patterns
  • Adjust annotations if needed

For Real-Time Inference

1. Monitor Confidence

  • High confidence (>70%) = Strong signal
  • Medium confidence (50-70%) = Moderate signal
  • Low confidence (<50%) = Weak signal
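
The three bands above translate directly into a small helper. The function name `signal_strength` is hypothetical. Values of exactly 50% and 70% are placed in the medium band, following the 50-70% range given above.

```python
def signal_strength(confidence_pct: float) -> str:
    """Classify a confidence value (0-100%) into the bands used in this guide."""
    if not 0.0 <= confidence_pct <= 100.0:
        raise ValueError("confidence_pct must be between 0 and 100")
    if confidence_pct > 70.0:
        return "strong"
    if confidence_pct >= 50.0:
        return "moderate"
    return "weak"
```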

2. Verify Against Charts

  • Check if signals make sense
  • Compare with your own analysis
  • Look for false positives

3. Track Performance

  • Note which signals were correct
  • Identify patterns in errors
  • Use insights to improve annotations

🔍 Troubleshooting

Training Issues

Issue: "No test cases found"

  • Solution: Create annotations first; test cases are auto-generated when an annotation is saved

Issue: Training fails immediately

  • Solution: Check that the model is loaded in the orchestrator and verify the test case format

Issue: Loss not decreasing

  • Solution: Add more or higher-quality annotations and check the data quality

Inference Issues

Issue: No signals generated

  • Solution: Verify that the DataProvider has live data and check that the model is loaded

Issue: All signals are HOLD

  • Solution: The model may need more training; check the confidence levels

Issue: Signals don't match expectations

  • Solution: Review the training data; different annotations may be needed

📈 Performance Metrics

Training Metrics

  • Loss - Lower is better (target: <0.1)
  • Accuracy - Higher is better (target: >80%)
  • Epochs - Number of passes over the test cases; more epochs can lower the loss but risk overfitting a small annotation set
  • Duration - Training time in seconds

Inference Metrics

  • Latency - Time to generate prediction (~1s)
  • Confidence - Model certainty (0-100%)
  • Signal Rate - Predictions per minute
  • Accuracy - Correct predictions vs total
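
Signal rate and accuracy are simple ratios. The helpers below are a hypothetical sketch of how they could be computed from counts you track yourself; they are not functions provided by ANNOTATE.

```python
def signal_rate_per_minute(signal_count: int, window_seconds: float) -> float:
    """Predictions per minute over an observation window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return signal_count * 60.0 / window_seconds


def accuracy(correct: int, total: int) -> float:
    """Fraction of correct predictions; 0.0 when nothing has been evaluated."""
    return correct / total if total else 0.0
```

At one prediction per second, 50 signals over 50 seconds corresponds to a rate of 60 per minute.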

Advanced Usage

Custom Training Parameters

Edit ANNOTATE/core/training_simulator.py:

'total_epochs': 10,  # Increase for more training

Model-Specific Training

Each model type has its own training method:

  • _train_cnn() - For CNN models
  • _train_dqn() - For DQN agents
  • _train_transformer() - For Transformers
  • _train_cob() - For COB models

Batch Training

Train on specific annotations:

# In future: Select specific annotations for training
annotation_ids = ['id1', 'id2', 'id3']
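
A sketch of how such a selection could map to files on disk, assuming the annotation_<id>.json naming used in this guide. The helper `select_test_case_files` is hypothetical and not part of the current system.

```python
from pathlib import Path
from typing import Iterable, List


def select_test_case_files(directory: str, annotation_ids: Iterable[str]) -> List[Path]:
    """Return the test case files whose IDs are in annotation_ids, in name order."""
    wanted = {f"annotation_{annotation_id}.json" for annotation_id in annotation_ids}
    return sorted(
        path for path in Path(directory).glob("annotation_*.json")
        if path.name in wanted
    )
```

IDs without a matching file are skipped silently here; a stricter version could report them instead.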

📝 File Locations

Test Cases

ANNOTATE/data/test_cases/annotation_<id>.json

Training Results

ANNOTATE/data/training_results/

Model Checkpoints

models/checkpoints/  (main system)

🎊 Summary

The ANNOTATE system provides:

  • Automatic Test Case Generation - From annotations
  • Production-Ready Training - Integrates with orchestrator
  • Real-Time Inference - Live predictions on streaming data
  • Data Consistency - Same data as main system
  • Easy Monitoring - Real-time progress and signals

You can now:

  1. Mark profitable trades
  2. Generate training data automatically
  3. Train models with your annotations
  4. Test models with real-time inference
  5. Monitor model performance live

Happy Training!