popov/gogo2


Dobromir Popov 002d0f7858 save/load data anotations

2025-10-18 23:44:02 +03:00

8.6 KiB


ANNOTATE - Model Training & Inference Guide

🎯 Overview

This guide covers how to use the ANNOTATE system for:

Generating Training Data - From manual annotations
Training Models - Using annotated test cases
Real-Time Inference - Live model predictions with streaming data

📦 Test Case Generation

Automatic Generation

When you save an annotation, a test case is automatically generated and saved to disk.

Location: ANNOTATE/data/test_cases/annotation_<id>.json

What's Included

Each test case contains:

✅ Market State - OHLCV data for all 4 timeframes (1s, 1m, 1h, 1d; 100 candles each)
✅ Entry/Exit Prices - Exact prices from annotation
✅ Expected Outcome - Direction (LONG/SHORT) and P&L percentage
✅ Timestamp - When the trade occurred
✅ Action - BUY or SELL signal

Test Case Format

{
  "test_case_id": "annotation_uuid",
  "symbol": "ETH/USDT",
  "timestamp": "2024-01-15T10:30:00Z",
  "action": "BUY",
  "market_state": {
    "ohlcv_1s": {
      "timestamps": [...],  // 100 candles
      "open": [...],
      "high": [...],
      "low": [...],
      "close": [...],
      "volume": [...]
    },
    "ohlcv_1m": {...},  // 100 candles
    "ohlcv_1h": {...},  // 100 candles
    "ohlcv_1d": {...}   // 100 candles
  },
  "expected_outcome": {
    "direction": "LONG",
    "profit_loss_pct": 2.5,
    "entry_price": 2400.50,
    "exit_price": 2460.75,
    "holding_period_seconds": 300
  }
}
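A quick sanity check can catch malformed test cases before training. The sketch below assumes the field names shown in the format above; the helper names `pnl_pct` and `check_test_case` and the 0.1 tolerance are hypothetical and not part of ANNOTATE.

```python
import json

# Timeframe keys taken from the format above.
REQUIRED_TIMEFRAMES = ("ohlcv_1s", "ohlcv_1m", "ohlcv_1h", "ohlcv_1d")


def pnl_pct(direction, entry_price, exit_price):
    """Return the percentage P&L for a LONG or SHORT trade."""
    change = (exit_price - entry_price) / entry_price * 100.0
    return change if direction == "LONG" else -change


def check_test_case(case, tolerance=0.1):
    """Return a list of problems; an empty list means the case looks sane."""
    problems = []
    market_state = case.get("market_state", {})
    for timeframe in REQUIRED_TIMEFRAMES:
        if timeframe not in market_state:
            problems.append(f"missing timeframe {timeframe}")
    outcome = case["expected_outcome"]
    computed = pnl_pct(outcome["direction"], outcome["entry_price"], outcome["exit_price"])
    if abs(computed - outcome["profit_loss_pct"]) > tolerance:
        problems.append(
            f"profit_loss_pct {outcome['profit_loss_pct']} differs from computed {computed:.2f}"
        )
    return problems


# Trimmed copy of the example above (candle arrays left empty for brevity).
raw = """{
  "test_case_id": "annotation_uuid", "symbol": "ETH/USDT", "action": "BUY",
  "market_state": {"ohlcv_1s": {}, "ohlcv_1m": {}, "ohlcv_1h": {}, "ohlcv_1d": {}},
  "expected_outcome": {"direction": "LONG", "profit_loss_pct": 2.5,
                       "entry_price": 2400.50, "exit_price": 2460.75,
                       "holding_period_seconds": 300}
}"""
case = json.loads(raw)
problems = check_test_case(case)
print(problems)
```

The tolerance exists because the stored percentage is rounded; adjust it to whatever rounding your annotations use.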

🎓 Model Training

Available Models

The system integrates with your existing models:

StandardizedCNN - CNN model for pattern recognition
DQN - Deep Q-Network for reinforcement learning
Transformer - Transformer model for sequence analysis
COB - Order book-based RL model

Training Process

Step 1: Create Annotations

Mark profitable trades on historical data
Test cases are auto-generated and saved
Verify test cases exist in ANNOTATE/data/test_cases/

Step 2: Select Model

Open training panel (right sidebar)
Select model from dropdown
Available models are loaded from orchestrator

Step 3: Start Training

Click "Train Model" button
System loads all test cases from disk
Training starts in background thread
Progress displayed in real-time

Step 4: Monitor Progress

Current Epoch - Shows training progress
Loss - Training loss value
Status - Running/Completed/Failed
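The background-thread pattern behind these fields can be sketched as follows. This is an illustration only: `TrainingProgress`, `train_in_background`, and the placeholder loss are hypothetical names and values, not the actual ANNOTATE implementation.

```python
import threading


class TrainingProgress:
    """Thread-safe holder for the fields shown in the training panel."""

    def __init__(self, total_epochs):
        self._lock = threading.Lock()
        self.total_epochs = total_epochs
        self.current_epoch = 0
        self.loss = None
        self.status = "Running"

    def update(self, **fields):
        # Writers take the lock so readers never see a half-applied update.
        with self._lock:
            for name, value in fields.items():
                setattr(self, name, value)

    def snapshot(self):
        with self._lock:
            return {
                "current_epoch": self.current_epoch,
                "loss": self.loss,
                "status": self.status,
            }


def train_in_background(progress, train_one_epoch):
    """Call train_one_epoch(epoch) once per epoch on a worker thread."""

    def worker():
        try:
            for epoch in range(1, progress.total_epochs + 1):
                progress.update(current_epoch=epoch, loss=train_one_epoch(epoch))
            progress.update(status="Completed")
        except Exception:
            progress.update(status="Failed")

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


progress = TrainingProgress(total_epochs=10)
# The lambda stands in for a real training step and returns a fake loss.
thread = train_in_background(progress, lambda epoch: 1.0 / epoch)
thread.join()
final = progress.snapshot()
print(final["status"], final["current_epoch"])
```

A UI would poll `snapshot()` periodically instead of calling `join()`.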

Training Details

What Happens During Training:

System loads all test cases from ANNOTATE/data/test_cases/
Prepares training data (market state → expected outcome)
Calls model's training method
Updates model weights based on annotations
Saves updated model checkpoint
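The first two steps might look roughly like the sketch below. The file pattern follows the location given earlier; the `BUY`/`SELL` label encoding and the choice of close prices as features are assumptions made for illustration, and each real model is likely to prepare its inputs differently.

```python
import json
import tempfile
from pathlib import Path

# Assumed label encoding; not taken from the project.
ACTION_LABELS = {"BUY": 0, "SELL": 1}


def load_test_cases(directory):
    """Read every annotation_*.json file in the directory, sorted by name."""
    paths = sorted(Path(directory).glob("annotation_*.json"))
    return [json.loads(path.read_text(encoding="utf-8")) for path in paths]


def to_sample(case):
    """Map one test case to (close prices per timeframe, integer label)."""
    closes = {
        timeframe: data.get("close", [])
        for timeframe, data in case["market_state"].items()
    }
    return closes, ACTION_LABELS[case["action"]]


# Demo against a temporary directory instead of ANNOTATE/data/test_cases/.
with tempfile.TemporaryDirectory() as tmp:
    demo = {"action": "BUY", "market_state": {"ohlcv_1m": {"close": [1.0, 2.0]}}}
    (Path(tmp) / "annotation_demo.json").write_text(json.dumps(demo), encoding="utf-8")
    samples = [to_sample(case) for case in load_test_cases(tmp)]

print(samples)
```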

Training Parameters:

Epochs: 10 (configurable)
Batch Size: Depends on model
Learning Rate: Model-specific
Data: All available test cases

🚀 Real-Time Inference

Overview

Real-time inference mode runs your trained model on live streaming data from the DataProvider and generates a new prediction every second.

Starting Real-Time Inference

Step 1: Select Model

Choose the model you want to run inference with.

Step 2: Start Inference

Click "Start Live Inference" button
System loads model from orchestrator
Connects to DataProvider's live data stream
Begins generating predictions every second

Step 3: Monitor Signals

Latest Signal - BUY/SELL/HOLD
Confidence - Model confidence (0-100%)
Price - Current market price
Timestamp - When signal was generated

How It Works

DataProvider (Live Data)
    ↓
Latest Market State (4 timeframes)
    ↓
Model Inference
    ↓
Prediction (Action + Confidence)
    ↓
Display on UI + Chart Markers
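One pass through this pipeline can be sketched as below. The provider and model here are stand-ins: `latest_market_state()` and `predict()` are hypothetical method names chosen for the example, not the real DataProvider or model API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Signal:
    action: str        # BUY, SELL, or HOLD
    confidence: float  # 0.0 to 1.0
    price: float
    timestamp: datetime


class FakeDataProvider:
    """Stand-in for the live data source (hypothetical interface)."""

    def latest_market_state(self):
        return {"price": 2400.5, "ohlcv_1s": {}, "ohlcv_1m": {}, "ohlcv_1h": {}, "ohlcv_1d": {}}


class FakeModel:
    """Stand-in model that always answers HOLD (hypothetical interface)."""

    def predict(self, market_state):
        return "HOLD", 0.55


def infer_once(provider, model):
    """Fetch the latest state, run the model, and wrap the result as a Signal."""
    state = provider.latest_market_state()
    action, confidence = model.predict(state)
    return Signal(action, confidence, state["price"], datetime.now(timezone.utc))


signal = infer_once(FakeDataProvider(), FakeModel())
print(signal.action, signal.confidence, signal.price)
```

The real loop would call something like `infer_once` once per second until inference is stopped.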

Signal Display

Signals appear in training panel
Latest 50 signals stored
Chart markers for signals are planned (future feature)
Updates every second
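Keeping only the latest 50 signals is a natural fit for a bounded queue. The sketch below uses `collections.deque` with `maxlen`; whether ANNOTATE stores signals this way is an assumption.

```python
from collections import deque

MAX_SIGNALS = 50  # matches the "latest 50 signals" limit above

# A deque with maxlen silently drops the oldest entry once it is full.
signals = deque(maxlen=MAX_SIGNALS)

for sequence_number in range(60):
    signals.append({"action": "HOLD", "seq": sequence_number})

print(len(signals), signals[0]["seq"], signals[-1]["seq"])
```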

Stopping Inference

Click "Stop Inference" button
Inference loop terminates
Final signals remain visible

🔧 Integration with Orchestrator

Model Loading

Models are loaded directly from the orchestrator:

# CNN Model
model = orchestrator.cnn_model

# DQN Agent
model = orchestrator.rl_agent

# Transformer
model = orchestrator.primary_transformer

# COB RL
model = orchestrator.cob_rl_agent
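A lookup helper can turn a model name into the matching orchestrator attribute and fail clearly when the model is not loaded. The attribute names come from the snippet above; the name keys other than "StandardizedCNN" and the `get_model` helper are assumptions, and `SimpleNamespace` stands in for the real orchestrator.

```python
from types import SimpleNamespace

# Model name -> orchestrator attribute (keys are assumed dropdown labels).
MODEL_ATTRIBUTES = {
    "StandardizedCNN": "cnn_model",
    "DQN": "rl_agent",
    "Transformer": "primary_transformer",
    "COB": "cob_rl_agent",
}


def get_model(orchestrator, name):
    """Return the loaded model for name, or raise with a readable message."""
    attribute = MODEL_ATTRIBUTES.get(name)
    if attribute is None:
        raise KeyError(f"unknown model name: {name}")
    model = getattr(orchestrator, attribute, None)
    if model is None:
        raise RuntimeError(f"{name} is not loaded (orchestrator.{attribute} is empty)")
    return model


# Stub orchestrator with only the CNN loaded.
orchestrator = SimpleNamespace(cnn_model="cnn-stub", rl_agent=None)
print(get_model(orchestrator, "StandardizedCNN"))
```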

Data Consistency

Uses same DataProvider as main system
Same cached data
Same data structure
Annotation, training, and inference therefore all read the same data

📊 Training Workflow Example

Scenario: Train CNN on Breakout Patterns

Step 1: Annotate Trades

1. Find 10 clear breakout patterns
2. Mark entry/exit for each
3. Test cases auto-generated
4. Result: 10 test cases in ANNOTATE/data/test_cases/

Step 2: Train Model

1. Select "StandardizedCNN" from dropdown
2. Click "Train Model"
3. System loads 10 test cases
4. Training runs for 10 epochs
5. Model learns breakout patterns

Step 3: Test with Real-Time Inference

1. Click "Start Live Inference"
2. Model analyzes live data
3. Generates BUY signals on breakouts
4. Monitor confidence levels
5. Verify model learned correctly

🎯 Best Practices

For Training

1. Quality Over Quantity

Start with 10-20 high-quality annotations
Focus on clear, obvious patterns
Verify each annotation is correct

2. Diverse Scenarios

Include different market conditions
Mix LONG and SHORT trades
Various timeframes and volatility levels

3. Incremental Training

Train with small batches first
Verify model learns correctly
Add more annotations gradually

4. Test After Training

Use real-time inference to verify
Check if model recognizes patterns
Adjust annotations if needed

For Real-Time Inference

1. Monitor Confidence

High confidence (>70%) = Strong signal
Medium confidence (50-70%) = Moderate signal
Low confidence (<50%) = Weak signal
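These bands translate directly into a small helper. The function name `signal_strength` is hypothetical; the thresholds are the ones listed above, with exactly 50% and 70% falling in the moderate band.

```python
def signal_strength(confidence_pct):
    """Classify a confidence percentage (0-100) into the bands listed above."""
    if not 0.0 <= confidence_pct <= 100.0:
        raise ValueError(f"confidence must be within 0-100, got {confidence_pct}")
    if confidence_pct > 70.0:
        return "strong"
    if confidence_pct >= 50.0:
        return "moderate"
    return "weak"


for value in (85.0, 60.0, 30.0):
    print(value, signal_strength(value))
```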

2. Verify Against Charts

Check if signals make sense
Compare with your own analysis
Look for false positives

3. Track Performance

Note which signals were correct
Identify patterns in errors
Use insights to improve annotations

🔍 Troubleshooting

Training Issues

Issue: "No test cases found"

Solution: Create annotations first; test cases are generated automatically when an annotation is saved.

Issue: Training fails immediately

Solution: Check that the model is loaded in the orchestrator and verify the test case format.

Issue: Loss not decreasing

Solution: Add more or higher-quality annotations and check the quality of the underlying data.

Inference Issues

Issue: No signals generated

Solution: Verify that the DataProvider has live data and check that the model is loaded.

Issue: All signals are HOLD

Solution: The model may need more training; also check the confidence levels.

Issue: Signals don't match expectations

Solution: Review the training data; different or additional annotations may be needed.

📈 Performance Metrics

Training Metrics

Loss - Lower is better (target: <0.1)
Accuracy - Higher is better (target: >80%)
Epochs - Number of passes over the test cases; more epochs mean more training, but can overfit a small annotation set
Duration - Training time in seconds

Inference Metrics

Latency - Time to generate prediction (~1s)
Confidence - Model certainty (0-100%)
Signal Rate - Predictions per minute
Accuracy - Correct predictions vs total
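Signal rate and accuracy are simple to compute from a log of signals. The sketch below assumes you keep a list of signal timestamps and a list of correct/incorrect flags yourself; both helper names are hypothetical.

```python
from datetime import datetime, timedelta


def signal_rate_per_minute(timestamps):
    """Average number of signals per minute across the observed time span."""
    if len(timestamps) < 2:
        return 0.0
    span_seconds = (max(timestamps) - min(timestamps)).total_seconds()
    if span_seconds == 0:
        return 0.0
    # N timestamps cover N - 1 intervals.
    return (len(timestamps) - 1) / span_seconds * 60.0


def accuracy(outcomes):
    """Fraction of signals judged correct; outcomes is a list of booleans."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


start = datetime(2024, 1, 15, 10, 30)
stamps = [start + timedelta(seconds=i) for i in range(61)]  # one signal per second
rate = signal_rate_per_minute(stamps)
share_correct = accuracy([True, False, True, True])
print(rate, share_correct)
```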

🚀 Advanced Usage

Custom Training Parameters

Edit ANNOTATE/core/training_simulator.py:

'total_epochs': 10,  # Increase for more training

Model-Specific Training

Each model type has its own training method:

_train_cnn() - For CNN models
_train_dqn() - For DQN agents
_train_transformer() - For Transformers
_train_cob() - For COB models
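Routing a selected model to its training method is typically a dictionary dispatch. In the sketch below the method names match the list above, but the class `TrainerSketch`, the method bodies, and the model-name keys are placeholders, not the code in training_simulator.py.

```python
class TrainerSketch:
    """Illustrative dispatcher; the method bodies are placeholders."""

    def _train_cnn(self, test_cases):
        return f"cnn: {len(test_cases)} test cases"

    def _train_dqn(self, test_cases):
        return f"dqn: {len(test_cases)} test cases"

    def _train_transformer(self, test_cases):
        return f"transformer: {len(test_cases)} test cases"

    def _train_cob(self, test_cases):
        return f"cob: {len(test_cases)} test cases"

    def train(self, model_name, test_cases):
        # Keys are assumed model names; values are the bound training methods.
        handlers = {
            "StandardizedCNN": self._train_cnn,
            "DQN": self._train_dqn,
            "Transformer": self._train_transformer,
            "COB": self._train_cob,
        }
        if model_name not in handlers:
            raise ValueError(f"no training method for model: {model_name}")
        return handlers[model_name](test_cases)


result = TrainerSketch().train("StandardizedCNN", [{}, {}, {}])
print(result)
```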

Batch Training

Train on specific annotations:

# In future: Select specific annotations for training
annotation_ids = ['id1', 'id2', 'id3']
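Since this feature is not implemented yet, the following is only a sketch of how such a filter could work. It assumes each test case's `test_case_id` has the form `annotation_<id>`, mirroring the file naming; `select_test_cases` is a hypothetical helper.

```python
def select_test_cases(test_cases, annotation_ids):
    """Keep only the test cases whose test_case_id matches a requested annotation."""
    wanted = {f"annotation_{annotation_id}" for annotation_id in annotation_ids}
    return [case for case in test_cases if case.get("test_case_id") in wanted]


# Five fake test cases with IDs annotation_id1 .. annotation_id5.
all_cases = [{"test_case_id": f"annotation_id{i}"} for i in range(1, 6)]
annotation_ids = ["id1", "id2", "id3"]
selected = select_test_cases(all_cases, annotation_ids)
print([case["test_case_id"] for case in selected])
```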

📝 File Locations

Test Cases

ANNOTATE/data/test_cases/annotation_<id>.json

Training Results

ANNOTATE/data/training_results/

Model Checkpoints

models/checkpoints/  (main system)

🎊 Summary

The ANNOTATE system provides:

✅ Automatic Test Case Generation - From annotations
✅ Production-Ready Training - Integrates with orchestrator
✅ Real-Time Inference - Live predictions on streaming data
✅ Data Consistency - Same data as main system
✅ Easy Monitoring - Real-time progress and signals

You can now:

Mark profitable trades
Generate training data automatically
Train models with your annotations
Test models with real-time inference
Monitor model performance live

Happy Training! 🚀
