training normalization fix
186
docs/main/_MODEL_INPUT_OUTPUT_STRUCTURE.md
Normal file
@@ -0,0 +1,186 @@
# Transformer Model Input/Output Structure

## FIXED ISSUE: Batch Data Deletion Bug
**Problem**: Training failed after epoch 1 with "At least one timeframe must be provided".
**Root Cause**: Batch tensors were deleted after each use inside the training loop, but the same batch dictionaries were reused across all epochs, so from epoch 2 onward the model received batches with no timeframe data.
**Solution**: Removed batch deletion from inside the epoch loop and moved cleanup to after all epochs complete.

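The failure mode described above can be reproduced in a few lines. The following is a minimal sketch, not the project's training loop: `run_epochs` and `make_batches` are hypothetical names, plain lists stand in for tensors so the example runs without PyTorch, and the key-prefix check is only a stand-in for the model's own input validation.

```python
def run_epochs(batches, num_epochs, delete_inside_loop):
    """Simulate a training loop over batch dicts that are reused every epoch."""
    steps = 0
    for _ in range(num_epochs):
        for batch in batches:
            # Stand-in for the model's input check (assumption, not the real code).
            if not any(key.startswith("price_data_") for key in batch):
                raise ValueError("At least one timeframe must be provided")
            steps += 1
            if delete_inside_loop:
                # Buggy variant: empties a dict that the next epoch still needs.
                batch.clear()
    # Fixed variant: cleanup happens once, after all epochs have finished.
    for batch in batches:
        batch.clear()
    return steps


def make_batches():
    # One OHLCV row per batch; plain lists stand in for tensors.
    return [{"price_data_1m": [[1.0, 2.0, 0.5, 1.5, 10.0]], "actions": [0]} for _ in range(3)]
```

With `delete_inside_loop=False`, two epochs over three batches complete all six steps. With `delete_inside_loop=True`, epoch 1 completes and the first step of epoch 2 raises the `ValueError`, which matches the reported symptom.
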
## Current Model Architecture

### INPUT Structure (Multi-Timeframe)

The model accepts the following inputs in the `forward()` method:

```python
forward(
    # Price data for different timeframes - [batch, seq_len, 5] OHLCV
    price_data_1s=None,    # 1-second timeframe
    price_data_1m=None,    # 1-minute timeframe
    price_data_1h=None,    # 1-hour timeframe
    price_data_1d=None,    # 1-day timeframe

    # Reference data
    btc_data_1m=None,      # BTC reference - [batch, seq_len, 5]

    # Additional features
    cob_data=None,         # COB orderbook data - [batch, seq_len, 100]
    tech_data=None,        # Technical indicators - [batch, 40]
    market_data=None,      # Market context (pivots, volume) - [batch, 30]
    position_state=None,   # Current position state - [batch, 5]

    # Legacy support
    price_data=None        # Fallback to single timeframe
)
```

**At least one timeframe** (`price_data_1s`, `price_data_1m`, `price_data_1h`, or `price_data_1d`) must be provided, otherwise the model raises:
```
ValueError: At least one timeframe must be provided
```

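The shape comments in the `forward()` listing can be turned into a quick pre-flight check. This is a sketch under stated assumptions: `validate_input_shapes` is a hypothetical helper that is not part of the model code, it works on plain shape tuples rather than tensors, and it ignores the legacy `price_data` argument because its expected shape is not documented here.

```python
# Expected (rank, last dimension) per input, taken from the shape comments
# in the forward() listing.
EXPECTED_SHAPES = {
    "price_data_1s": (3, 5),
    "price_data_1m": (3, 5),
    "price_data_1h": (3, 5),
    "price_data_1d": (3, 5),
    "btc_data_1m": (3, 5),
    "cob_data": (3, 100),
    "tech_data": (2, 40),
    "market_data": (2, 30),
    "position_state": (2, 5),
}
TIMEFRAME_KEYS = ("price_data_1s", "price_data_1m", "price_data_1h", "price_data_1d")


def validate_input_shapes(shapes):
    """Check a {name: shape tuple} mapping and return a list of problems."""
    if not any(shapes.get(key) is not None for key in TIMEFRAME_KEYS):
        raise ValueError("At least one timeframe must be provided")
    problems = []
    for name, shape in shapes.items():
        if shape is None or name not in EXPECTED_SHAPES:
            continue  # omitted inputs and unknown keys are ignored in this sketch
        rank, last_dim = EXPECTED_SHAPES[name]
        if len(shape) != rank or shape[-1] != last_dim:
            problems.append(
                f"{name}: got {tuple(shape)}, expected rank {rank} with last dim {last_dim}"
            )
    return problems
```

With real tensors, the mapping could be built as `{name: tuple(t.shape) for name, t in inputs.items()}`.
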
### OUTPUT Structure

The model returns a dictionary with the following predictions:

```python
outputs = {
    # PRIMARY OUTPUTS (trained with loss):
    'action_logits': tensor,      # [batch, 3] - BUY/SELL/HOLD logits
    'action_probs': tensor,       # [batch, 3] - softmax probabilities
    'price_prediction': tensor,   # [batch, 1] - next price change ratio
    'confidence': tensor,         # [batch, 1] - prediction confidence

    # TREND ANALYSIS (trained with loss):
    'trend_analysis': {
        'angle_radians': tensor,  # [batch, 1] - trend angle in radians
        'steepness': tensor,      # [batch, 1] - trend steepness (0-1)
        'direction': tensor       # [batch, 1] - direction (-1/0/+1)
    },

    # NEXT CANDLE PREDICTIONS (evaluated but NOT trained):
    'next_candles': {
        '1s': tensor,  # [batch, 5] - predicted OHLCV for 1s
        '1m': tensor,  # [batch, 5] - predicted OHLCV for 1m
        '1h': tensor,  # [batch, 5] - predicted OHLCV for 1h
        '1d': tensor,  # [batch, 5] - predicted OHLCV for 1d
    },
    'btc_next_candle': tensor,    # [batch, 5] - predicted BTC OHLCV

    # PIVOT PREDICTIONS:
    'next_pivots': {
        'L1': {
            'price': tensor,           # [batch, 1] - pivot price
            'type_prob_high': tensor,  # [batch, 1] - probability of high
            'type_prob_low': tensor,   # [batch, 1] - probability of low
            'pivot_type': tensor,      # [batch, 1] - 0=high, 1=low
            'confidence': tensor       # [batch, 1] - confidence
        },
        # Same structure for L2, L3, L4, L5
    },

    # AUXILIARY OUTPUTS:
    'volatility_prediction': tensor,      # [batch, 1]
    'trend_strength_prediction': tensor,  # [batch, 1]
    'uncertainty_mean': tensor,           # [batch, 1]
    'uncertainty_std': tensor             # [batch, 1]
}
```

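To make the relationship between `action_logits` and `action_probs` concrete, here is a small sketch that applies a softmax to each logit row and picks the most likely action. It assumes that `action_probs` is the softmax of `action_logits` and that the index order is BUY, SELL, HOLD as the comment suggests; both should be checked against the model code. `ACTION_NAMES`, `softmax`, and `decode_actions` are illustrative names, and nested lists stand in for tensors.

```python
import math

# Assumed index order, read off the "BUY/SELL/HOLD" comment; verify against the model.
ACTION_NAMES = ("BUY", "SELL", "HOLD")


def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_actions(action_logits):
    """Map each 3-element logit row to (action name, probability)."""
    decoded = []
    for row in action_logits:
        probs = softmax(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        decoded.append((ACTION_NAMES[best], probs[best]))
    return decoded
```

For a `[batch, 3]` input, `decode_actions` returns one `(name, probability)` pair per batch row.
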
### TRAINING TARGETS (in batch)

```python
batch = {
    # Input features (see INPUT Structure above)
    'price_data_1s': tensor,
    'price_data_1m': tensor,
    'price_data_1h': tensor,
    'price_data_1d': tensor,
    'btc_data_1m': tensor,
    'cob_data': tensor,
    'tech_data': tensor,
    'market_data': tensor,
    'position_state': tensor,

    # Training targets:
    'actions': tensor,        # [batch] - target action (0/1/2)
    'future_prices': tensor,  # [batch, 1] - actual price change ratio
    'trade_success': tensor,  # [batch, 1] - 1.0 if profitable
    'trend_target': tensor,   # [batch, 3] - [angle, steepness, direction]
}
```

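Because a batch mixes model inputs and training targets, a training step has to separate the two. The sketch below shows one way to do that without mutating the batch, which matters given that batches are reused across epochs. `split_batch` is a hypothetical helper, and the key lists are copied from the batch layout documented in this section.

```python
INPUT_KEYS = (
    "price_data_1s", "price_data_1m", "price_data_1h", "price_data_1d",
    "btc_data_1m", "cob_data", "tech_data", "market_data", "position_state",
)
TARGET_KEYS = ("actions", "future_prices", "trade_success", "trend_target")


def split_batch(batch):
    """Return (inputs, targets) as new dicts; the batch itself is left untouched."""
    # Inputs that are missing or None are simply omitted.
    inputs = {k: batch[k] for k in INPUT_KEYS if batch.get(k) is not None}
    targets = {k: batch[k] for k in TARGET_KEYS if k in batch}
    return inputs, targets
```

The `inputs` dict could then be passed as keyword arguments, for example `model(**inputs)`, assuming the model is called with the keyword names listed for `forward()`.
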
### LOSS CALCULATION

Current loss function in `train_step()`:

```python
total_loss = action_loss + 0.1 * price_loss + 0.05 * trend_loss

# where:
# - action_loss: CrossEntropyLoss(action_logits, actions)
# - price_loss:  MSELoss(price_prediction, future_prices)
# - trend_loss:  MSELoss(trend_pred, trend_target)
```

**NOTE**: Next candle predictions are currently only used for accuracy evaluation, NOT trained directly.

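A short worked example shows how the weights shape the total. The numbers are illustrative only, not measured values, and `mse` and `combine_losses` are hypothetical helpers operating on plain floats instead of tensors.

```python
def mse(pred, target):
    """Mean squared error over two equal-length flat lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def combine_losses(action_loss, price_loss, trend_loss):
    """Weighted sum using the weights from the formula: 1.0, 0.1, 0.05."""
    return action_loss + 0.1 * price_loss + 0.05 * trend_loss


# Illustrative numbers only.
price_loss = mse([0.02], [0.00])                      # one price-change ratio
trend_loss = mse([0.5, 0.4, 1.0], [0.3, 0.4, 1.0])    # [angle, steepness, direction]
total = combine_losses(action_loss=1.0, price_loss=price_loss, trend_loss=trend_loss)
```

With these inputs the price and trend terms together contribute less than 0.001 to a total of roughly 1.0007, so the action loss dominates. Whether that balance holds in practice depends on the actual scale of each loss.
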
## CURRENT ISSUES AND RECOMMENDATIONS

### Issue 1: Next Candle Predictions Not Trained
**Status**: The model outputs next candle predictions for each timeframe, but these are NOT included in the loss function.
**Impact**: The model is not explicitly learning to predict next candle OHLCV values.

**Recommendation**: Add next candle loss to training:
```python
import torch.nn.functional as F

# Calculate next candle loss for each available timeframe
candle_loss = 0.0
if 'next_candles' in outputs:
    for tf in ['1s', '1m', '1h', '1d']:
        # Only timeframes with both a prediction and a target contribute
        if tf in outputs['next_candles'] and f'future_candle_{tf}' in batch:
            pred_candle = outputs['next_candles'][tf]     # [batch, 5]
            target_candle = batch[f'future_candle_{tf}']  # [batch, 5]
            candle_loss += F.mse_loss(pred_candle, target_candle)

total_loss = action_loss + 0.1 * price_loss + 0.05 * trend_loss + 0.1 * candle_loss
```

### Issue 2: Annotation Timeframe vs Prediction Timeframe
**Current Behavior**:
- Annotations are created at a specific point in time
- The model receives multiple timeframes (1s, 1m, 1h, 1d) as input
- Predictions are made for ALL timeframes simultaneously
- Only the 1m timeframe prediction is currently evaluated for accuracy

**Question**: Should predictions be specific to the annotation's timeframe?

**Options**:
1. **Multi-timeframe predictions (current)**: Keep predicting all timeframes, add loss for each
2. **Annotation-specific predictions**: Only predict/train on the timeframe that matches the annotation
3. **Weighted predictions**: Weight the loss by the annotation's timeframe (e.g., if annotated on 1m, weight 1m prediction higher)

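Option 3 can be sketched as a per-timeframe weighting scheme. Everything here is hypothetical: the function names, the `boost` parameter, and its default of 2.0 are illustrative choices, not values taken from the codebase.

```python
TIMEFRAMES = ("1s", "1m", "1h", "1d")


def timeframe_weights(annotation_tf, boost=2.0):
    """Weight the annotated timeframe `boost` times higher, normalized to sum to 1."""
    if annotation_tf not in TIMEFRAMES:
        raise ValueError(f"Unknown timeframe: {annotation_tf}")
    raw = {tf: (boost if tf == annotation_tf else 1.0) for tf in TIMEFRAMES}
    total = sum(raw.values())
    return {tf: weight / total for tf, weight in raw.items()}


def weighted_candle_loss(per_tf_losses, annotation_tf, boost=2.0):
    """Combine per-timeframe losses; timeframes without a loss entry are skipped."""
    weights = timeframe_weights(annotation_tf, boost)
    return sum(weights[tf] * loss for tf, loss in per_tf_losses.items())
```

One property of this formulation is that it spans all three options: `boost=1.0` gives equal weights (option 1), and a very large `boost` approaches training on the annotated timeframe only (option 2).
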
### Issue 3: Missing Target Data for Next Candles
**Current**: The batch only contains `future_prices` (next close price change)
**Needed**: To train next candle predictions, we need full OHLCV targets:
- `future_candle_1s`: [batch, 5] - next 1s candle OHLCV
- `future_candle_1m`: [batch, 5] - next 1m candle OHLCV
- `future_candle_1h`: [batch, 5] - next 1h candle OHLCV
- `future_candle_1d`: [batch, 5] - next 1d candle OHLCV

**Location to add**: `ANNOTATE/core/real_training_adapter.py` in `_convert_annotation_to_transformer_batch()`

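As a rough illustration of what building these targets might involve, the sketch below picks the candle that immediately follows the model's input window for each timeframe. It makes no claim about how `_convert_annotation_to_transformer_batch()` is actually structured: `build_future_candle_targets` and both of its parameters are hypothetical, plain lists stand in for tensors, and the batch dimension is left out.

```python
def build_future_candle_targets(candles_by_tf, last_input_index_by_tf):
    """Return {'future_candle_<tf>': [O, H, L, C, V]} for the candle after the input window.

    candles_by_tf: {timeframe: list of [open, high, low, close, volume] rows}
    last_input_index_by_tf: {timeframe: index of the last candle the model sees}
    Timeframes with no following candle are omitted rather than padded.
    """
    targets = {}
    for tf, candles in candles_by_tf.items():
        next_index = last_input_index_by_tf[tf] + 1
        if next_index < len(candles):
            # Copy the row so later edits to the target do not alter the source data.
            targets[f"future_candle_{tf}"] = list(candles[next_index])
    return targets
```

Omitting a timeframe when no future candle exists fits the `f'future_candle_{tf}' in batch` guard in the Issue 1 recommendation, which skips timeframes without a target. The targets would likely also need the same normalization as the input candles before being compared with the predictions.
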
## SUMMARY

✅ **Fixed**: Batch deletion bug causing epoch 2+ failures
✅ **Working**: Model can predict next candles for all timeframes
✅ **Working**: Model can predict trend vector (angle, steepness, direction)
❌ **Missing**: Loss calculation for next candle predictions
❌ **Missing**: Target data (future OHLCV) for next candle training
⚠️ **Unclear**: Should predictions be timeframe-specific or multi-timeframe?

## NEXT STEPS

1. **Add future OHLCV target data** to training batches
2. **Add next candle loss** to the training loop
3. **Clarify prediction strategy**: Single timeframe vs multi-timeframe
4. **Test training** with enhanced loss function