# Transformer Model Input/Output Structure

## FIXED ISSUE: Batch Data Deletion Bug

**Problem**: Training was failing after epoch 1 with "At least one timeframe must be provided"

**Root Cause**: Batch tensors were being deleted after each use in the training loop, but the same batch dictionaries were being reused across all epochs.

**Solution**: Removed batch deletion from inside the epoch loop and moved cleanup to after all epochs complete.

## Current Model Architecture

### INPUT Structure (Multi-Timeframe)

The model accepts the following inputs in the `forward()` method:

```python
forward(
    # Price data for different timeframes - [batch, seq_len, 5] OHLCV
    price_data_1s=None,   # 1-second timeframe
    price_data_1m=None,   # 1-minute timeframe
    price_data_1h=None,   # 1-hour timeframe
    price_data_1d=None,   # 1-day timeframe

    # Reference data
    btc_data_1m=None,     # BTC reference - [batch, seq_len, 5]

    # Additional features
    cob_data=None,        # COB orderbook data - [batch, seq_len, 100]
    tech_data=None,       # Technical indicators - [batch, 40]
    market_data=None,     # Market context (pivots, volume) - [batch, 30]
    position_state=None,  # Current position state - [batch, 5]

    # Legacy support
    price_data=None       # Fallback to single timeframe
)
```

**At least one timeframe** (price_data_1s, 1m, 1h, or 1d) must be provided, otherwise the model raises:

```
ValueError: At least one timeframe must be provided
```

### OUTPUT Structure

The model returns a dictionary with the following predictions:

```python
outputs = {
    # PRIMARY OUTPUTS (trained with loss):
    'action_logits': tensor,      # [batch, 3] - BUY/SELL/HOLD logits
    'action_probs': tensor,       # [batch, 3] - softmax probabilities
    'price_prediction': tensor,   # [batch, 1] - next price change ratio
    'confidence': tensor,         # [batch, 1] - prediction confidence

    # TREND ANALYSIS (trained with loss):
    'trend_analysis': {
        'angle_radians': tensor,  # [batch, 1] - trend angle in radians
        'steepness': tensor,      # [batch, 1] - trend steepness (0-1)
        'direction': tensor       # [batch, 1] - direction (-1/0/+1)
    },

    # NEXT CANDLE PREDICTIONS (evaluated but NOT trained):
    'next_candles': {
        '1s': tensor,  # [batch, 5] - predicted OHLCV for 1s
        '1m': tensor,  # [batch, 5] - predicted OHLCV for 1m
        '1h': tensor,  # [batch, 5] - predicted OHLCV for 1h
        '1d': tensor,  # [batch, 5] - predicted OHLCV for 1d
    },
    'btc_next_candle': tensor,    # [batch, 5] - predicted BTC OHLCV

    # PIVOT PREDICTIONS:
    'next_pivots': {
        'L1': {
            'price': tensor,           # [batch, 1] - pivot price
            'type_prob_high': tensor,  # [batch, 1] - probability of high
            'type_prob_low': tensor,   # [batch, 1] - probability of low
            'pivot_type': tensor,      # [batch, 1] - 0=high, 1=low
            'confidence': tensor       # [batch, 1] - confidence
        },
        # Same structure for L2, L3, L4, L5
    },

    # AUXILIARY OUTPUTS:
    'volatility_prediction': tensor,      # [batch, 1]
    'trend_strength_prediction': tensor,  # [batch, 1]
    'uncertainty_mean': tensor,           # [batch, 1]
    'uncertainty_std': tensor             # [batch, 1]
}
```

### TRAINING TARGETS (in batch)

```python
batch = {
    # Input features (see INPUT Structure above)
    'price_data_1s': tensor,
    'price_data_1m': tensor,
    'price_data_1h': tensor,
    'price_data_1d': tensor,
    'btc_data_1m': tensor,
    'cob_data': tensor,
    'tech_data': tensor,
    'market_data': tensor,
    'position_state': tensor,

    # Training targets:
    'actions': tensor,        # [batch] - target action (0/1/2)
    'future_prices': tensor,  # [batch, 1] - actual price change ratio
    'trade_success': tensor,  # [batch, 1] - 1.0 if profitable
    'trend_target': tensor,   # [batch, 3] - [angle, steepness, direction]
}
```

### LOSS CALCULATION

Current loss function in `train_step()`:

```python
total_loss = action_loss + 0.1 * price_loss + 0.05 * trend_loss

# where:
# - action_loss: CrossEntropyLoss(action_logits, actions)
# - price_loss:  MSELoss(price_prediction, future_prices)
# - trend_loss:  MSELoss(trend_pred, trend_target)
```

**NOTE**: Next candle predictions are currently only used for accuracy evaluation, NOT trained directly.
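
The weighted sum described under LOSS CALCULATION could be computed along the following lines in PyTorch. This is a minimal sketch, not the project's actual `train_step()`: the helper name `compute_total_loss` is hypothetical, and it assumes that `trend_pred` is formed by concatenating the three `trend_analysis` outputs in the same order as `trend_target` (`[angle, steepness, direction]`).

```python
import torch
import torch.nn.functional as F


def compute_total_loss(outputs, batch):
    """Hypothetical helper combining the three documented loss terms."""
    # Classification loss over the 3 action classes; targets are class indices [batch]
    action_loss = F.cross_entropy(outputs['action_logits'], batch['actions'])

    # Regression loss on the next price change ratio, both [batch, 1]
    price_loss = F.mse_loss(outputs['price_prediction'], batch['future_prices'])

    # ASSUMPTION: trend_pred is the three trend outputs concatenated to [batch, 3]
    trend = outputs['trend_analysis']
    trend_pred = torch.cat(
        [trend['angle_radians'], trend['steepness'], trend['direction']], dim=1
    )
    trend_loss = F.mse_loss(trend_pred, batch['trend_target'])

    return action_loss + 0.1 * price_loss + 0.05 * trend_loss


# Usage with random tensors shaped as documented above
batch_size = 4
outputs = {
    'action_logits': torch.randn(batch_size, 3),
    'price_prediction': torch.randn(batch_size, 1),
    'trend_analysis': {
        'angle_radians': torch.randn(batch_size, 1),
        'steepness': torch.rand(batch_size, 1),
        'direction': torch.randn(batch_size, 1),
    },
}
batch = {
    'actions': torch.randint(0, 3, (batch_size,)),
    'future_prices': torch.randn(batch_size, 1),
    'trend_target': torch.randn(batch_size, 3),
}
loss = compute_total_loss(outputs, batch)
```

The result is a scalar tensor, so it can be passed directly to `loss.backward()` once the outputs come from a real model.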

## CURRENT ISSUES AND RECOMMENDATIONS

### Issue 1: Next Candle Predictions Not Trained

**Status**: The model outputs next candle predictions for each timeframe, but these are NOT included in the loss function.

**Impact**: The model is not explicitly learning to predict next candle OHLCV values.

**Recommendation**: Add next candle loss to training:

```python
import torch.nn.functional as F

# Calculate next candle loss for each available timeframe
candle_loss = 0.0
if 'next_candles' in outputs:
    for tf in ['1s', '1m', '1h', '1d']:
        if tf in outputs['next_candles'] and f'future_candle_{tf}' in batch:
            pred_candle = outputs['next_candles'][tf]     # [batch, 5]
            target_candle = batch[f'future_candle_{tf}']  # [batch, 5]
            candle_loss += F.mse_loss(pred_candle, target_candle)

total_loss = action_loss + 0.1 * price_loss + 0.05 * trend_loss + 0.1 * candle_loss
```

### Issue 2: Annotation Timeframe vs Prediction Timeframe

**Current Behavior**:
- Annotations are created at a specific point in time
- The model receives multiple timeframes (1s, 1m, 1h, 1d) as input
- Predictions are made for ALL timeframes simultaneously
- Only the 1m timeframe prediction is currently evaluated for accuracy

**Question**: Should predictions be specific to the annotation's timeframe?

**Options**:
1. **Multi-timeframe predictions (current)**: Keep predicting all timeframes, add loss for each
2. **Annotation-specific predictions**: Only predict/train on the timeframe that matches the annotation
3. **Weighted predictions**: Weight the loss by the annotation's timeframe (e.g., if annotated on 1m, weight 1m prediction higher)

### Issue 3: Missing Target Data for Next Candles

**Current**: The batch only contains `future_prices` (next close price change)

**Needed**: To train next candle predictions, we need full OHLCV targets:
- `future_candle_1s`: [batch, 5] - next 1s candle OHLCV
- `future_candle_1m`: [batch, 5] - next 1m candle OHLCV
- `future_candle_1h`: [batch, 5] - next 1h candle OHLCV
- `future_candle_1d`: [batch, 5] - next 1d candle OHLCV

**Location to add**: `ANNOTATE/core/real_training_adapter.py` in `_convert_annotation_to_transformer_batch()`

## SUMMARY

- ✅ **Fixed**: Batch deletion bug causing epoch 2+ failures
- ✅ **Working**: Model can predict next candles for all timeframes
- ✅ **Working**: Model can predict trend vector (angle, steepness, direction)
- ❌ **Missing**: Loss calculation for next candle predictions
- ❌ **Missing**: Target data (future OHLCV) for next candle training
- ⚠️ **Unclear**: Should predictions be timeframe-specific or multi-timeframe?

## NEXT STEPS

1. **Add future OHLCV target data** to training batches
2. **Add next candle loss** to the training loop
3. **Clarify prediction strategy**: Single timeframe vs multi-timeframe
4. **Test training** with enhanced loss function
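
For step 1 (the missing `future_candle_*` targets from Issue 3), the lookup of the next candle per timeframe might be sketched as below. Everything here is an assumption for illustration: the helper `build_future_candle_targets`, the `candles_by_tf` mapping, and the `(timestamp, open, high, low, close, volume)` tuple layout are hypothetical and not taken from `real_training_adapter.py`. Normalization and conversion to `[batch, 5]` tensors are left out.

```python
from bisect import bisect_right


def build_future_candle_targets(candles_by_tf, annotation_ts):
    """Hypothetical helper: pick the first candle after the annotation time.

    candles_by_tf maps a timeframe name ('1s', '1m', ...) to a list of
    (timestamp, open, high, low, close, volume) tuples sorted by timestamp.
    Timeframes with no later candle are skipped, so the key is simply absent.
    """
    targets = {}
    for tf, candles in candles_by_tf.items():
        timestamps = [c[0] for c in candles]
        # Index of the first candle whose timestamp is strictly greater
        idx = bisect_right(timestamps, annotation_ts)
        if idx < len(candles):
            targets[f'future_candle_{tf}'] = list(candles[idx][1:6])
    return targets


# Usage: the annotation sits on the 1m candle that opens at t=60
candles_by_tf = {
    '1m': [(60, 100.0, 101.0, 99.5, 100.5, 12.0),
           (120, 100.5, 102.0, 100.0, 101.5, 8.0)],
    '1h': [(0, 99.0, 102.0, 98.0, 101.0, 500.0)],
}
targets = build_future_candle_targets(candles_by_tf, annotation_ts=60)
print(targets)  # {'future_candle_1m': [100.5, 102.0, 100.0, 101.5, 8.0]}
```

Leaving out a timeframe that has no future candle fits the recommended loss code in Issue 1, which only adds a term when `future_candle_{tf}` is present in the batch.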