# Candle TA Features Quick Reference

## Overview

Enhanced technical analysis features for the `OHLCVBar` class, providing candle pattern recognition, relative sizing, and body/wick analysis.

**Location**: `core/data_models.py` - `OHLCVBar` class

---

## Quick Start

```python
from core.data_models import OHLCVBar, BaseDataInput
from datetime import datetime

# Create a candle
bar = OHLCVBar(
    symbol='ETH/USDT',
    timestamp=datetime.now(),
    open=2000.0,
    high=2050.0,
    low=1990.0,
    close=2040.0,
    volume=1000.0,
    timeframe='1m'
)

# Check basic properties
print(f"Bullish: {bar.is_bullish}")           # True
print(f"Body size: {bar.body_size}")          # 40.0
print(f"Pattern: {bar.get_candle_pattern()}") # 'standard'

# Get all TA features
reference_bars = [...]  # Previous 10 candles
ta_features = bar.get_ta_features(reference_bars)
print(f"Features: {len(ta_features)}")        # number of feature keys returned
```

---

## Properties (Computed On-Demand)

### Basic Measurements

| Property | Type | Description | Formula |
|----------|------|-------------|---------|
| `body_size` | float | Absolute size of candle body | `abs(close - open)` |
| `upper_wick` | float | Size of upper shadow | `high - max(open, close)` |
| `lower_wick` | float | Size of lower shadow | `min(open, close) - low` |
| `total_range` | float | Total high-low range | `high - low` |
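
As a worked illustration of the formulas above, the following minimal sketch recomputes the four measurements for the Quick Start candle. The `basic_measurements` helper is hypothetical and only stands in for the `OHLCVBar` properties; it is not the class's implementation.

```python
from typing import Dict


def basic_measurements(open_: float, high: float, low: float, close: float) -> Dict[str, float]:
    """Hypothetical stand-in for the OHLCVBar properties, using the table's formulas."""
    return {
        "body_size": abs(close - open_),
        "upper_wick": high - max(open_, close),
        "lower_wick": min(open_, close) - low,
        "total_range": high - low,
    }


# Quick Start candle: open=2000, high=2050, low=1990, close=2040
m = basic_measurements(2000.0, 2050.0, 1990.0, 2040.0)
print(m)
# Body, upper wick and lower wick together make up the full range.
assert m["body_size"] + m["upper_wick"] + m["lower_wick"] == m["total_range"]
```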

### Candle Type

| Property | Type | Description |
|----------|------|-------------|
| `is_bullish` | bool | True if close > open (hollow/green) |
| `is_bearish` | bool | True if close < open (solid/red) |
| `is_doji` | bool | True if body < 10% of total range |

---

## Methods

### 1. Ratio Calculations

#### `get_body_to_range_ratio() -> float`
Returns body size as a fraction of total range (0.0 to 1.0).

```python
ratio = bar.get_body_to_range_ratio()
# 0.0 = doji (no body)
# 0.5 = body is half the range
# 1.0 = marubozu (all body, no wicks)
```

#### `get_upper_wick_ratio() -> float`
Returns upper wick as a fraction of total range (0.0 to 1.0).

```python
ratio = bar.get_upper_wick_ratio()
# 0.0 = no upper wick
# 0.5 = upper wick is half the range
# 1.0 = all upper wick (only when open == close == low)
```

#### `get_lower_wick_ratio() -> float`
Returns lower wick as a fraction of total range (0.0 to 1.0).

```python
ratio = bar.get_lower_wick_ratio()
# 0.0 = no lower wick
# 0.5 = lower wick is half the range
```
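
Because body, upper wick and lower wick partition the high-low range, the three ratios sum to 1.0 whenever the range is non-zero. The sketch below illustrates this with a hypothetical `candle_ratios` helper; returning zeros for a zero-range candle is an assumption, not necessarily what `OHLCVBar` does.

```python
from typing import Tuple


def candle_ratios(open_: float, high: float, low: float, close: float) -> Tuple[float, float, float]:
    """Return (body, upper wick, lower wick) as fractions of the high-low range.

    Hypothetical helper; the zero-range fallback is an assumption.
    """
    total_range = high - low
    if total_range <= 0:
        return 0.0, 0.0, 0.0
    body = abs(close - open_) / total_range
    upper = (high - max(open_, close)) / total_range
    lower = (min(open_, close) - low) / total_range
    return body, upper, lower


body, upper, lower = candle_ratios(2000.0, 2050.0, 1990.0, 2040.0)
print(f"body={body:.3f} upper={upper:.3f} lower={lower:.3f}")
# The three parts partition the range, so they sum to 1 (up to rounding).
assert abs(body + upper + lower - 1.0) < 1e-9
```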

---

### 2. Relative Sizing

#### `get_relative_size(reference_bars, method='avg') -> float`

Compare current candle size to reference candles.

**Parameters:**
- `reference_bars`: List of previous OHLCVBar objects
- `method`: Comparison method
  - `'avg'`: Compare to average (default)
  - `'max'`: Compare to maximum
  - `'median'`: Compare to median

**Returns:**
- `1.0` = Same size as reference
- `> 1.0` = Larger than reference
- `< 1.0` = Smaller than reference

**Example:**
```python
# Get last 10 candles
recent = ohlcv_list[-10:]
current = ohlcv_list[-1]

# Compare to average
avg_ratio = current.get_relative_size(recent[:-1], 'avg')
if avg_ratio > 2.0:
    print("Current candle is 2x larger than average!")

# Compare to maximum
max_ratio = current.get_relative_size(recent[:-1], 'max')
if max_ratio > 1.0:
    print("Current candle is the largest!")
```
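
The three methods differ only in how the reference candles are aggregated. A minimal sketch of that idea follows, using a hypothetical `relative_size` function on plain numbers. It assumes that "size" means the high-low range and that a neutral 1.0 is returned when there is nothing to compare against; the source does not confirm either choice.

```python
from statistics import mean, median
from typing import List


def relative_size(current_range: float, reference_ranges: List[float], method: str = "avg") -> float:
    """Ratio of the current candle's size to an aggregate of the reference sizes.

    Assumptions (not confirmed by the source): "size" is the high-low range,
    and 1.0 is returned when no usable reference exists.
    """
    aggregators = {"avg": mean, "max": max, "median": median}
    if method not in aggregators:
        raise ValueError(f"unknown method: {method!r}")
    if not reference_ranges:
        return 1.0
    baseline = aggregators[method](reference_ranges)
    return current_range / baseline if baseline > 0 else 1.0


reference = [20.0, 10.0, 30.0, 20.0]  # ranges of earlier candles
print(relative_size(40.0, reference, "avg"))     # 40 / 20
print(relative_size(40.0, reference, "max"))     # 40 / 30
print(relative_size(40.0, reference, "median"))  # 40 / 20
```

Comparing against the maximum is the strictest test: a ratio above 1.0 there means the current candle exceeds every reference candle.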

---

### 3. Pattern Recognition

#### `get_candle_pattern() -> str`

Identify basic candle pattern.

**Patterns Detected:**

| Pattern | Criteria | Interpretation |
|---------|----------|----------------|
| `'doji'` | Body < 10% of range | Indecision, potential reversal |
| `'hammer'` | Small body at top, long lower wick | Bullish reversal signal |
| `'shooting_star'` | Small body at bottom, long upper wick | Bearish reversal signal |
| `'spinning_top'` | Small body, both wicks present | Indecision |
| `'marubozu_bullish'` | Large bullish body (>90% of range) | Strong bullish momentum |
| `'marubozu_bearish'` | Large bearish body (>90% of range) | Strong bearish momentum |
| `'standard'` | Regular candle | Normal price action |

**Example:**
```python
pattern = bar.get_candle_pattern()

if pattern == 'hammer':
    print("Potential bullish reversal!")
elif pattern == 'shooting_star':
    print("Potential bearish reversal!")
elif pattern == 'doji':
    print("Market indecision")
```

**Pattern Criteria Details:**

```python
# Doji
body_ratio < 0.1

# Marubozu (bullish if close > open, bearish if close < open)
body_ratio > 0.9

# Hammer
body_ratio < 0.3 and lower_ratio > 0.6 and upper_ratio < 0.1

# Shooting Star
body_ratio < 0.3 and upper_ratio > 0.6 and lower_ratio < 0.1

# Spinning Top
body_ratio < 0.3 and (upper_ratio + lower_ratio) > 0.6
```
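
The criteria above can be combined into a single classifier. The sketch below is a hypothetical stand-alone function, not the `OHLCVBar` implementation: it assumes the checks are evaluated in the order listed and that a zero-range candle is reported as a doji.

```python
def classify_candle(open_: float, high: float, low: float, close: float) -> str:
    """Classify a candle using the thresholds listed above.

    Hypothetical helper: evaluation order and zero-range fallback are assumptions.
    """
    total_range = high - low
    if total_range <= 0:
        return "doji"  # assumption: a flat candle has no body
    body_ratio = abs(close - open_) / total_range
    upper_ratio = (high - max(open_, close)) / total_range
    lower_ratio = (min(open_, close) - low) / total_range

    if body_ratio < 0.1:
        return "doji"
    if body_ratio > 0.9:
        return "marubozu_bullish" if close > open_ else "marubozu_bearish"
    if body_ratio < 0.3 and lower_ratio > 0.6 and upper_ratio < 0.1:
        return "hammer"
    if body_ratio < 0.3 and upper_ratio > 0.6 and lower_ratio < 0.1:
        return "shooting_star"
    if body_ratio < 0.3 and (upper_ratio + lower_ratio) > 0.6:
        return "spinning_top"
    return "standard"


print(classify_candle(2000.0, 2050.0, 1990.0, 2040.0))  # Quick Start candle
print(classify_candle(108.0, 110.5, 100.0, 110.0))      # small body, long lower wick
print(classify_candle(102.0, 110.0, 99.5, 100.0))       # small body, long upper wick
```

Since the three ratios sum to 1, a body ratio below 0.3 already implies that the wicks cover more than 0.7 of the range. Under these thresholds, any small-bodied candle that is not a doji, hammer or shooting star therefore falls through to `'spinning_top'`.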

---

### 4. Complete TA Feature Set

#### `get_ta_features(reference_bars=None) -> Dict[str, float]`

Get all technical analysis features as a dictionary.

**Parameters:**
- `reference_bars`: Optional list of previous bars for relative sizing

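**Returns:** Dictionary mapping feature names to floats. The categories listed below add up to 21 keys with `reference_bars`, or 18 without the three relative-sizing keys.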

**Feature Categories:**

#### Basic Properties (3 features)
```python
{
    'is_bullish': 1.0 or 0.0,
    'is_bearish': 1.0 or 0.0,
    'is_doji': 1.0 or 0.0,
}
```

#### Size Ratios (3 features)
```python
{
    'body_to_range_ratio': 0.0 to 1.0,
    'upper_wick_ratio': 0.0 to 1.0,
    'lower_wick_ratio': 0.0 to 1.0,
}
```

#### Normalized Sizes (4 features)
```python
{
    'body_size_pct': body_size / close,
    'upper_wick_pct': upper_wick / close,
    'lower_wick_pct': lower_wick / close,
    'total_range_pct': total_range / close,
}
```
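
Dividing by the close price makes candles comparable across price levels. The sketch below, built around a hypothetical `normalized_sizes` helper, shows that two candles with the same shape at different prices yield the same values. Despite the `_pct` suffix, the formulas above produce fractions (0.0196 means 1.96%).

```python
from typing import Dict


def normalized_sizes(open_: float, high: float, low: float, close: float) -> Dict[str, float]:
    """Express candle parts as fractions of the close price (hypothetical helper)."""
    return {
        "body_size_pct": abs(close - open_) / close,
        "upper_wick_pct": (high - max(open_, close)) / close,
        "lower_wick_pct": (min(open_, close) - low) / close,
        "total_range_pct": (high - low) / close,
    }


low_priced = normalized_sizes(2000.0, 2050.0, 1990.0, 2040.0)
high_priced = normalized_sizes(20000.0, 20500.0, 19900.0, 20400.0)  # same shape, 10x price

for name, value in low_priced.items():
    print(f"{name}: {value:.4f}")
    # Same proportions at a different price level give the same normalized value.
    assert abs(value - high_priced[name]) < 1e-12
```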

#### Volume Analysis (1 feature)
```python
{
    'volume_per_range': volume / total_range,
}
```

#### Relative Sizing (3 features - if reference_bars provided)
```python
{
    'relative_size_avg': ratio vs average,
    'relative_size_max': ratio vs maximum,
    'relative_size_median': ratio vs median,
}
```

#### Pattern Encoding (7 features - one-hot)
```python
{
    'pattern_doji': 1.0 or 0.0,
    'pattern_hammer': 1.0 or 0.0,
    'pattern_shooting_star': 1.0 or 0.0,
    'pattern_spinning_top': 1.0 or 0.0,
    'pattern_marubozu_bullish': 1.0 or 0.0,
    'pattern_marubozu_bearish': 1.0 or 0.0,
    'pattern_standard': 1.0 or 0.0,
}
```

**Example:**
```python
# Get complete feature set
reference_bars = ohlcv_list[-11:-1]  # the 10 candles before the current one
current_bar = ohlcv_list[-1]

ta_features = current_bar.get_ta_features(reference_bars)

# Access specific features
if ta_features['pattern_hammer'] == 1.0:
    print("Hammer pattern detected!")

if ta_features['relative_size_avg'] > 2.0:
    print("Unusually large candle!")

if ta_features['body_to_range_ratio'] < 0.1:
    print("Doji-like candle (small body)")
```
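
The seven `pattern_*` keys form a one-hot encoding of the label returned by `get_candle_pattern()`, so exactly one of them is 1.0 for any candle. A minimal sketch of that encoding, with a hypothetical `one_hot_pattern` helper:

```python
from typing import Dict

# Pattern labels as listed in the Pattern Recognition section.
PATTERNS = (
    "doji",
    "hammer",
    "shooting_star",
    "spinning_top",
    "marubozu_bullish",
    "marubozu_bearish",
    "standard",
)


def one_hot_pattern(pattern: str) -> Dict[str, float]:
    """Encode a pattern label as seven 0.0/1.0 features (hypothetical helper)."""
    if pattern not in PATTERNS:
        raise ValueError(f"unknown pattern: {pattern!r}")
    return {f"pattern_{name}": 1.0 if name == pattern else 0.0 for name in PATTERNS}


encoded = one_hot_pattern("hammer")
print(encoded)
# Exactly one of the seven flags is set.
assert sum(encoded.values()) == 1.0
```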

---

## Integration with BaseDataInput

### Standard Mode (7,850 features)

```python
base_data = data_provider.build_base_data_input('ETH/USDT')
features = base_data.get_feature_vector(include_candle_ta=False)
# Returns: 7,850 features (backward compatible)
```

### Enhanced Mode (22,850 features)

```python
base_data = data_provider.build_base_data_input('ETH/USDT')
features = base_data.get_feature_vector(include_candle_ta=True)
# Returns: 22,850 features (includes 10 TA features per candle)
```

**10 TA Features Per Candle:**
1. `is_bullish`
2. `body_to_range_ratio`
3. `upper_wick_ratio`
4. `lower_wick_ratio`
5. `body_size_pct`
6. `total_range_pct`
7. `relative_size_avg`
8. `pattern_doji`
9. `pattern_hammer`
10. `pattern_shooting_star`

**Total Addition:**
- ETH: 300 frames × 4 timeframes × 10 features = 12,000 features
- BTC: 300 frames × 10 features = 3,000 features
- **Total**: 15,000 additional features
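
The arithmetic above can be checked with a few lines. The constant names are illustrative only, and the single BTC timeframe is inferred from the "300 frames × 10 features" line rather than stated explicitly.

```python
BASE_FEATURES = 7_850          # standard mode
FRAMES_PER_SERIES = 300
TA_FEATURES_PER_CANDLE = 10
ETH_TIMEFRAMES = 4
BTC_TIMEFRAMES = 1             # inferred from the BTC line above

eth_extra = FRAMES_PER_SERIES * ETH_TIMEFRAMES * TA_FEATURES_PER_CANDLE
btc_extra = FRAMES_PER_SERIES * BTC_TIMEFRAMES * TA_FEATURES_PER_CANDLE
total_extra = eth_extra + btc_extra
enhanced_total = BASE_FEATURES + total_extra

print(f"ETH: {eth_extra}, BTC: {btc_extra}, extra: {total_extra}, enhanced: {enhanced_total}")
```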

---

## Common Use Cases

### 1. Detect Reversal Patterns

```python
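from typing import List

def scan_for_reversals(ohlcv_list: List[OHLCVBar]) -> List[tuple]:
    """Scan the last 50 candles for potential reversal patterns"""
    reversals = []
    start = max(0, len(ohlcv_list) - 50)  # index of the first scanned candle

    for i, bar in enumerate(ohlcv_list[start:], start=start):
        pattern = bar.get_candle_pattern()

        if pattern in ('hammer', 'shooting_star'):
            # i is the candle's index in the full list, not in the slice
            reversals.append((i, bar.timestamp, pattern, bar.close))

    return reversals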

# Usage
reversals = scan_for_reversals(base_data.ohlcv_1m)
for idx, timestamp, pattern, price in reversals:
    print(f"{timestamp}: {pattern} at ${price:.2f}")
```

### 2. Identify Momentum Candles

```python
from typing import List

def find_momentum_candles(ohlcv_list: List[OHLCVBar],
                          threshold: float = 2.0) -> List[OHLCVBar]:
    """Find unusually large candles indicating momentum"""
    momentum_candles = []

    for i in range(10, len(ohlcv_list)):
        current = ohlcv_list[i]
        reference = ohlcv_list[i-10:i]

        relative_size = current.get_relative_size(reference, 'avg')

        if relative_size > threshold:
            momentum_candles.append(current)

    return momentum_candles

# Usage
momentum = find_momentum_candles(base_data.ohlcv_1m, threshold=2.5)
print(f"Found {len(momentum)} momentum candles")
```

### 3. Analyze Candle Structure

```python
from typing import Any, Dict

def analyze_candle_structure(bar: OHLCVBar) -> Dict[str, Any]:
    """Comprehensive candle analysis"""
    if bar.is_bullish:
        direction = 'bullish'
    elif bar.is_bearish:
        direction = 'bearish'
    else:
        direction = 'neutral'  # close == open

    return {
        'direction': direction,
        'pattern': bar.get_candle_pattern(),
        'body_dominance': bar.get_body_to_range_ratio(),
        'upper_wick_dominance': bar.get_upper_wick_ratio(),
        'lower_wick_dominance': bar.get_lower_wick_ratio(),
        'interpretation': _interpret_structure(bar)
    }

def _interpret_structure(bar: OHLCVBar) -> str:
    """Interpret candle structure"""
    body_ratio = bar.get_body_to_range_ratio()

    if body_ratio > 0.8:
        return "Strong momentum"
    elif body_ratio < 0.2:
        return "Indecision/consolidation"
    elif bar.get_upper_wick_ratio() > 0.5:
        return "Rejection at highs"
    elif bar.get_lower_wick_ratio() > 0.5:
        return "Support at lows"
    else:
        return "Normal price action"

# Usage
current_bar = base_data.ohlcv_1m[-1]
analysis = analyze_candle_structure(current_bar)
print(f"Pattern: {analysis['pattern']}")
print(f"Interpretation: {analysis['interpretation']}")
```

### 4. Build Custom Features

```python
import numpy as np
from typing import List

def extract_custom_candle_features(ohlcv_list: List[OHLCVBar],
                                   window: int = 10) -> np.ndarray:
    """Extract custom candle features for ML model"""
    features = []

    for i in range(window, len(ohlcv_list)):
        current = ohlcv_list[i]
        reference = ohlcv_list[i-window:i]

        # Get TA features
        ta = current.get_ta_features(reference)

        # Custom feature engineering
        features.append([
            ta['is_bullish'],
            ta['body_to_range_ratio'],
            ta['relative_size_avg'],
            ta['pattern_doji'],
            ta['pattern_hammer'],
            ta['pattern_shooting_star'],
            # Add more as needed
        ])

    return np.array(features)

# Usage
custom_features = extract_custom_candle_features(base_data.ohlcv_1m)
print(f"Custom features shape: {custom_features.shape}")
```

---

## Performance Considerations

### Computation Time

| Operation | Time | Notes |
|-----------|------|-------|
| Property access | ~0.001 ms | Very fast |
| `get_candle_pattern()` | ~0.01 ms | Fast |
| `get_ta_features()` | ~0.1 ms | Moderate |
| Full feature vector (1500 candles) | ~150 ms | Can be optimized |
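
The figures above are approximate and will vary with hardware. One rough way to measure them on your own machine is sketched below. `time_per_call_ms` is a hypothetical helper, and `stand_in` is a placeholder workload so the sketch runs on its own.

```python
import time
from typing import Callable


def time_per_call_ms(func: Callable[[], object], repeats: int = 10_000) -> float:
    """Average wall-clock time of one call to `func`, in milliseconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / repeats


# Placeholder workload; with the real class you would pass something like
# `lambda: bar.get_ta_features(reference_bars)` instead.
def stand_in() -> float:
    return abs(2040.0 - 2000.0) / (2050.0 - 1990.0)


per_call = time_per_call_ms(stand_in)
print(f"{per_call:.5f} ms per call, ~{per_call * 1500:.2f} ms for 1500 candles")
```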

### Optimization Tips

#### 1. Cache TA Features in OHLCVBar

```python
# When creating OHLCVBar, pre-compute TA features
bar = OHLCVBar(...)
ta_features = bar.get_ta_features(reference_bars)
bar.indicators.update(ta_features)  # Cache in indicators dict
```

#### 2. Batch Processing

```python
# Process all candles at once
def precompute_ta_features(ohlcv_list: List[OHLCVBar]):
    """Pre-compute TA features for all candles"""
    for i in range(10, len(ohlcv_list)):
        current = ohlcv_list[i]
        reference = ohlcv_list[i-10:i]
        ta = current.get_ta_features(reference)
        current.indicators.update(ta)
```

#### 3. Lazy Evaluation

```python
# Only compute when needed
if model.requires_candle_ta:
    features = base_data.get_feature_vector(include_candle_ta=True)
else:
    features = base_data.get_feature_vector(include_candle_ta=False)
```

---

## Testing

### Unit Tests

```python
from datetime import datetime
from core.data_models import OHLCVBar

def test_candle_properties():
    bar = OHLCVBar('ETH/USDT', datetime.now(), 2000, 2050, 1990, 2040, 1000, '1m')
    assert bar.is_bullish
    assert bar.body_size == 40.0
    assert bar.total_range == 60.0

def test_pattern_recognition():
    # Body 0.5 / range 10 = 0.05, clearly below the 0.1 doji threshold
    doji = OHLCVBar('ETH/USDT', datetime.now(), 2000, 2005, 1995, 2000.5, 100, '1m')
    assert doji.get_candle_pattern() == 'doji'

def test_relative_sizing():
    bars = [OHLCVBar('ETH/USDT', datetime.now(), 2000, 2010, 1990, 2005, 100, '1m') for _ in range(10)]
    large = OHLCVBar('ETH/USDT', datetime.now(), 2000, 2060, 1980, 2055, 100, '1m')
    assert large.get_relative_size(bars, 'avg') > 2.0
```

---

## Troubleshooting

### Issue: TA features all zeros

**Cause**: No reference bars provided to `get_ta_features()`

**Solution**:
```python
# Provide reference bars
reference_bars = ohlcv_list[-11:-1]  # the 10 candles before the current one
ta_features = current_bar.get_ta_features(reference_bars)
```

### Issue: Pattern always 'standard'

**Cause**: Candle doesn't meet specific pattern criteria

**Solution**: Check ratios manually
```python
print(f"Body ratio: {bar.get_body_to_range_ratio()}")
print(f"Upper wick: {bar.get_upper_wick_ratio()}")
print(f"Lower wick: {bar.get_lower_wick_ratio()}")
```
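
To see which threshold a candle misses, each documented criterion can be evaluated on its own. The sketch below uses a hypothetical `explain_pattern_checks` helper that takes the three ratios as plain numbers; with a real bar you would pass the values printed above.

```python
from typing import Dict


def explain_pattern_checks(body_ratio: float, upper_ratio: float, lower_ratio: float) -> Dict[str, bool]:
    """Evaluate each documented criterion independently (hypothetical helper)."""
    return {
        "doji": body_ratio < 0.1,
        "marubozu": body_ratio > 0.9,
        "hammer": body_ratio < 0.3 and lower_ratio > 0.6 and upper_ratio < 0.1,
        "shooting_star": body_ratio < 0.3 and upper_ratio > 0.6 and lower_ratio < 0.1,
        "spinning_top": body_ratio < 0.3 and (upper_ratio + lower_ratio) > 0.6,
    }


# Quick Start candle: body 40/60, upper wick 10/60, lower wick 10/60
checks = explain_pattern_checks(40 / 60, 10 / 60, 10 / 60)
for name, passed in checks.items():
    print(f"{name:14s} {'pass' if passed else 'fail'}")
# No criterion passes, so this candle is reported as 'standard'.
assert not any(checks.values())
```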

### Issue: Slow feature extraction

**Cause**: Computing TA features for many candles

**Solution**: Pre-compute and cache
```python
# Cache in data provider; each bar is compared with its own 10 preceding bars
for i in range(10, len(ohlcv_list)):
    bar = ohlcv_list[i]
    if 'ta_cached' not in bar.indicators:
        ta = bar.get_ta_features(ohlcv_list[i-10:i])
        bar.indicators.update(ta)
        bar.indicators['ta_cached'] = True
```

---

## References

- **Implementation**: `core/data_models.py` - `OHLCVBar` class
- **Usage Guide**: `docs/BASE_DATA_INPUT_USAGE_AUDIT.md`
- **Specification**: `docs/BASE_DATA_INPUT_SPECIFICATION.md`
- **Integration**: `core/standardized_data_provider.py`