gogo2/docs/CANDLE_TA_FEATURES_REFERENCE.md
2025-10-31 00:44:08 +02:00

# Candle TA Features Quick Reference
## Overview
Technical analysis features on the `OHLCVBar` class: candle pattern recognition, relative sizing against recent candles, and body/wick analysis.
**Location**: `core/data_models.py` - `OHLCVBar` class
---
## Quick Start
```python
from core.data_models import OHLCVBar, BaseDataInput
from datetime import datetime
# Create a candle
bar = OHLCVBar(
    symbol='ETH/USDT',
    timestamp=datetime.now(),
    open=2000.0,
    high=2050.0,
    low=1990.0,
    close=2040.0,
    volume=1000.0,
    timeframe='1m'
)
# Check basic properties
print(f"Bullish: {bar.is_bullish}") # True
print(f"Body size: {bar.body_size}") # 40.0
print(f"Pattern: {bar.get_candle_pattern()}") # 'standard'
# Get all TA features
reference_bars = [...] # Previous 10 candles
ta_features = bar.get_ta_features(reference_bars)
print(f"Features: {len(ta_features)}") # 22 features
```
---
## Properties (Computed On-Demand)
### Basic Measurements
| Property | Type | Description | Example |
|----------|------|-------------|---------|
| `body_size` | float | Absolute size of candle body | `abs(close - open)` |
| `upper_wick` | float | Size of upper shadow | `high - max(open, close)` |
| `lower_wick` | float | Size of lower shadow | `min(open, close) - low` |
| `total_range` | float | Total high-low range | `high - low` |
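
The formulas in the Example column can be checked against the Quick Start candle. The sketch below uses a hypothetical stand-in dataclass named `Candle` (it is not the real `OHLCVBar`) and assumes the properties are implemented exactly as the table states.

```python
from dataclasses import dataclass


@dataclass
class Candle:
    """Hypothetical stand-in for OHLCVBar; only the price fields are modeled."""
    open: float
    high: float
    low: float
    close: float

    @property
    def body_size(self) -> float:
        # Distance between open and close, regardless of direction
        return abs(self.close - self.open)

    @property
    def upper_wick(self) -> float:
        # Distance from the top of the body to the high
        return self.high - max(self.open, self.close)

    @property
    def lower_wick(self) -> float:
        # Distance from the low to the bottom of the body
        return min(self.open, self.close) - self.low

    @property
    def total_range(self) -> float:
        return self.high - self.low


c = Candle(open=2000.0, high=2050.0, low=1990.0, close=2040.0)
print(c.body_size, c.upper_wick, c.lower_wick, c.total_range)
# 40.0 10.0 10.0 60.0
```

Body and wicks always add up to the total range, which is a handy sanity check when validating incoming data.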
### Candle Type
| Property | Type | Description |
|----------|------|-------------|
| `is_bullish` | bool | True if close > open (hollow/green) |
| `is_bearish` | bool | True if close < open (solid/red) |
| `is_doji` | bool | True if body < 10% of total range |
---
## Methods
### 1. Ratio Calculations
#### `get_body_to_range_ratio() -> float`
Returns the body size as a fraction of the total range (0.0 to 1.0).
```python
ratio = bar.get_body_to_range_ratio()
# 0.0 = doji (no body)
# 0.5 = body is half the range
# 1.0 = marubozu (all body, no wicks)
```
#### `get_upper_wick_ratio() -> float`
Returns the upper wick as a fraction of the total range (0.0 to 1.0).
```python
ratio = bar.get_upper_wick_ratio()
# 0.0 = no upper wick
# 0.5 = upper wick is half the range
# 1.0 = all upper wick (only when open == close == low)
```
#### `get_lower_wick_ratio() -> float`
Returns the lower wick as a fraction of the total range (0.0 to 1.0).
```python
ratio = bar.get_lower_wick_ratio()
# 0.0 = no lower wick
# 0.5 = lower wick is half the range
```
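
As a rough illustration of how the three ratios relate, the sketch below computes them from raw prices. The function name `candle_ratios` is hypothetical, and returning zeros for a flat candle (high equal to low) is an assumption; the real methods may handle that case differently.

```python
from typing import Tuple


def candle_ratios(open_: float, high: float, low: float, close: float) -> Tuple[float, float, float]:
    """Return (body, upper wick, lower wick) as fractions of the high-low range."""
    total_range = high - low
    if total_range <= 0:
        # Assumption: a flat candle has no meaningful ratios, so return zeros
        return 0.0, 0.0, 0.0
    body = abs(close - open_) / total_range
    upper = (high - max(open_, close)) / total_range
    lower = (min(open_, close) - low) / total_range
    return body, upper, lower


body, upper, lower = candle_ratios(2000.0, 2050.0, 1990.0, 2040.0)
print(round(body, 3), round(upper, 3), round(lower, 3))
# 0.667 0.167 0.167
```

For any candle with a non-zero range, the three ratios sum to 1.0 (up to floating-point rounding).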
---
### 2. Relative Sizing
#### `get_relative_size(reference_bars, method='avg') -> float`
Compare current candle size to reference candles.
**Parameters:**
- `reference_bars`: List of previous OHLCVBar objects
- `method`: Comparison method
- `'avg'`: Compare to average (default)
- `'max'`: Compare to maximum
- `'median'`: Compare to median
**Returns:**
- `1.0` = Same size as reference
- `> 1.0` = Larger than reference
- `< 1.0` = Smaller than reference
**Example:**
```python
# Take the last 10 candles; the 9 before the current one form the reference window
recent = ohlcv_list[-10:]
current = ohlcv_list[-1]
# Compare to average
avg_ratio = current.get_relative_size(recent[:-1], 'avg')
if avg_ratio > 2.0:
    print("Current candle is more than 2x the average size!")
# Compare to maximum
max_ratio = current.get_relative_size(recent[:-1], 'max')
if max_ratio > 1.0:
    print("Current candle is larger than every reference candle!")
```
```
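
To make the three comparison methods concrete, here is a minimal sketch that works on plain numbers. It assumes "size" means the high-low range and that an empty or all-zero reference falls back to 1.0; the source does not confirm either detail, and `relative_size` is a hypothetical free function, not the method itself.

```python
from statistics import mean, median
from typing import Sequence


def relative_size(current_range: float,
                  reference_ranges: Sequence[float],
                  method: str = "avg") -> float:
    """Ratio of the current candle's range to a statistic of the reference ranges."""
    reducers = {"avg": mean, "max": max, "median": median}
    if method not in reducers:
        raise ValueError(f"Unknown method: {method!r}")
    if not reference_ranges:
        return 1.0  # Assumption: no reference means "same size"
    baseline = reducers[method](reference_ranges)
    # Assumption: guard against division by zero with a neutral 1.0
    return current_range / baseline if baseline > 0 else 1.0


refs = [20.0, 20.0, 20.0, 40.0]
print(relative_size(80.0, refs, "avg"))     # 3.2
print(relative_size(80.0, refs, "max"))     # 2.0
print(relative_size(80.0, refs, "median"))  # 4.0
```

The median is the least sensitive to a single outlier in the reference window, which is why the same candle scores highest against it here.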
---
### 3. Pattern Recognition
#### `get_candle_pattern() -> str`
Identify basic candle pattern.
**Patterns Detected:**
| Pattern | Criteria | Interpretation |
|---------|----------|----------------|
| `'doji'` | Body < 10% of range | Indecision, potential reversal |
| `'hammer'` | Small body at top, long lower wick | Bullish reversal signal |
| `'shooting_star'` | Small body at bottom, long upper wick | Bearish reversal signal |
| `'spinning_top'` | Small body, both wicks present | Indecision |
| `'marubozu_bullish'` | Large bullish body (>90% of range) | Strong bullish momentum |
| `'marubozu_bearish'` | Large bearish body (>90% of range) | Strong bearish momentum |
| `'standard'` | Regular candle | Normal price action |
**Example:**
```python
pattern = bar.get_candle_pattern()
if pattern == 'hammer':
    print("Potential bullish reversal!")
elif pattern == 'shooting_star':
    print("Potential bearish reversal!")
elif pattern == 'doji':
    print("Market indecision")
```
**Pattern Criteria Details:**
```python
# Doji
body_ratio < 0.1
# Marubozu
body_ratio > 0.9
# Hammer
body_ratio < 0.3 and lower_ratio > 0.6 and upper_ratio < 0.1
# Shooting Star
body_ratio < 0.3 and upper_ratio > 0.6 and lower_ratio < 0.1
# Spinning Top
body_ratio < 0.3 and (upper_ratio + lower_ratio) > 0.6
```
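
The criteria above overlap (every hammer also satisfies the spinning-top condition), so the order of the checks matters. The sketch below is one possible ordering, not necessarily the one used in `core/data_models.py`; the function name `classify_candle` and the treatment of a flat candle as a doji are assumptions.

```python
def classify_candle(open_: float, high: float, low: float, close: float) -> str:
    """Classify a candle using the documented thresholds (check order is assumed)."""
    total_range = high - low
    if total_range <= 0:
        return "doji"  # Assumption: a flat candle is treated as a doji

    body_ratio = abs(close - open_) / total_range
    upper_ratio = (high - max(open_, close)) / total_range
    lower_ratio = (min(open_, close) - low) / total_range

    if body_ratio < 0.1:
        return "doji"
    if body_ratio > 0.9:
        return "marubozu_bullish" if close > open_ else "marubozu_bearish"
    # Specific small-body patterns are tested before the generic spinning top
    if body_ratio < 0.3 and lower_ratio > 0.6 and upper_ratio < 0.1:
        return "hammer"
    if body_ratio < 0.3 and upper_ratio > 0.6 and lower_ratio < 0.1:
        return "shooting_star"
    if body_ratio < 0.3 and (upper_ratio + lower_ratio) > 0.6:
        return "spinning_top"
    return "standard"


print(classify_candle(2000.0, 2050.0, 1990.0, 2040.0))  # standard
print(classify_candle(100.0, 102.5, 90.0, 102.0))       # hammer
print(classify_candle(100.0, 105.0, 95.0, 102.0))       # spinning_top
```

Because the doji check runs first in this ordering, a hammer or shooting star needs a body ratio of at least 0.1 and below 0.3.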
---
### 4. Complete TA Feature Set
#### `get_ta_features(reference_bars=None) -> Dict[str, float]`
Get all technical analysis features as a dictionary.
**Parameters:**
- `reference_bars`: Optional list of previous bars for relative sizing
**Returns:** Dictionary with 22 features (or 12 without reference_bars)
**Feature Categories:**
#### Basic Properties (3 features)
```python
{
'is_bullish': 1.0 or 0.0,
'is_bearish': 1.0 or 0.0,
'is_doji': 1.0 or 0.0,
}
```
#### Size Ratios (3 features)
```python
{
'body_to_range_ratio': 0.0 to 1.0,
'upper_wick_ratio': 0.0 to 1.0,
'lower_wick_ratio': 0.0 to 1.0,
}
```
#### Normalized Sizes (4 features)
```python
{
'body_size_pct': body_size / close,
'upper_wick_pct': upper_wick / close,
'lower_wick_pct': lower_wick / close,
'total_range_pct': total_range / close,
}
```
#### Volume Analysis (1 feature)
```python
{
'volume_per_range': volume / total_range,
}
```
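
The normalized sizes and the volume feature are simple divisions, so the main practical concern is a zero denominator. The sketch below shows one way to compute them from raw values; the function name `normalized_features` and the choice to return 0.0 when `close` or the range is not positive are assumptions rather than documented behavior.

```python
from typing import Dict


def normalized_features(open_: float, high: float, low: float,
                        close: float, volume: float) -> Dict[str, float]:
    """Sizes as a fraction of the close price, plus volume per unit of range."""
    total_range = high - low
    body = abs(close - open_)
    upper = high - max(open_, close)
    lower = min(open_, close) - low

    def safe_div(numerator: float, denominator: float) -> float:
        # Assumption: fall back to 0.0 instead of raising on a zero denominator
        return numerator / denominator if denominator > 0 else 0.0

    return {
        "body_size_pct": safe_div(body, close),
        "upper_wick_pct": safe_div(upper, close),
        "lower_wick_pct": safe_div(lower, close),
        "total_range_pct": safe_div(total_range, close),
        "volume_per_range": safe_div(volume, total_range),
    }


feats = normalized_features(2000.0, 2050.0, 1990.0, 2040.0, 1000.0)
print(round(feats["total_range_pct"], 4))   # 0.0294
print(round(feats["volume_per_range"], 2))  # 16.67
```

Dividing by `close` makes the size features comparable across symbols with very different price levels, such as ETH and BTC.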
#### Relative Sizing (3 features - if reference_bars provided)
```python
{
'relative_size_avg': ratio vs average,
'relative_size_max': ratio vs maximum,
'relative_size_median': ratio vs median,
}
```
#### Pattern Encoding (7 features - one-hot)
```python
{
'pattern_doji': 1.0 or 0.0,
'pattern_hammer': 1.0 or 0.0,
'pattern_shooting_star': 1.0 or 0.0,
'pattern_spinning_top': 1.0 or 0.0,
'pattern_marubozu_bullish': 1.0 or 0.0,
'pattern_marubozu_bearish': 1.0 or 0.0,
'pattern_standard': 1.0 or 0.0,
}
```
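
One-hot encoding turns the single pattern string into seven mutually exclusive flags. A minimal sketch, assuming the keys are built as `pattern_` plus the pattern name (as the listing above suggests); `one_hot_pattern` is a hypothetical helper name.

```python
from typing import Dict

PATTERNS = (
    "doji",
    "hammer",
    "shooting_star",
    "spinning_top",
    "marubozu_bullish",
    "marubozu_bearish",
    "standard",
)


def one_hot_pattern(pattern: str) -> Dict[str, float]:
    """Map a pattern name to seven 0.0/1.0 flags, exactly one of which is 1.0."""
    if pattern not in PATTERNS:
        raise ValueError(f"Unknown pattern: {pattern!r}")
    return {f"pattern_{name}": 1.0 if name == pattern else 0.0 for name in PATTERNS}


encoded = one_hot_pattern("hammer")
print(encoded["pattern_hammer"], encoded["pattern_doji"])  # 1.0 0.0
print(sum(encoded.values()))                               # 1.0
```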
**Example:**
```python
# Get complete feature set
reference_bars = ohlcv_list[-10:-1]
current_bar = ohlcv_list[-1]
ta_features = current_bar.get_ta_features(reference_bars)
# Access specific features
if ta_features['pattern_hammer'] == 1.0:
    print("Hammer pattern detected!")
if ta_features['relative_size_avg'] > 2.0:
    print("Unusually large candle!")
if ta_features['body_to_range_ratio'] < 0.1:
    print("Doji-like candle (small body)")
```
---
## Integration with BaseDataInput
### Standard Mode (7,850 features)
```python
base_data = data_provider.build_base_data_input('ETH/USDT')
features = base_data.get_feature_vector(include_candle_ta=False)
# Returns: 7,850 features (backward compatible)
```
### Enhanced Mode (22,850 features)
```python
base_data = data_provider.build_base_data_input('ETH/USDT')
features = base_data.get_feature_vector(include_candle_ta=True)
# Returns: 22,850 features (includes 10 TA features per candle)
```
**10 TA Features Per Candle:**
1. `is_bullish`
2. `body_to_range_ratio`
3. `upper_wick_ratio`
4. `lower_wick_ratio`
5. `body_size_pct`
6. `total_range_pct`
7. `relative_size_avg`
8. `pattern_doji`
9. `pattern_hammer`
10. `pattern_shooting_star`
**Total Addition:**
- ETH: 300 frames × 4 timeframes × 10 features = 12,000 features
- BTC: 300 frames × 10 features = 3,000 features
- **Total**: 15,000 additional features
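
The totals above follow directly from the frame and timeframe counts. The arithmetic can be reproduced as below; the constant names are illustrative and not taken from the codebase.

```python
FRAMES_PER_TIMEFRAME = 300
TA_FEATURES_PER_CANDLE = 10
ETH_TIMEFRAMES = 4
BTC_TIMEFRAMES = 1
BASE_FEATURES = 7_850  # Size of the standard feature vector

eth_extra = FRAMES_PER_TIMEFRAME * ETH_TIMEFRAMES * TA_FEATURES_PER_CANDLE
btc_extra = FRAMES_PER_TIMEFRAME * BTC_TIMEFRAMES * TA_FEATURES_PER_CANDLE
total_extra = eth_extra + btc_extra

print(eth_extra, btc_extra, total_extra)  # 12000 3000 15000
print(BASE_FEATURES + total_extra)        # 22850
```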
---
## Common Use Cases
### 1. Detect Reversal Patterns
```python
from typing import List

def scan_for_reversals(ohlcv_list: List[OHLCVBar]) -> List[tuple]:
    """Scan the last 50 candles for potential reversal patterns"""
    reversals = []
    # Note: i is the position within the 50-candle slice, not within ohlcv_list
    for i, bar in enumerate(ohlcv_list[-50:]):
        pattern = bar.get_candle_pattern()
        if pattern in ['hammer', 'shooting_star']:
            reversals.append((i, bar.timestamp, pattern, bar.close))
    return reversals

# Usage
reversals = scan_for_reversals(base_data.ohlcv_1m)
for idx, timestamp, pattern, price in reversals:
    print(f"{timestamp}: {pattern} at ${price:.2f}")
```
### 2. Identify Momentum Candles
```python
from typing import List

def find_momentum_candles(ohlcv_list: List[OHLCVBar],
                          threshold: float = 2.0) -> List[OHLCVBar]:
    """Find unusually large candles indicating momentum"""
    momentum_candles = []
    # Start at index 10 so every candle has 10 preceding candles as reference
    for i in range(10, len(ohlcv_list)):
        current = ohlcv_list[i]
        reference = ohlcv_list[i-10:i]
        relative_size = current.get_relative_size(reference, 'avg')
        if relative_size > threshold:
            momentum_candles.append(current)
    return momentum_candles

# Usage
momentum = find_momentum_candles(base_data.ohlcv_1m, threshold=2.5)
print(f"Found {len(momentum)} momentum candles")
```
### 3. Analyze Candle Structure
```python
from typing import Any, Dict

def analyze_candle_structure(bar: OHLCVBar) -> Dict[str, Any]:
    """Comprehensive candle analysis"""
    return {
        'direction': 'bullish' if bar.is_bullish else 'bearish',
        'pattern': bar.get_candle_pattern(),
        'body_dominance': bar.get_body_to_range_ratio(),
        'upper_wick_dominance': bar.get_upper_wick_ratio(),
        'lower_wick_dominance': bar.get_lower_wick_ratio(),
        'interpretation': _interpret_structure(bar)
    }

def _interpret_structure(bar: OHLCVBar) -> str:
    """Interpret candle structure"""
    body_ratio = bar.get_body_to_range_ratio()
    # Body checks run first, so the wick checks only apply to mid-sized bodies
    if body_ratio > 0.8:
        return "Strong momentum"
    elif body_ratio < 0.2:
        return "Indecision/consolidation"
    elif bar.get_upper_wick_ratio() > 0.5:
        return "Rejection at highs"
    elif bar.get_lower_wick_ratio() > 0.5:
        return "Support at lows"
    else:
        return "Normal price action"

# Usage
current_bar = base_data.ohlcv_1m[-1]
analysis = analyze_candle_structure(current_bar)
print(f"Pattern: {analysis['pattern']}")
print(f"Interpretation: {analysis['interpretation']}")
```
### 4. Build Custom Features
```python
from typing import List

import numpy as np

def extract_custom_candle_features(ohlcv_list: List[OHLCVBar],
                                   window: int = 10) -> np.ndarray:
    """Extract custom candle features for ML model"""
    features = []
    for i in range(window, len(ohlcv_list)):
        current = ohlcv_list[i]
        reference = ohlcv_list[i-window:i]
        # Get TA features
        ta = current.get_ta_features(reference)
        # Custom feature engineering
        features.append([
            ta['is_bullish'],
            ta['body_to_range_ratio'],
            ta['relative_size_avg'],
            ta['pattern_doji'],
            ta['pattern_hammer'],
            ta['pattern_shooting_star'],
            # Add more as needed
        ])
    # One row per candle from index `window` onward, one column per feature
    return np.array(features)

# Usage
custom_features = extract_custom_candle_features(base_data.ohlcv_1m)
print(f"Custom features shape: {custom_features.shape}")
```
---
## Performance Considerations
### Computation Time
| Operation | Time | Notes |
|-----------|------|-------|
| Property access (cached) | ~0.001 ms | Very fast |
| `get_candle_pattern()` | ~0.01 ms | Fast |
| `get_ta_features()` | ~0.1 ms | Moderate |
| Full feature vector (1500 candles) | ~150 ms | Can be optimized |
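
The timings above depend on hardware and data, so it can be worth measuring them locally. Below is a minimal sketch using the standard library's `timeit`; `ms_per_call` and `sample_workload` are hypothetical names, and the workload is only a stand-in for a call such as `bar.get_ta_features(reference_bars)`.

```python
import timeit
from typing import Callable


def ms_per_call(fn: Callable[[], object], number: int = 10_000) -> float:
    """Average wall-clock time of fn() in milliseconds over `number` calls."""
    total_seconds = timeit.timeit(fn, number=number)
    return total_seconds / number * 1000.0


def sample_workload() -> float:
    # Stand-in computation: body-to-range ratio of the Quick Start candle
    return abs(2040.0 - 2000.0) / (2050.0 - 1990.0)


print(f"{ms_per_call(sample_workload):.5f} ms per call")
```

To benchmark the real method, pass something like `lambda: bar.get_ta_features(reference_bars)` in place of `sample_workload`.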
### Optimization Tips
#### 1. Cache TA Features in OHLCVBar
```python
# When creating OHLCVBar, pre-compute TA features
bar = OHLCVBar(...)
ta_features = bar.get_ta_features(reference_bars)
bar.indicators.update(ta_features) # Cache in indicators dict
```
#### 2. Batch Processing
```python
# Process all candles at once
def precompute_ta_features(ohlcv_list: List[OHLCVBar]):
    """Pre-compute TA features for all candles"""
    for i in range(10, len(ohlcv_list)):
        current = ohlcv_list[i]
        reference = ohlcv_list[i-10:i]
        ta = current.get_ta_features(reference)
        current.indicators.update(ta)
```
#### 3. Lazy Evaluation
```python
# Only compute when needed
if model.requires_candle_ta:
    features = base_data.get_feature_vector(include_candle_ta=True)
else:
    features = base_data.get_feature_vector(include_candle_ta=False)
```
---
## Testing
### Unit Tests
```python
from datetime import datetime

from core.data_models import OHLCVBar

def test_candle_properties():
    bar = OHLCVBar('ETH/USDT', datetime.now(), 2000, 2050, 1990, 2040, 1000, '1m')
    assert bar.is_bullish
    assert bar.body_size == 40.0
    assert bar.total_range == 60.0

def test_pattern_recognition():
    # Body is 0.5 on a 10-point range (5%), strictly below the 10% doji threshold
    doji = OHLCVBar('ETH/USDT', datetime.now(), 2000, 2005, 1995, 2000.5, 100, '1m')
    assert doji.get_candle_pattern() == 'doji'

def test_relative_sizing():
    bars = [OHLCVBar('ETH/USDT', datetime.now(), 2000, 2010, 1990, 2005, 100, '1m') for _ in range(10)]
    large = OHLCVBar('ETH/USDT', datetime.now(), 2000, 2060, 1980, 2055, 100, '1m')
    assert large.get_relative_size(bars, 'avg') > 2.0
```
---
## Troubleshooting
### Issue: TA features all zeros
**Cause**: No reference bars provided to `get_ta_features()`
**Solution**:
```python
# Provide reference bars
reference_bars = ohlcv_list[-10:-1]
ta_features = current_bar.get_ta_features(reference_bars)
```
### Issue: Pattern always 'standard'
**Cause**: Candle doesn't meet specific pattern criteria
**Solution**: Check ratios manually
```python
print(f"Body ratio: {bar.get_body_to_range_ratio()}")
print(f"Upper wick: {bar.get_upper_wick_ratio()}")
print(f"Lower wick: {bar.get_lower_wick_ratio()}")
```
### Issue: Slow feature extraction
**Cause**: Computing TA features for many candles
**Solution**: Pre-compute and cache
```python
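# Cache in data provider
for i, bar in enumerate(ohlcv_list):
    if 'ta_cached' not in bar.indicators:
        # Use up to 10 candles preceding this bar; pass None when there are none
        reference_bars = ohlcv_list[max(0, i - 10):i] or None
        ta = bar.get_ta_features(reference_bars)
        bar.indicators.update(ta)
        bar.indicators['ta_cached'] = True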
```
---
## References
- **Implementation**: `core/data_models.py` - `OHLCVBar` class
- **Usage Guide**: `docs/BASE_DATA_INPUT_USAGE_AUDIT.md`
- **Specification**: `docs/BASE_DATA_INPUT_SPECIFICATION.md`
- **Integration**: `core/standardized_data_provider.py`