Dobromir Popov 7ddf98bf18 improved data structure

2025-10-31 00:44:08 +02:00

Candle TA Features Quick Reference

Overview

Technical analysis (TA) features for the OHLCVBar class: candle pattern recognition, relative sizing, and body/wick analysis.

Location: core/data_models.py - OHLCVBar class

Quick Start

from core.data_models import OHLCVBar, BaseDataInput
from datetime import datetime

# Create a candle
bar = OHLCVBar(
    symbol='ETH/USDT',
    timestamp=datetime.now(),
    open=2000.0,
    high=2050.0,
    low=1990.0,
    close=2040.0,
    volume=1000.0,
    timeframe='1m'
)

# Check basic properties
print(f"Bullish: {bar.is_bullish}")           # True
print(f"Body size: {bar.body_size}")          # 40.0
print(f"Pattern: {bar.get_candle_pattern()}") # 'standard'

# Get all TA features
reference_bars = [...]  # Previous 10 candles
ta_features = bar.get_ta_features(reference_bars)
print(f"Features: {len(ta_features)}")        # 22 features

Properties (Computed On-Demand)

Basic Measurements

Property	Type	Description	Formula
`body_size`	float	Absolute size of candle body	`abs(close - open)`
`upper_wick`	float	Size of upper shadow	`high - max(open, close)`
`lower_wick`	float	Size of lower shadow	`min(open, close) - low`
`total_range`	float	Total high-low range	`high - low`
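
As a minimal sketch of the formulas in the table above, the following stand-in class computes each measurement on demand from the raw OHLC values. `CandleSketch` is a hypothetical name used only for illustration; it is not the actual `OHLCVBar` implementation, which may differ in detail.

```python
from dataclasses import dataclass


@dataclass
class CandleSketch:
    """Hypothetical stand-in for OHLCVBar; holds only the raw prices."""
    open: float
    high: float
    low: float
    close: float

    @property
    def body_size(self) -> float:
        # Absolute distance between open and close
        return abs(self.close - self.open)

    @property
    def upper_wick(self) -> float:
        # Distance from the top of the body to the high
        return self.high - max(self.open, self.close)

    @property
    def lower_wick(self) -> float:
        # Distance from the low to the bottom of the body
        return min(self.open, self.close) - self.low

    @property
    def total_range(self) -> float:
        # Full high-low span of the candle
        return self.high - self.low


bar = CandleSketch(open=2000.0, high=2050.0, low=1990.0, close=2040.0)
print(bar.body_size, bar.upper_wick, bar.lower_wick, bar.total_range)
# 40.0 10.0 10.0 60.0
```

The body and the two wicks always add up to the total range, which is why the ratio methods described later sum to 1.0 for any candle with a non-zero range.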

Candle Type

Property	Type	Description
`is_bullish`	bool	True if close > open (hollow/green)
`is_bearish`	bool	True if close < open (solid/red)
`is_doji`	bool	True if body < 10% of total range

Methods

1. Ratio Calculations

`get_body_to_range_ratio() -> float`

Returns body size as a fraction of total range (0.0 to 1.0)

ratio = bar.get_body_to_range_ratio()
# 0.0 = doji (no body)
# 0.5 = body is half the range
# 1.0 = marubozu (all body, no wicks)

`get_upper_wick_ratio() -> float`

Returns upper wick as a fraction of total range (0.0 to 1.0)

ratio = bar.get_upper_wick_ratio()
# 0.0 = no upper wick
# 0.5 = upper wick is half the range
# 1.0 = all upper wick (open and close both at the low)

`get_lower_wick_ratio() -> float`

Returns lower wick as a fraction of total range (0.0 to 1.0)

ratio = bar.get_lower_wick_ratio()
# 0.0 = no lower wick
# 0.5 = lower wick is half the range
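
The three ratios share one denominator (the high-low range), so a flat candle with `high == low` needs a guard against division by zero. The sketch below shows one way to compute all three together. The function name `wick_body_ratios` and the choice to return zeros for a flat candle are assumptions for illustration; the source does not say how `OHLCVBar` handles that case.

```python
from typing import Tuple


def wick_body_ratios(open_: float, high: float, low: float,
                     close: float) -> Tuple[float, float, float]:
    """Return (body, upper wick, lower wick) as fractions of the range."""
    total_range = high - low
    if total_range <= 0:
        # Flat candle: assumed convention is to report all ratios as 0.0
        return 0.0, 0.0, 0.0
    body = abs(close - open_) / total_range
    upper = (high - max(open_, close)) / total_range
    lower = (min(open_, close) - low) / total_range
    return body, upper, lower


body, upper, lower = wick_body_ratios(2000.0, 2050.0, 1990.0, 2040.0)
print(f"body={body:.3f} upper={upper:.3f} lower={lower:.3f}")
# body=0.667 upper=0.167 lower=0.167
```

For any candle with a non-zero range the three values sum to 1.0, so knowing two of them determines the third.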

2. Relative Sizing

`get_relative_size(reference_bars, method='avg') -> float`

Compare current candle size to reference candles.

Parameters:

reference_bars: List of previous OHLCVBar objects
method: Comparison method
- 'avg': Compare to average (default)
- 'max': Compare to maximum
- 'median': Compare to median

Returns:

1.0 = Same size as reference
> 1.0 = Larger than reference
< 1.0 = Smaller than reference

Example:

# Get last 10 candles
recent = ohlcv_list[-10:]
current = ohlcv_list[-1]

# Compare to average
avg_ratio = current.get_relative_size(recent[:-1], 'avg')
if avg_ratio > 2.0:
    print("Current candle is more than 2x the average size!")

# Compare to maximum
max_ratio = current.get_relative_size(recent[:-1], 'max')
if max_ratio > 1.0:
    print("Current candle is the largest!")
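
To make the three comparison methods concrete, here is a rough standalone sketch of the calculation. It assumes "size" means the high-low range and that an empty or zero baseline yields a neutral 1.0; neither assumption is confirmed by the source, and `relative_size` is a hypothetical free function, not the `OHLCVBar` method itself.

```python
from statistics import mean, median
from typing import Sequence


def relative_size(current_range: float,
                  reference_ranges: Sequence[float],
                  method: str = "avg") -> float:
    """Ratio of the current candle's range to a baseline of reference ranges."""
    reducers = {"avg": mean, "max": max, "median": median}
    if method not in reducers:
        raise ValueError(f"Unknown method: {method!r}")
    if not reference_ranges:
        # Assumed convention: no reference data means "same size"
        return 1.0
    baseline = reducers[method](reference_ranges)
    if baseline <= 0:
        # Assumed convention: avoid dividing by a zero baseline
        return 1.0
    return current_range / baseline


ranges = [10.0, 20.0, 60.0]
print(relative_size(30.0, ranges, "avg"))     # 1.0
print(relative_size(30.0, ranges, "max"))     # 0.5
print(relative_size(30.0, ranges, "median"))  # 1.5
```

The median baseline is less sensitive to a single outlier candle than the average, which is why the same candle can look ordinary against `'avg'` and large against `'median'`.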

3. Pattern Recognition

`get_candle_pattern() -> str`

Identify basic candle pattern.

Patterns Detected:

Pattern	Criteria	Interpretation
`'doji'`	Body < 10% of range	Indecision, potential reversal
`'hammer'`	Small body at top, long lower wick	Bullish reversal signal
`'shooting_star'`	Small body at bottom, long upper wick	Bearish reversal signal
`'spinning_top'`	Small body, both wicks present	Indecision
`'marubozu_bullish'`	Large bullish body (>90% of range)	Strong bullish momentum
`'marubozu_bearish'`	Large bearish body (>90% of range)	Strong bearish momentum
`'standard'`	Regular candle	Normal price action

Example:

pattern = bar.get_candle_pattern()

if pattern == 'hammer':
    print("Potential bullish reversal!")
elif pattern == 'shooting_star':
    print("Potential bearish reversal!")
elif pattern == 'doji':
    print("Market indecision")

Pattern Criteria Details:

# Doji
body_ratio < 0.1

# Marubozu
body_ratio > 0.9

# Hammer
body_ratio < 0.3 and lower_ratio > 0.6 and upper_ratio < 0.1

# Shooting Star
body_ratio < 0.3 and upper_ratio > 0.6 and lower_ratio < 0.1

# Spinning Top
body_ratio < 0.3 and (upper_ratio + lower_ratio) > 0.6
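
The criteria above overlap (a doji also satisfies the spinning-top test, for example), so the order in which they are checked matters. The sketch below applies them in the order listed, with doji and marubozu checked first. That precedence and the function name `classify_candle` are assumptions for illustration; the actual order inside `get_candle_pattern()` may differ.

```python
def classify_candle(body_ratio: float, upper_ratio: float,
                    lower_ratio: float, is_bullish: bool) -> str:
    """Classify a candle from its body/wick ratios (assumed precedence)."""
    if body_ratio < 0.1:
        return "doji"
    if body_ratio > 0.9:
        return "marubozu_bullish" if is_bullish else "marubozu_bearish"
    if body_ratio < 0.3:
        # Small body: distinguish by where the wicks sit
        if lower_ratio > 0.6 and upper_ratio < 0.1:
            return "hammer"
        if upper_ratio > 0.6 and lower_ratio < 0.1:
            return "shooting_star"
        if upper_ratio + lower_ratio > 0.6:
            return "spinning_top"
    return "standard"


print(classify_candle(40 / 60, 10 / 60, 10 / 60, True))  # standard
print(classify_candle(0.2, 0.05, 0.75, True))            # hammer
print(classify_candle(0.2, 0.4, 0.4, False))             # spinning_top
```

Because the three ratios sum to 1.0, a body ratio below 0.3 already implies combined wicks above 0.7. Under this ordering, every small-body candle that is not a doji, hammer, or shooting star is therefore classified as a spinning top.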

4. Complete TA Feature Set

`get_ta_features(reference_bars=None) -> Dict[str, float]`

Get all technical analysis features as a dictionary.

Parameters:

reference_bars: Optional list of previous bars for relative sizing

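Returns: Dictionary of features. The stated count is 22 features (or 12 without reference_bars); note that the categories listed below add up to 21 (18 without reference_bars), so confirm the exact count against the implementation.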

Feature Categories:

Basic Properties (3 features)

{
    'is_bullish': 1.0 or 0.0,
    'is_bearish': 1.0 or 0.0,
    'is_doji': 1.0 or 0.0,
}

Size Ratios (3 features)

{
    'body_to_range_ratio': 0.0 to 1.0,
    'upper_wick_ratio': 0.0 to 1.0,
    'lower_wick_ratio': 0.0 to 1.0,
}

Normalized Sizes (4 features)

{
    'body_size_pct': body_size / close,
    'upper_wick_pct': upper_wick / close,
    'lower_wick_pct': lower_wick / close,
    'total_range_pct': total_range / close,
}

Volume Analysis (1 feature)

{
    'volume_per_range': volume / total_range,
}

Relative Sizing (3 features - if reference_bars provided)

{
    'relative_size_avg': ratio vs average,
    'relative_size_max': ratio vs maximum,
    'relative_size_median': ratio vs median,
}

Pattern Encoding (7 features - one-hot)

{
    'pattern_doji': 1.0 or 0.0,
    'pattern_hammer': 1.0 or 0.0,
    'pattern_shooting_star': 1.0 or 0.0,
    'pattern_spinning_top': 1.0 or 0.0,
    'pattern_marubozu_bullish': 1.0 or 0.0,
    'pattern_marubozu_bearish': 1.0 or 0.0,
    'pattern_standard': 1.0 or 0.0,
}
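
One-hot encoding turns the pattern string into seven 0.0/1.0 flags, exactly one of which is set. A minimal sketch of that mapping follows; `one_hot_pattern` and `PATTERNS` are hypothetical names, and the key format simply mirrors the dictionary shown above.

```python
from typing import Dict

# Pattern names as listed in the Pattern Recognition section
PATTERNS = (
    "doji", "hammer", "shooting_star", "spinning_top",
    "marubozu_bullish", "marubozu_bearish", "standard",
)


def one_hot_pattern(pattern: str) -> Dict[str, float]:
    """Encode a pattern name as seven 'pattern_<name>' flags."""
    if pattern not in PATTERNS:
        raise ValueError(f"Unknown pattern: {pattern!r}")
    return {f"pattern_{name}": 1.0 if name == pattern else 0.0
            for name in PATTERNS}


encoded = one_hot_pattern("hammer")
print(encoded["pattern_hammer"], sum(encoded.values()))  # 1.0 1.0
```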

Example:

# Get complete feature set
reference_bars = ohlcv_list[-11:-1]  # the 10 candles before the current one
current_bar = ohlcv_list[-1]

ta_features = current_bar.get_ta_features(reference_bars)

# Access specific features
if ta_features['pattern_hammer'] == 1.0:
    print("Hammer pattern detected!")

if ta_features['relative_size_avg'] > 2.0:
    print("Unusually large candle!")

if ta_features['body_to_range_ratio'] < 0.1:
    print("Doji-like candle (small body)")

Integration with BaseDataInput

Standard Mode (7,850 features)

base_data = data_provider.build_base_data_input('ETH/USDT')
features = base_data.get_feature_vector(include_candle_ta=False)
# Returns: 7,850 features (backward compatible)

Enhanced Mode (22,850 features)

base_data = data_provider.build_base_data_input('ETH/USDT')
features = base_data.get_feature_vector(include_candle_ta=True)
# Returns: 22,850 features (includes 10 TA features per candle)

10 TA Features Per Candle:

is_bullish
body_to_range_ratio
upper_wick_ratio
lower_wick_ratio
body_size_pct
total_range_pct
relative_size_avg
pattern_doji
pattern_hammer
pattern_shooting_star

Total Addition:

ETH: 300 frames × 4 timeframes × 10 features = 12,000 features
BTC: 300 frames × 10 features = 3,000 features
Total: 15,000 additional features
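
The arithmetic above can be checked directly. The constant names below are illustrative only; the numbers come from this section.

```python
FRAMES = 300                  # candles per timeframe
TA_FEATURES_PER_CANDLE = 10   # the 10 TA features listed above
ETH_TIMEFRAMES = 4
BASE_FEATURES = 7_850         # standard-mode vector size

eth_added = FRAMES * ETH_TIMEFRAMES * TA_FEATURES_PER_CANDLE  # 12,000
btc_added = FRAMES * TA_FEATURES_PER_CANDLE                   # 3,000
total_added = eth_added + btc_added
enhanced_size = BASE_FEATURES + total_added

print(total_added, enhanced_size)  # 15000 22850
```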

Common Use Cases

1. Detect Reversal Patterns

from typing import List

def scan_for_reversals(ohlcv_list: List[OHLCVBar]) -> List[tuple]:
    """Scan the last 50 bars for potential reversal patterns"""
    reversals = []
    # Offset so the reported index refers to ohlcv_list, not the slice
    start = max(0, len(ohlcv_list) - 50)

    for i, bar in enumerate(ohlcv_list[start:], start=start):
        pattern = bar.get_candle_pattern()

        if pattern in ('hammer', 'shooting_star'):
            reversals.append((i, bar.timestamp, pattern, bar.close))

    return reversals

# Usage
reversals = scan_for_reversals(base_data.ohlcv_1m)
for idx, timestamp, pattern, price in reversals:
    print(f"{timestamp}: {pattern} at ${price:.2f}")

2. Identify Momentum Candles

from typing import List

def find_momentum_candles(ohlcv_list: List[OHLCVBar],
                          threshold: float = 2.0) -> List[OHLCVBar]:
    """Find unusually large candles indicating momentum"""
    momentum_candles = []
    
    for i in range(10, len(ohlcv_list)):
        current = ohlcv_list[i]
        reference = ohlcv_list[i-10:i]
        
        relative_size = current.get_relative_size(reference, 'avg')
        
        if relative_size > threshold:
            momentum_candles.append(current)
    
    return momentum_candles

# Usage
momentum = find_momentum_candles(base_data.ohlcv_1m, threshold=2.5)
print(f"Found {len(momentum)} momentum candles")

3. Analyze Candle Structure

from typing import Any, Dict

def analyze_candle_structure(bar: OHLCVBar) -> Dict[str, Any]:
    """Comprehensive candle analysis"""
    # A candle with close == open is neither bullish nor bearish
    direction = ('bullish' if bar.is_bullish
                 else 'bearish' if bar.is_bearish
                 else 'neutral')
    return {
        'direction': direction,
        'pattern': bar.get_candle_pattern(),
        'body_dominance': bar.get_body_to_range_ratio(),
        'upper_wick_dominance': bar.get_upper_wick_ratio(),
        'lower_wick_dominance': bar.get_lower_wick_ratio(),
        'interpretation': _interpret_structure(bar)
    }

def _interpret_structure(bar: OHLCVBar) -> str:
    """Interpret candle structure"""
    body_ratio = bar.get_body_to_range_ratio()
    
    if body_ratio > 0.8:
        return "Strong momentum"
    elif body_ratio < 0.2:
        return "Indecision/consolidation"
    elif bar.get_upper_wick_ratio() > 0.5:
        return "Rejection at highs"
    elif bar.get_lower_wick_ratio() > 0.5:
        return "Support at lows"
    else:
        return "Normal price action"

# Usage
current_bar = base_data.ohlcv_1m[-1]
analysis = analyze_candle_structure(current_bar)
print(f"Pattern: {analysis['pattern']}")
print(f"Interpretation: {analysis['interpretation']}")

4. Build Custom Features

from typing import List

import numpy as np

def extract_custom_candle_features(ohlcv_list: List[OHLCVBar],
                                   window: int = 10) -> np.ndarray:
    """Extract custom candle features for ML model"""
    features = []
    
    for i in range(window, len(ohlcv_list)):
        current = ohlcv_list[i]
        reference = ohlcv_list[i-window:i]
        
        # Get TA features
        ta = current.get_ta_features(reference)
        
        # Custom feature engineering
        features.append([
            ta['is_bullish'],
            ta['body_to_range_ratio'],
            ta['relative_size_avg'],
            ta['pattern_doji'],
            ta['pattern_hammer'],
            ta['pattern_shooting_star'],
            # Add more as needed
        ])
    
    return np.array(features)

# Usage
custom_features = extract_custom_candle_features(base_data.ohlcv_1m)
print(f"Custom features shape: {custom_features.shape}")

Performance Considerations

Computation Time

Operation	Time	Notes
Property access (cached)	~0.001 ms	Very fast
`get_candle_pattern()`	~0.01 ms	Fast
`get_ta_features()`	~0.1 ms	Moderate
Full feature vector (1500 candles)	~150 ms	Can be optimized
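
The timings above are approximate and will vary by machine. A rough sketch for measuring the per-call cost yourself is shown below; `mean_call_ms` is a hypothetical helper, and the lambda is a stand-in workload that you would replace with a real call such as `bar.get_ta_features(reference_bars)`.

```python
import time
from typing import Callable


def mean_call_ms(fn: Callable[[], object], repeats: int = 1000) -> float:
    """Average wall-clock time of fn() in milliseconds over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / repeats


# Stand-in workload: a single body-to-range ratio computation
per_call = mean_call_ms(lambda: abs(2040.0 - 2000.0) / (2050.0 - 1990.0))
print(f"{per_call:.4f} ms per call, "
      f"~{per_call * 1500:.1f} ms for 1500 candles")
```

Multiplying the per-call figure by 1500 gives an estimate comparable to the "Full feature vector" row in the table.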

Optimization Tips

1. Cache TA Features in OHLCVBar

# When creating OHLCVBar, pre-compute TA features
bar = OHLCVBar(...)
ta_features = bar.get_ta_features(reference_bars)
bar.indicators.update(ta_features)  # Cache in indicators dict

2. Batch Processing

# Process all candles at once
def precompute_ta_features(ohlcv_list: List[OHLCVBar]):
    """Pre-compute TA features for all candles"""
    for i in range(10, len(ohlcv_list)):
        current = ohlcv_list[i]
        reference = ohlcv_list[i-10:i]
        ta = current.get_ta_features(reference)
        current.indicators.update(ta)

3. Lazy Evaluation

# Only compute when needed
if model.requires_candle_ta:
    features = base_data.get_feature_vector(include_candle_ta=True)
else:
    features = base_data.get_feature_vector(include_candle_ta=False)

Testing

Unit Tests

from datetime import datetime

from core.data_models import OHLCVBar

def test_candle_properties():
    bar = OHLCVBar('ETH/USDT', datetime.now(), 2000, 2050, 1990, 2040, 1000, '1m')
    assert bar.is_bullish
    assert bar.body_size == 40.0
    assert bar.total_range == 60.0

def test_pattern_recognition():
    # Body 0.5 over a range of 10 gives a ratio of 0.05, strictly below the 0.1 doji threshold
    doji = OHLCVBar('ETH/USDT', datetime.now(), 2000, 2005, 1995, 2000.5, 100, '1m')
    assert doji.get_candle_pattern() == 'doji'

def test_relative_sizing():
    bars = [OHLCVBar('ETH/USDT', datetime.now(), 2000, 2010, 1990, 2005, 100, '1m') for _ in range(10)]
    large = OHLCVBar('ETH/USDT', datetime.now(), 2000, 2060, 1980, 2055, 100, '1m')
    assert large.get_relative_size(bars, 'avg') > 2.0

Troubleshooting

Issue: TA features all zeros

Cause: No reference bars provided to get_ta_features()

Solution:

# Provide reference bars (the 10 candles before the current one)
reference_bars = ohlcv_list[-11:-1]
ta_features = current_bar.get_ta_features(reference_bars)

Issue: Pattern always 'standard'

Cause: Candle doesn't meet specific pattern criteria

Solution: Check ratios manually

print(f"Body ratio: {bar.get_body_to_range_ratio()}")
print(f"Upper wick: {bar.get_upper_wick_ratio()}")
print(f"Lower wick: {bar.get_lower_wick_ratio()}")

Issue: Slow feature extraction

Cause: Computing TA features for many candles

Solution: Pre-compute and cache

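# Cache in data provider; each bar uses its own preceding window as reference
window = 10
for i, bar in enumerate(ohlcv_list):
    if i < window or 'ta_cached' in bar.indicators:
        continue
    ta = bar.get_ta_features(ohlcv_list[i - window:i])
    bar.indicators.update(ta)
    bar.indicators['ta_cached'] = True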

References

Implementation: core/data_models.py - OHLCVBar class
Usage Guide: docs/BASE_DATA_INPUT_USAGE_AUDIT.md
Specification: docs/BASE_DATA_INPUT_SPECIFICATION.md
Integration: core/standardized_data_provider.py
