COBY - Multi-Exchange Data Aggregation System

COBY (Cryptocurrency Order Book Yielder) is a comprehensive data collection and aggregation subsystem designed to serve as the foundational data layer for trading systems. It collects real-time order book and OHLCV data from multiple cryptocurrency exchanges, aggregates it into standardized formats, and provides both live data feeds and historical replay capabilities.

🏗️ Architecture

The system follows a modular architecture with clear separation of concerns:

COBY/
├── config.py              # Configuration management
├── models/                 # Data models and structures
│   ├── __init__.py
│   └── core.py            # Core data models
├── interfaces/             # Abstract interfaces
│   ├── __init__.py
│   ├── exchange_connector.py
│   ├── data_processor.py
│   ├── aggregation_engine.py
│   ├── storage_manager.py
│   └── replay_manager.py
├── utils/                  # Utility functions
│   ├── __init__.py
│   ├── exceptions.py
│   ├── logging.py
│   ├── validation.py
│   └── timing.py
└── README.md

🚀 Features

  • Multi-Exchange Support: Connect to 10+ major cryptocurrency exchanges
  • Real-Time Data: High-frequency order book and trade data collection
  • Price Bucket Aggregation: Configurable price buckets ($10 for BTC, $1 for ETH)
  • Heatmap Visualization: Real-time market depth heatmaps
  • Historical Replay: Replay past market events for model training
  • TimescaleDB Storage: Optimized time-series data storage
  • Redis Caching: High-performance data caching layer
  • Orchestrator Integration: Compatible with existing trading systems

📊 Data Models

Core Models

  • OrderBookSnapshot: Standardized order book data
  • TradeEvent: Individual trade events
  • PriceBuckets: Aggregated price bucket data
  • HeatmapData: Visualization-ready heatmap data
  • ConnectionStatus: Exchange connection monitoring
  • ReplaySession: Historical data replay management

Key Features

  • Automatic data validation and normalization
  • Configurable price bucket sizes per symbol
  • Real-time metrics calculation
  • Cross-exchange data consolidation
  • Quality scoring and anomaly detection
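
For orientation, here is a minimal sketch of what these models can look like (field and property names follow the usage examples below, not necessarily the exact definitions in models/core.py):

from dataclasses import dataclass
from datetime import datetime
from typing import List

@dataclass
class PriceLevel:
    price: float
    size: float

@dataclass
class OrderBookSnapshot:
    symbol: str
    exchange: str
    timestamp: datetime
    bids: List[PriceLevel]   # sorted best (highest price) first
    asks: List[PriceLevel]   # sorted best (lowest price) first

    @property
    def mid_price(self) -> float:
        # Midpoint of the best bid and best ask
        return (self.bids[0].price + self.asks[0].price) / 2

    @property
    def spread(self) -> float:
        # Distance between best ask and best bid
        return self.asks[0].price - self.bids[0].price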

⚙️ Configuration

The system uses environment variables for configuration:

# Database settings
DB_HOST=192.168.0.10
DB_PORT=5432
DB_NAME=market_data
DB_USER=market_user
DB_PASSWORD=your_password

# Redis settings
REDIS_HOST=192.168.0.10
REDIS_PORT=6379
REDIS_PASSWORD=your_password

# Aggregation settings
BTC_BUCKET_SIZE=10.0
ETH_BUCKET_SIZE=1.0
HEATMAP_DEPTH=50
UPDATE_FREQUENCY=0.5

# Performance settings
DATA_BUFFER_SIZE=10000
BATCH_WRITE_SIZE=1000
MAX_MEMORY_USAGE=2048
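
A minimal sketch of how these variables can be read into typed settings (the helper below is illustrative; the real config.py may structure this differently):

import os

def env(name: str, default, cast=str):
    """Read an environment variable, casting it and falling back to a default."""
    raw = os.environ.get(name)
    return cast(raw) if raw is not None else default

DB_HOST = env('DB_HOST', 'localhost')
DB_PORT = env('DB_PORT', 5432, int)
BTC_BUCKET_SIZE = env('BTC_BUCKET_SIZE', 10.0, float)
UPDATE_FREQUENCY = env('UPDATE_FREQUENCY', 0.5, float)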

🔌 Interfaces

ExchangeConnector

Abstract base class for exchange WebSocket connectors with:

  • Connection management with auto-reconnect
  • Order book and trade subscriptions
  • Data normalization callbacks
  • Health monitoring
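
A hedged sketch of the connector contract (method names are illustrative, not the exact signatures in interfaces/exchange_connector.py):

from abc import ABC, abstractmethod

class ExchangeConnector(ABC):
    """Base class for per-exchange WebSocket connectors."""

    def __init__(self, exchange: str):
        self.exchange = exchange
        self.connected = False

    @abstractmethod
    async def connect(self) -> None:
        """Open the WebSocket connection, retrying with backoff on failure."""

    @abstractmethod
    async def subscribe_orderbook(self, symbol: str) -> None:
        """Subscribe to order book updates for a symbol."""

    @abstractmethod
    def normalize(self, raw: dict):
        """Convert an exchange-specific payload into the standard models."""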

DataProcessor

Interface for data processing and validation:

  • Raw data normalization
  • Quality validation
  • Metrics calculation
  • Anomaly detection
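
As an illustration of the kind of check this interface covers, a basic quality gate might reject empty, non-positive, or crossed books (sketch only):

def validate_orderbook(orderbook) -> bool:
    """Reject empty, non-positive, or crossed books.

    Sketch only; the real interface also covers metrics, quality scoring,
    and anomaly detection.
    """
    if not orderbook.bids or not orderbook.asks:
        return False
    levels = orderbook.bids + orderbook.asks
    if any(lvl.price <= 0 or lvl.size <= 0 for lvl in levels):
        return False
    # A healthy book keeps the best bid strictly below the best ask
    return orderbook.bids[0].price < orderbook.asks[0].price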

AggregationEngine

Interface for data aggregation:

  • Price bucket creation
  • Heatmap generation
  • Cross-exchange consolidation
  • Imbalance calculations
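
For example, with a $10 bucket size a bid at 50,007.3 is folded into the 50,000 bucket. A minimal bucketing sketch (not the production engine):

from collections import defaultdict

def bucket_prices(levels, bucket_size: float) -> dict:
    """Fold price levels into fixed-width buckets, summing their sizes."""
    buckets = defaultdict(float)
    for level in levels:
        key = (level.price // bucket_size) * bucket_size  # bucket floor price
        buckets[key] += level.size
    return dict(buckets)

# bucket_prices([PriceLevel(50007.3, 1.5), PriceLevel(50002.0, 0.5)], 10.0)
# -> {50000.0: 2.0}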

StorageManager

Interface for data persistence:

  • TimescaleDB operations
  • Batch processing
  • Historical data retrieval
  • Storage optimization
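
A minimal sketch of the batching idea behind BATCH_WRITE_SIZE (the real StorageManager defines async TimescaleDB operations; this illustrates only the buffering pattern):

class BatchWriter:
    """Buffer rows in memory and flush them in batches."""

    def __init__(self, flush_fn, batch_size: int = 1000):
        self.flush_fn = flush_fn      # callable that persists a list of rows
        self.batch_size = batch_size
        self.buffer = []

    def add(self, row) -> None:
        self.buffer.append(row)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if self.buffer:
            self.flush_fn(self.buffer)
            self.buffer = []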

ReplayManager

Interface for historical data replay:

  • Session management
  • Configurable playback speeds
  • Time-based seeking
  • Real-time compatibility
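
A minimal sketch of speed-scaled playback (illustrative; the actual ReplayManager interface also manages sessions and seeking):

import time

def replay(events, speed: float = 1.0):
    """Yield (timestamp, payload) pairs, pacing them at 1/speed of real time.

    `events` is an iterable of (datetime, payload) tuples sorted by time;
    speed=2.0 plays back twice as fast as the original recording.
    """
    prev_ts = None
    for ts, payload in events:
        if prev_ts is not None:
            gap = (ts - prev_ts).total_seconds() / speed
            if gap > 0:
                time.sleep(gap)
        prev_ts = ts
        yield ts, payload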

🛠️ Utilities

Logging

  • Structured logging with correlation IDs
  • Configurable log levels and outputs
  • Rotating file handlers
  • Context-aware logging

Validation

  • Symbol format validation
  • Price and volume validation
  • Configuration validation
  • Data quality checks

Timing

  • UTC timestamp handling
  • Performance measurement
  • Time-based operations
  • Interval calculations

Exceptions

  • Custom exception hierarchy
  • Error code management
  • Detailed error context
  • Structured error responses
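
A minimal sketch of such a hierarchy (class and code names are illustrative, not the definitions in utils/exceptions.py):

class COBYError(Exception):
    """Base error carrying an error code and structured context."""

    def __init__(self, message: str, code: str = 'COBY_ERROR', **context):
        super().__init__(message)
        self.code = code
        self.context = context  # extra fields for structured error responses

class ValidationError(COBYError):
    def __init__(self, message: str, **context):
        super().__init__(message, code='VALIDATION_ERROR', **context)

class ExchangeConnectionError(COBYError):
    def __init__(self, message: str, **context):
        super().__init__(message, code='CONNECTION_ERROR', **context)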

🔧 Usage

Basic Configuration

from COBY.config import config

# Access configuration
db_url = config.get_database_url()
bucket_size = config.get_bucket_size('BTCUSDT')

Data Models

from datetime import datetime, timezone

from COBY.models import OrderBookSnapshot, PriceLevel

# Create order book snapshot
orderbook = OrderBookSnapshot(
    symbol='BTCUSDT',
    exchange='binance',
    timestamp=datetime.now(timezone.utc),
    bids=[PriceLevel(50000.0, 1.5)],
    asks=[PriceLevel(50100.0, 2.0)]
)

# Access calculated properties
mid_price = orderbook.mid_price
spread = orderbook.spread

Logging

from COBY.utils import setup_logging, get_logger, set_correlation_id

# Setup logging
setup_logging(level='INFO', log_file='logs/coby.log')

# Get logger
logger = get_logger(__name__)

# Use correlation ID
set_correlation_id('req-123')
logger.info("Processing order book data")

🏃 Next Steps

This is the foundational structure for the COBY system. The next implementation tasks will build upon these interfaces and models to create:

  1. TimescaleDB integration
  2. Exchange connector implementations
  3. Data processing engines
  4. Aggregation algorithms
  5. Web dashboard
  6. API endpoints
  7. Replay functionality

Each component will implement the defined interfaces, ensuring consistency and maintainability across the entire system.

📝 Development Guidelines

  • All components must implement the defined interfaces
  • Use the provided data models for consistency
  • Follow the logging and error handling patterns
  • Validate all input data using the utility functions
  • Maintain backward compatibility with the orchestrator interface
  • Write comprehensive tests for all functionality

🔍 Monitoring

The system provides comprehensive monitoring through:

  • Structured logging with correlation IDs
  • Performance metrics collection
  • Health check endpoints
  • Connection status monitoring
  • Data quality indicators
  • System resource tracking