# COBY - Multi-Exchange Data Aggregation System
COBY (Cryptocurrency Order Book Yielder) is a comprehensive data collection and aggregation subsystem designed to serve as the foundational data layer for trading systems. It collects real-time order book and OHLCV data from multiple cryptocurrency exchanges, aggregates it into standardized formats, and provides both live data feeds and historical replay capabilities.
## 🏗️ Architecture
The system follows a modular architecture with clear separation of concerns:
```
COBY/
├── config.py                  # Configuration management
├── models/                    # Data models and structures
│   ├── __init__.py
│   └── core.py                # Core data models
├── interfaces/                # Abstract interfaces
│   ├── __init__.py
│   ├── exchange_connector.py
│   ├── data_processor.py
│   ├── aggregation_engine.py
│   ├── storage_manager.py
│   └── replay_manager.py
├── utils/                     # Utility functions
│   ├── __init__.py
│   ├── exceptions.py
│   ├── logging.py
│   ├── validation.py
│   └── timing.py
└── README.md
```
## 🚀 Features
- **Multi-Exchange Support**: Connect to 10+ major cryptocurrency exchanges
- **Real-Time Data**: High-frequency order book and trade data collection
- **Price Bucket Aggregation**: Configurable price buckets ($10 for BTC, $1 for ETH)
- **Heatmap Visualization**: Real-time market depth heatmaps
- **Historical Replay**: Replay past market events for model training
- **TimescaleDB Storage**: Optimized time-series data storage
- **Redis Caching**: High-performance data caching layer
- **Orchestrator Integration**: Compatible with existing trading systems
## 📊 Data Models
### Core Models
- **OrderBookSnapshot**: Standardized order book data
- **TradeEvent**: Individual trade events
- **PriceBuckets**: Aggregated price bucket data
- **HeatmapData**: Visualization-ready heatmap data
- **ConnectionStatus**: Exchange connection monitoring
- **ReplaySession**: Historical data replay management
### Key Features
- Automatic data validation and normalization
- Configurable price bucket sizes per symbol
- Real-time metrics calculation
- Cross-exchange data consolidation
- Quality scoring and anomaly detection
## ⚙️ Configuration
The system uses environment variables for configuration:
```bash
# Database settings
DB_HOST=192.168.0.10
DB_PORT=5432
DB_NAME=market_data
DB_USER=market_user
DB_PASSWORD=your_password
# Redis settings
REDIS_HOST=192.168.0.10
REDIS_PORT=6379
REDIS_PASSWORD=your_password
# Aggregation settings
BTC_BUCKET_SIZE=10.0
ETH_BUCKET_SIZE=1.0
HEATMAP_DEPTH=50
UPDATE_FREQUENCY=0.5
# Performance settings
DATA_BUFFER_SIZE=10000
BATCH_WRITE_SIZE=1000
MAX_MEMORY_USAGE=2048
```
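
As a rough illustration of how these variables could be consumed, the sketch below reads them into a small settings object. `Settings` and `load_settings` are hypothetical names, not the actual contents of `config.py`; `get_database_url` and `get_bucket_size` borrow their names from the Usage section, but their bodies here (the PostgreSQL-style URL and the prefix match on the base asset) are assumptions.

```python
import os
from dataclasses import dataclass, field


@dataclass
class Settings:
    """Hypothetical settings container; not the actual COBY config class."""
    db_host: str = "localhost"
    db_port: int = 5432
    db_name: str = "market_data"
    db_user: str = "market_user"
    db_password: str = ""
    bucket_sizes: dict = field(default_factory=dict)

    def get_database_url(self) -> str:
        # Assumed URL shape; TimescaleDB is reached over the PostgreSQL protocol.
        return (f"postgresql://{self.db_user}:{self.db_password}"
                f"@{self.db_host}:{self.db_port}/{self.db_name}")

    def get_bucket_size(self, symbol: str, default: float = 1.0) -> float:
        # Assumed lookup rule: match the base asset prefix, e.g. 'BTCUSDT' -> 'BTC'.
        for base, size in self.bucket_sizes.items():
            if symbol.upper().startswith(base):
                return size
        return default


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping of environment variables."""
    env = os.environ if env is None else env
    return Settings(
        db_host=env.get("DB_HOST", "localhost"),
        db_port=int(env.get("DB_PORT", "5432")),
        db_name=env.get("DB_NAME", "market_data"),
        db_user=env.get("DB_USER", "market_user"),
        db_password=env.get("DB_PASSWORD", ""),
        bucket_sizes={
            "BTC": float(env.get("BTC_BUCKET_SIZE", "10.0")),
            "ETH": float(env.get("ETH_BUCKET_SIZE", "1.0")),
        },
    )
```

Passing a plain dict instead of `os.environ` keeps the loader easy to test. A password containing characters such as `@` or `/` would need URL-encoding before being placed in the URL.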
## 🔌 Interfaces
### ExchangeConnector
Abstract base class for exchange WebSocket connectors with:
- Connection management with auto-reconnect
- Order book and trade subscriptions
- Data normalization callbacks
- Health monitoring
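
The auto-reconnect behaviour could look roughly like the sketch below. `ReconnectingConnector`, `_connect`, and `connect_with_retry` are hypothetical names chosen for illustration, and exponential backoff is an assumed retry policy; the real interface in `exchange_connector.py` may differ.

```python
import asyncio
from abc import ABC, abstractmethod


class ReconnectingConnector(ABC):
    """Hypothetical connector base class; method names are assumptions."""

    def __init__(self, name: str, max_retries: int = 5, base_delay: float = 1.0):
        self.name = name
        self.max_retries = max_retries
        self.base_delay = base_delay
        self.connected = False

    @abstractmethod
    async def _connect(self) -> None:
        """Open the exchange-specific connection; raise ConnectionError on failure."""

    async def connect_with_retry(self) -> bool:
        # Retry with exponential backoff: base_delay, 2x, 4x, ...
        for attempt in range(self.max_retries):
            try:
                await self._connect()
                self.connected = True
                return True
            except ConnectionError:
                await asyncio.sleep(self.base_delay * (2 ** attempt))
        self.connected = False
        return False
```

A concrete connector would implement `_connect` with its WebSocket client of choice and call `connect_with_retry` again whenever the stream drops.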
### DataProcessor
Interface for data processing and validation:
- Raw data normalization
- Quality validation
- Metrics calculation
- Anomaly detection
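
A minimal sketch of what normalization and one simple anomaly check might involve. The function names are hypothetical, and the input shape (pairs of price and size, possibly as strings) and the rule that a zero size marks a removed level are assumptions about the raw feed, not documented COBY behaviour.

```python
def normalize_levels(raw_levels, descending):
    """Convert [[price, size], ...] with string or numeric fields into sorted float tuples."""
    levels = []
    for price, size in raw_levels:
        price, size = float(price), float(size)
        if size > 0:  # assumed convention: zero size means the level was removed
            levels.append((price, size))
    return sorted(levels, key=lambda level: level[0], reverse=descending)


def is_crossed(bids, asks):
    """Flag a crossed book: best bid at or above best ask (both lists best-first)."""
    return bool(bids) and bool(asks) and bids[0][0] >= asks[0][0]
```

Bids are sorted descending and asks ascending so that index 0 is always the best price on each side.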
### AggregationEngine
Interface for data aggregation:
- Price bucket creation
- Heatmap generation
- Cross-exchange consolidation
- Imbalance calculations
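
To make price bucketing and imbalance concrete, here is a small sketch. Both function names are hypothetical, flooring every price to the lower edge of its bucket is one possible convention, and the imbalance formula `(bid - ask) / (bid + ask)` is an assumed definition rather than the one COBY necessarily uses.

```python
import math
from collections import defaultdict


def bucket_levels(levels, bucket_size):
    """Sum volume per price bucket; levels is an iterable of (price, volume)."""
    buckets = defaultdict(float)
    for price, volume in levels:
        # Floor each price to the lower edge of its bucket.
        lower_edge = math.floor(price / bucket_size) * bucket_size
        buckets[lower_edge] += volume
    return dict(buckets)


def volume_imbalance(bid_volume, ask_volume):
    """Return (bid - ask) / (bid + ask), in [-1, 1]; 0.0 when both are zero."""
    total = bid_volume + ask_volume
    return 0.0 if total == 0 else (bid_volume - ask_volume) / total
```

With a $10 bucket, bids at 50003.0 and 50009.5 land in the 50000 bucket, while a bid at 49998.0 lands in the 49990 bucket.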
### StorageManager
Interface for data persistence:
- TimescaleDB operations
- Batch processing
- Historical data retrieval
- Storage optimization
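
Batch processing can be sketched as a buffer that hands rows to a writer once a threshold is reached. `BatchBuffer` and its `write_batch` callback are hypothetical; in a real implementation the callback would issue a bulk insert into TimescaleDB, and the threshold would presumably come from `BATCH_WRITE_SIZE`.

```python
class BatchBuffer:
    """Hypothetical write buffer that flushes rows in fixed-size batches."""

    def __init__(self, write_batch, batch_size=1000):
        self._write_batch = write_batch  # callable taking a list of rows
        self.batch_size = batch_size
        self._rows = []

    def add(self, row):
        self._rows.append(row)
        if len(self._rows) >= self.batch_size:
            self.flush()

    def flush(self):
        # Swap the list out first so the callback never sees later additions.
        if self._rows:
            batch, self._rows = self._rows, []
            self._write_batch(batch)
```

Calling `flush()` on shutdown writes whatever partial batch is still buffered.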
### ReplayManager
Interface for historical data replay:
- Session management
- Configurable playback speeds
- Time-based seeking
- Real-time compatibility
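
Configurable playback speed comes down to scaling the gaps between recorded timestamps. The generator below is a sketch under that assumption; `replay_delays` is a hypothetical helper, not part of the `ReplayManager` interface.

```python
from datetime import datetime, timedelta, timezone


def replay_delays(timestamps, speed=1.0):
    """Yield the wall-clock seconds to wait before each event at the given speed."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    previous = None
    for ts in timestamps:
        gap = 0.0 if previous is None else (ts - previous).total_seconds()
        # Clamp out-of-order events to zero delay instead of a negative sleep.
        yield max(gap, 0.0) / speed
        previous = ts
```

At `speed=2.0`, events recorded 1 s and 2 s apart are emitted 0.5 s and 1.0 s apart.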
## 🛠️ Utilities
### Logging
- Structured logging with correlation IDs
- Configurable log levels and outputs
- Rotating file handlers
- Context-aware logging
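
One common way to attach a correlation ID to every record is a `contextvars.ContextVar` combined with a `logging.Filter`. The sketch below shows that pattern with the standard library only; `CorrelationIdFilter` and `build_logger` are hypothetical names, and this is not necessarily how `COBY.utils.logging` is implemented.

```python
import contextvars
import logging

# Holds the ID for the current execution context; "-" when none is set.
_correlation_id = contextvars.ContextVar("correlation_id", default="-")


def set_correlation_id(value: str) -> None:
    _correlation_id.set(value)


class CorrelationIdFilter(logging.Filter):
    """Copy the current correlation ID onto each record before formatting."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = _correlation_id.get()
        return True


def build_logger(name, stream):
    """Hypothetical helper: a logger whose output includes the correlation ID."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s [%(correlation_id)s] %(message)s"))
    handler.addFilter(CorrelationIdFilter())
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False
    return logger
```

Because the ID lives in a context variable, concurrent asyncio tasks each see their own value.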
### Validation
- Symbol format validation
- Price and volume validation
- Configuration validation
- Data quality checks
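
The checks above might be sketched as follows. The function names are hypothetical, and the symbol rule (5 to 20 upper-case letters or digits, such as `BTCUSDT`) is an assumed format chosen for illustration; the actual rules in `validation.py` are not documented here.

```python
import math
import re

# Assumed symbol format: 5-20 upper-case letters or digits, e.g. 'BTCUSDT'.
_SYMBOL_RE = re.compile(r"[A-Z0-9]{5,20}")


def is_valid_symbol(symbol) -> bool:
    return isinstance(symbol, str) and _SYMBOL_RE.fullmatch(symbol) is not None


def is_valid_level(price: float, volume: float) -> bool:
    # Reject NaN and infinity; price must be positive, volume may be zero.
    return (math.isfinite(price) and math.isfinite(volume)
            and price > 0 and volume >= 0)
```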
### Timing
- UTC timestamp handling
- Performance measurement
- Time-based operations
- Interval calculations
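
A sketch of two helpers of this kind, using only the standard library. `utc_now` and `measure` are hypothetical names, not the actual API of `timing.py`.

```python
import time
from contextlib import contextmanager
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Return the current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)


@contextmanager
def measure(results: dict, label: str):
    """Store the elapsed seconds of the enclosed block under results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even if the block raises.
        results[label] = time.perf_counter() - start
```

`time.perf_counter()` is used for durations because it is monotonic, while wall-clock time can jump when the system clock is adjusted.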
### Exceptions
- Custom exception hierarchy
- Error code management
- Detailed error context
- Structured error responses
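
An exception hierarchy with error codes and context could be shaped like this. The class names, the error code strings, and the `to_dict` method are all hypothetical and only illustrate the pattern; they are not taken from `exceptions.py`.

```python
class COBYError(Exception):
    """Hypothetical base exception carrying an error code and context."""

    def __init__(self, message, error_code="COBY_ERROR", context=None):
        super().__init__(message)
        self.message = message
        self.error_code = error_code
        self.context = context or {}

    def to_dict(self):
        # Structured form suitable for logs or API responses.
        return {
            "error": self.error_code,
            "message": self.message,
            "context": self.context,
        }


class ConnectionFailedError(COBYError):
    """Hypothetical subclass for exchange connection failures."""

    def __init__(self, message, context=None):
        super().__init__(message, "CONNECTION_FAILED", context)
```

Callers can catch `COBYError` to handle every subsystem error uniformly, or a subclass to react to one failure mode.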
## 🔧 Usage
### Basic Configuration
```python
from COBY.config import config
# Access configuration
db_url = config.get_database_url()
bucket_size = config.get_bucket_size('BTCUSDT')
```
### Data Models
```python
from datetime import datetime, timezone

from COBY.models import OrderBookSnapshot, PriceLevel

# Create order book snapshot
orderbook = OrderBookSnapshot(
    symbol='BTCUSDT',
    exchange='binance',
    timestamp=datetime.now(timezone.utc),
    bids=[PriceLevel(50000.0, 1.5)],
    asks=[PriceLevel(50100.0, 2.0)]
)

# Access calculated properties
mid_price = orderbook.mid_price
spread = orderbook.spread
```
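
For readers wondering what the calculated properties compute, the stand-in below mirrors the constructor calls above with plain dataclasses. It is an assumption-laden sketch, not the code in `models/core.py`: the `size` field name, the use of `max`/`min` to find the best prices, and returning `None` for a one-sided book are all guesses.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class PriceLevel:
    price: float
    size: float  # assumed field name for the quantity at this price


@dataclass
class OrderBookSnapshot:
    symbol: str
    exchange: str
    timestamp: datetime
    bids: List[PriceLevel] = field(default_factory=list)
    asks: List[PriceLevel] = field(default_factory=list)

    def _best_prices(self):
        # Best bid is the highest bid; best ask is the lowest ask.
        if not self.bids or not self.asks:
            return None
        return (max(level.price for level in self.bids),
                min(level.price for level in self.asks))

    @property
    def mid_price(self) -> Optional[float]:
        best = self._best_prices()
        return None if best is None else (best[0] + best[1]) / 2

    @property
    def spread(self) -> Optional[float]:
        best = self._best_prices()
        return None if best is None else best[1] - best[0]
```

For the snapshot above (best bid 50000.0, best ask 50100.0), this sketch gives a mid price of 50050.0 and a spread of 100.0.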
### Logging
```python
from COBY.utils import setup_logging, get_logger, set_correlation_id
# Setup logging
setup_logging(level='INFO', log_file='logs/coby.log')
# Get logger
logger = get_logger(__name__)
# Use correlation ID
set_correlation_id('req-123')
logger.info("Processing order book data")
```
## 🏃 Next Steps
This is the foundational structure for the COBY system. The next implementation tasks will build upon these interfaces and models to create:
1. TimescaleDB integration
2. Exchange connector implementations
3. Data processing engines
4. Aggregation algorithms
5. Web dashboard
6. API endpoints
7. Replay functionality
Each component will implement the defined interfaces, ensuring consistency and maintainability across the entire system.
## 📝 Development Guidelines
- All components must implement the defined interfaces
- Use the provided data models for consistency
- Follow the logging and error handling patterns
- Validate all input data using the utility functions
- Maintain backward compatibility with the orchestrator interface
- Write comprehensive tests for all functionality
## 🔍 Monitoring
The system provides comprehensive monitoring through:
- Structured logging with correlation IDs
- Performance metrics collection
- Health check endpoints
- Connection status monitoring
- Data quality indicators
- System resource tracking