COBY : specs + task 1
# COBY - Multi-Exchange Data Aggregation System

COBY (Cryptocurrency Order Book Yielder) is a comprehensive data collection and aggregation subsystem designed to serve as the foundational data layer for trading systems. It collects real-time order book and OHLCV data from multiple cryptocurrency exchanges, aggregates it into standardized formats, and provides both live data feeds and historical replay capabilities.

## 🏗️ Architecture

The system follows a modular architecture with clear separation of concerns:

```
COBY/
├── config.py                # Configuration management
├── models/                  # Data models and structures
│   ├── __init__.py
│   └── core.py              # Core data models
├── interfaces/              # Abstract interfaces
│   ├── __init__.py
│   ├── exchange_connector.py
│   ├── data_processor.py
│   ├── aggregation_engine.py
│   ├── storage_manager.py
│   └── replay_manager.py
├── utils/                   # Utility functions
│   ├── __init__.py
│   ├── exceptions.py
│   ├── logging.py
│   ├── validation.py
│   └── timing.py
└── README.md
```
## 🚀 Features

- **Multi-Exchange Support**: Connect to 10+ major cryptocurrency exchanges
- **Real-Time Data**: High-frequency order book and trade data collection
- **Price Bucket Aggregation**: Configurable price bucket sizes ($10 for BTC, $1 for ETH)
- **Heatmap Visualization**: Real-time market depth heatmaps
- **Historical Replay**: Replay past market events for model training
- **TimescaleDB Storage**: Optimized time-series data storage
- **Redis Caching**: High-performance data caching layer
- **Orchestrator Integration**: Compatible with existing trading systems
## 📊 Data Models

### Core Models

- **OrderBookSnapshot**: Standardized order book data
- **TradeEvent**: Individual trade events
- **PriceBuckets**: Aggregated price bucket data
- **HeatmapData**: Visualization-ready heatmap data
- **ConnectionStatus**: Exchange connection monitoring
- **ReplaySession**: Historical data replay management
### Key Features

- Automatic data validation and normalization
- Configurable price bucket sizes per symbol
- Real-time metrics calculation
- Cross-exchange data consolidation
- Quality scoring and anomaly detection
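To make the price-bucket idea above concrete, here is a minimal sketch of snapping prices into configurable buckets and summing volume per bucket. The bucket sizes mirror the README's examples ($10 for BTC, $1 for ETH); the function names are illustrative assumptions, not the actual COBY API.

```python
from collections import defaultdict

# Assumed per-symbol bucket sizes, matching the examples in this README
BUCKET_SIZES = {"BTCUSDT": 10.0, "ETHUSDT": 1.0}

def bucket_price(price: float, bucket_size: float) -> float:
    """Snap a price to the lower edge of its bucket."""
    return (price // bucket_size) * bucket_size

def aggregate_levels(levels, bucket_size):
    """Sum volume per price bucket from (price, size) pairs."""
    buckets = defaultdict(float)
    for price, size in levels:
        buckets[bucket_price(price, bucket_size)] += size
    return dict(buckets)

bids = [(50007.5, 1.5), (50003.0, 0.5), (49998.0, 2.0)]
print(aggregate_levels(bids, BUCKET_SIZES["BTCUSDT"]))
# → {50000.0: 2.0, 49990.0: 2.0}
```

Consolidating books from multiple exchanges then reduces to merging these bucket dictionaries, which is what makes fixed bucket edges convenient for cross-exchange aggregation.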
## ⚙️ Configuration

The system uses environment variables for configuration:

```bash
# Database settings
DB_HOST=192.168.0.10
DB_PORT=5432
DB_NAME=market_data
DB_USER=market_user
DB_PASSWORD=your_password

# Redis settings
REDIS_HOST=192.168.0.10
REDIS_PORT=6379
REDIS_PASSWORD=your_password

# Aggregation settings
BTC_BUCKET_SIZE=10.0
ETH_BUCKET_SIZE=1.0
HEATMAP_DEPTH=50
UPDATE_FREQUENCY=0.5

# Performance settings
DATA_BUFFER_SIZE=10000
BATCH_WRITE_SIZE=1000
MAX_MEMORY_USAGE=2048
```
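One way such variables could be loaded is sketched below. This is illustrative only: the real `COBY/config.py` may be structured differently; the class and defaults here are assumptions chosen to mirror the settings above.

```python
import os
from dataclasses import dataclass

@dataclass
class DatabaseConfig:
    """Illustrative sketch; the actual config module may differ."""
    host: str
    port: int
    name: str
    user: str
    password: str

    @classmethod
    def from_env(cls) -> "DatabaseConfig":
        # Fall back to sensible defaults when a variable is unset
        return cls(
            host=os.getenv("DB_HOST", "localhost"),
            port=int(os.getenv("DB_PORT", "5432")),
            name=os.getenv("DB_NAME", "market_data"),
            user=os.getenv("DB_USER", "market_user"),
            password=os.getenv("DB_PASSWORD", ""),
        )

    def url(self) -> str:
        """PostgreSQL/TimescaleDB connection URL."""
        return (f"postgresql://{self.user}:{self.password}"
                f"@{self.host}:{self.port}/{self.name}")
```

Centralizing environment parsing in one place like this keeps type conversion (`int(...)`, `float(...)`) and defaults out of the rest of the codebase.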
## 🔌 Interfaces

### ExchangeConnector
Abstract base class for exchange WebSocket connectors with:
- Connection management with auto-reconnect
- Order book and trade subscriptions
- Data normalization callbacks
- Health monitoring

### DataProcessor
Interface for data processing and validation:
- Raw data normalization
- Quality validation
- Metrics calculation
- Anomaly detection

### AggregationEngine
Interface for data aggregation:
- Price bucket creation
- Heatmap generation
- Cross-exchange consolidation
- Imbalance calculations

### StorageManager
Interface for data persistence:
- TimescaleDB operations
- Batch processing
- Historical data retrieval
- Storage optimization

### ReplayManager
Interface for historical data replay:
- Session management
- Configurable playback speeds
- Time-based seeking
- Real-time compatibility
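The connector contract described above could look roughly like this. Method names are assumptions for illustration; the actual interface lives in `interfaces/exchange_connector.py`.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable

class ExchangeConnector(ABC):
    """Sketch of the abstract connector contract."""

    def __init__(self, exchange_name: str):
        self.exchange_name = exchange_name
        self.connected = False
        self._callbacks: list = []

    @abstractmethod
    async def connect(self) -> None:
        """Open the WebSocket connection (with auto-reconnect)."""

    @abstractmethod
    async def subscribe_orderbook(self, symbol: str) -> None:
        """Subscribe to order book updates for a symbol."""

    def register_callback(self, callback: Callable[[Any], None]) -> None:
        """Register a consumer for normalized data events."""
        self._callbacks.append(callback)

    def _emit(self, data: Any) -> None:
        """Fan normalized data out to all registered consumers."""
        for callback in self._callbacks:
            callback(data)
```

Keeping normalization callbacks in the base class means each exchange-specific subclass only has to translate its wire format and call `_emit`.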
## 🛠️ Utilities

### Logging
- Structured logging with correlation IDs
- Configurable log levels and outputs
- Rotating file handlers
- Context-aware logging

### Validation
- Symbol format validation
- Price and volume validation
- Configuration validation
- Data quality checks

### Timing
- UTC timestamp handling
- Performance measurement
- Time-based operations
- Interval calculations

### Exceptions
- Custom exception hierarchy
- Error code management
- Detailed error context
- Structured error responses
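Correlation-ID logging, as listed under Logging above, can be sketched with `contextvars` and a logging filter. This is one possible approach, not necessarily how `COBY/utils/logging.py` implements it.

```python
import contextvars
import logging

# Context-local correlation ID, safe across async tasks and threads
_correlation_id = contextvars.ContextVar("correlation_id", default="-")

def set_correlation_id(value: str) -> None:
    _correlation_id.set(value)

class CorrelationFilter(logging.Filter):
    """Attach the current correlation ID to every log record."""
    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = _correlation_id.get()
        return True

handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter(
    "%(asctime)s [%(correlation_id)s] %(levelname)s %(message)s"))
handler.addFilter(CorrelationFilter())

logger = logging.getLogger("coby")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

set_correlation_id("req-123")
logger.info("Processing order book data")  # line is tagged with req-123
```

Because the ID lives in a `ContextVar` rather than a global, concurrent requests each see their own value.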
## 🔧 Usage

### Basic Configuration

```python
from COBY.config import config

# Access configuration
db_url = config.get_database_url()
bucket_size = config.get_bucket_size('BTCUSDT')
```
### Data Models

### Data Models

```python
from datetime import datetime, timezone

from COBY.models import OrderBookSnapshot, PriceLevel

# Create an order book snapshot
orderbook = OrderBookSnapshot(
    symbol='BTCUSDT',
    exchange='binance',
    timestamp=datetime.now(timezone.utc),
    bids=[PriceLevel(50000.0, 1.5)],
    asks=[PriceLevel(50100.0, 2.0)]
)

# Access calculated properties
mid_price = orderbook.mid_price
spread = orderbook.spread
```
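For readers without the package on hand, a self-contained sketch of how those calculated properties could be defined follows. The real models in `COBY/models/core.py` may differ in detail.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List

@dataclass
class PriceLevel:
    price: float
    size: float

@dataclass
class OrderBookSnapshot:
    symbol: str
    exchange: str
    timestamp: datetime
    bids: List[PriceLevel]   # best (highest) bid first
    asks: List[PriceLevel]   # best (lowest) ask first

    @property
    def mid_price(self) -> float:
        """Midpoint between best bid and best ask."""
        return (self.bids[0].price + self.asks[0].price) / 2

    @property
    def spread(self) -> float:
        """Best ask minus best bid."""
        return self.asks[0].price - self.bids[0].price

ob = OrderBookSnapshot("BTCUSDT", "binance", datetime.now(timezone.utc),
                       bids=[PriceLevel(50000.0, 1.5)],
                       asks=[PriceLevel(50100.0, 2.0)])
print(ob.mid_price, ob.spread)  # → 50050.0 100.0
```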
### Logging

### Logging

```python
from COBY.utils import setup_logging, get_logger, set_correlation_id

# Set up logging
setup_logging(level='INFO', log_file='logs/coby.log')

# Get a logger
logger = get_logger(__name__)

# Use a correlation ID
set_correlation_id('req-123')
logger.info("Processing order book data")
```
## 🏃 Next Steps

This is the foundational structure for the COBY system. The next implementation tasks will build on these interfaces and models to create:

1. TimescaleDB integration
2. Exchange connector implementations
3. Data processing engines
4. Aggregation algorithms
5. Web dashboard
6. API endpoints
7. Replay functionality

Each component will implement the defined interfaces, ensuring consistency and maintainability across the entire system.
## 📝 Development Guidelines

- All components must implement the defined interfaces
- Use the provided data models for consistency
- Follow the logging and error handling patterns
- Validate all input data using the utility functions
- Maintain backward compatibility with the orchestrator interface
- Write comprehensive tests for all functionality
## 🔍 Monitoring

The system provides comprehensive monitoring through:
- Structured logging with correlation IDs
- Performance metrics collection
- Health check endpoints
- Connection status monitoring
- Data quality indicators
- System resource tracking
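As one example of connection status monitoring, a staleness check could look like this. Field and method names are assumptions for illustration, not the actual `ConnectionStatus` model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class ConnectionStatus:
    """Sketch of a per-exchange connection health record."""
    exchange: str
    connected: bool = False
    last_message_at: Optional[datetime] = None

    def is_stale(self, max_silence_seconds: float = 30.0) -> bool:
        """True if the feed is down or has gone quiet for too long."""
        if not self.connected or self.last_message_at is None:
            return True
        age = (datetime.now(timezone.utc) - self.last_message_at).total_seconds()
        return age > max_silence_seconds

status = ConnectionStatus("binance")
print(status.is_stale())  # → True (never connected)
```

A health check endpoint can then aggregate `is_stale()` across all connectors to report overall system health.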