# Project Structure & Architecture

## Module Organization

### core/ - Core Trading System

Central trading logic and data management.

**Key modules**:
- `orchestrator.py`: Decision coordination, combines CNN/RL predictions
- `data_provider.py`: Real market data fetching (Binance API)
- `data_models.py`: Shared data structures (OHLCV, features, predictions)
- `config.py`: Configuration management
- `trading_executor.py`: Order execution and position management
- `exchanges/`: Exchange-specific implementations (Binance, Bybit, Deribit, MEXC)

**Multi-horizon system**:
- `multi_horizon_prediction_manager.py`: Generates 1m/5m/15m/60m predictions
- `multi_horizon_trainer.py`: Deferred training when outcomes known
- `prediction_snapshot_storage.py`: Efficient prediction storage

**Training**:
- `extrema_trainer.py`: Trains on market extrema (pivots)
- `training_integration.py`: Training pipeline integration
- `overnight_training_coordinator.py`: Scheduled training sessions

### NN/ - Neural Network Models

Deep learning models for pattern recognition and trading decisions.

**models/**:
- `enhanced_cnn.py`: CNN for pattern recognition (100M params)
- `standardized_cnn.py`: Standardized CNN interface
- `advanced_transformer_trading.py`: Transformer for long-range dependencies
- `dqn_agent.py`: Deep Q-Network for RL trading
- `model_interfaces.py`: Abstract interfaces for all models

**training/**:
- Training pipelines for each model type
- Batch processing and optimization

**utils/**:
- `data_interface.py`: Connects to realtime data
- Feature engineering and preprocessing

### COBY/ - Data Aggregation System

Multi-exchange order book and OHLCV data collection.

**Structure**:
- `main.py`: Entry point
- `config.py`: COBY-specific configuration
- `models/core.py`: Data models (OrderBookSnapshot, TradeEvent, PriceBuckets)
- `interfaces/`: Abstract interfaces for connectors, processors, storage
- `api/rest_api.py`: FastAPI REST endpoints
- `web/static/`: Dashboard UI (http://localhost:8080)
- `connectors/`: Exchange WebSocket connectors
- `storage/`: TimescaleDB/Redis integration
- `monitoring/`: System monitoring and metrics

### ANNOTATE/ - Manual Annotation UI

Web interface for marking profitable trades on historical data.

**Structure**:
- `web/app.py`: Flask/Dash application
- `web/templates/`: Jinja2 HTML templates
- `core/annotation_manager.py`: Annotation storage and retrieval
- `core/training_simulator.py`: Simulates training with annotations
- `core/data_loader.py`: Historical data loading
- `data/annotations/`: Saved annotations
- `data/test_cases/`: Generated training test cases

### web/ - Main Dashboard

Real-time monitoring and visualization.

**Key files**:
- `clean_dashboard.py`: Main dashboard application
- `cob_realtime_dashboard.py`: COB-specific dashboard
- `component_manager.py`: UI component management
- `layout_manager.py`: Dashboard layout
- `models_training_panel.py`: Training controls
- `prediction_chart.py`: Prediction visualization

### models/ - Model Checkpoints

Trained model weights and checkpoints.

**Organization**:
- `cnn/`: CNN model checkpoints
- `rl/`: RL model checkpoints
- `enhanced_cnn/`: Enhanced CNN variants
- `enhanced_rl/`: Enhanced RL variants
- `best_models/`: Best performing models
- `checkpoints/`: Training checkpoints

### utils/ - Shared Utilities

Common functionality across modules.

**Key utilities**:
- `checkpoint_manager.py`: Model checkpoint save/load
- `cache_manager.py`: Data caching
- `database_manager.py`: SQLite database operations
- `inference_logger.py`: Prediction logging
- `timezone_utils.py`: Timezone handling
- `training_integration.py`: Training pipeline utilities

### data/ - Data Storage

Databases and cached data.

**Contents**:
- `predictions.db`: SQLite prediction database
- `trading_system.db`: Trading metadata
- `cache/`: Cached market data
- `prediction_snapshots/`: Stored predictions for training
- `text_exports/`: Exported data for analysis

### cache/ - Data Caching

High-performance data caching.

**Contents**:
- `trading_data.duckdb`: DuckDB time-series storage
- `parquet_store/`: Parquet files for efficient storage
- `monthly_1s_data/`: Monthly 1-second data cache
- `pivot_bounds/`: Cached pivot calculations

### @checkpoints/ - Checkpoint Archive

Archived model checkpoints organized by type.

**Organization**:
- `cnn/`, `dqn/`, `hybrid/`, `rl/`, `transformer/`: By model type
- `best_models/`: Best performers
- `archive/`: Historical checkpoints

## Architecture Patterns

### Data Flow

```
Exchange APIs → DataProvider → Orchestrator → Models (CNN/RL/Transformer)
                                    ↓
                            Trading Executor → Exchange APIs
```

### Training Flow

```
Real Market Data → Feature Engineering → Model Training → Checkpoint Save
                                              ↓
                                    Validation & Metrics
```

### Multi-Horizon Flow

```
Orchestrator → PredictionManager → Generate predictions (1m/5m/15m/60m)
                                        ↓
                                  SnapshotStorage
                                        ↓
                          Wait for target time (deferred)
                                        ↓
                          MultiHorizonTrainer → Train models
```

### COBY Data Flow

```
Exchange WebSockets → Connectors → DataProcessor → AggregationEngine
                                                        ↓
                                                  StorageManager
                                                        ↓
                                                TimescaleDB + Redis
```

## Dependency Patterns

### Core Dependencies

- `orchestrator.py` depends on: all models, data_provider, trading_executor
- `data_provider.py` depends on: cache_manager, timezone_utils
- Models depend on: data_models, checkpoint_manager

### Dashboard Dependencies

- `clean_dashboard.py` depends on: orchestrator, data_provider, all models
- Uses component_manager and layout_manager for UI

### Circular Dependency Prevention

- Use abstract interfaces (model_interfaces.py)
- Dependency injection for orchestrator
- Lazy imports where needed

## Configuration Hierarchy

1. **config.yaml**: Main system config (exchanges, symbols, trading params)
2. **models.yml**: Model-specific settings (architecture, training)
3. **.env**: Sensitive credentials (API keys, passwords)
4. Module-specific configs in each subsystem (COBY/config.py, etc.)

## Naming Conventions

### Files

- Snake_case for Python files: `data_provider.py`
- Descriptive names: `multi_horizon_prediction_manager.py`

### Classes

- PascalCase: `DataProvider`, `MultiHorizonTrainer`
- Descriptive: `PredictionSnapshotStorage`

### Functions

- Snake_case: `get_ohlcv_data()`, `train_model()`
- Verb-noun pattern: `calculate_features()`, `save_checkpoint()`

### Variables

- Snake_case: `prediction_data`, `model_output`
- Descriptive: `cnn_confidence_threshold`

## Import Patterns

### Absolute imports preferred

```python
from core.data_provider import DataProvider
from NN.models.enhanced_cnn import EnhancedCNN
```

### Relative imports for same package

```python
from .data_models import OHLCV
from ..utils import checkpoint_manager
```

## Testing Structure

- Unit tests in `tests/` directory
- Integration tests: `test_integration.py`
- Component-specific tests: `test_cnn_only.py`, `test_training.py`
- Use pytest framework

## Documentation

- Module-level docstrings in each file
- README.md in major subsystems (COBY/, NN/, ANNOTATE/)
- Architecture docs in root: `COB_MODEL_ARCHITECTURE_DOCUMENTATION.md`, `MULTI_HORIZON_TRAINING_SYSTEM.md`
- Implementation summaries: `IMPLEMENTATION_SUMMARY.md`, `TRAINING_IMPROVEMENTS_SUMMARY.md`
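
## Example: Circular Dependency Prevention

The three techniques listed under Circular Dependency Prevention (abstract interfaces, dependency injection for the orchestrator, lazy imports) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `ModelInterface`, `ConstantModel`, the `decide()` method, and the averaging rule are all hypothetical names and choices made for this sketch, and the standard-library `statistics` module stands in for a project module that would be imported lazily.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ModelInterface(ABC):
    """Hypothetical abstract interface (the real ones live in model_interfaces.py)."""

    @abstractmethod
    def predict(self, features: List[float]) -> float:
        """Return a signed score: positive favors BUY, negative favors SELL."""


class ConstantModel(ModelInterface):
    """Stand-in model used only for this sketch; always returns a fixed score."""

    def __init__(self, score: float) -> None:
        self.score = score

    def predict(self, features: List[float]) -> float:
        return self.score


class Orchestrator:
    """Depends only on ModelInterface, never on concrete model modules."""

    def __init__(self, models: Dict[str, ModelInterface]) -> None:
        # Dependency injection: the caller builds the models and passes them in,
        # so this module does not need to import any concrete model class.
        self.models = models

    def decide(self, features: List[float]) -> str:
        # Lazy import: resolved when decide() runs, not when this module loads.
        # `statistics` is a stand-in for a module that might import this one.
        from statistics import mean

        # Combining by plain average is an assumption made for this sketch.
        avg = mean(m.predict(features) for m in self.models.values())
        if avg > 0:
            return "BUY"
        if avg < 0:
            return "SELL"
        return "HOLD"


orchestrator = Orchestrator({"cnn": ConstantModel(0.6), "rl": ConstantModel(-0.2)})
decision = orchestrator.decide([1.0, 2.0, 3.0])
print(decision)  # BUY
```

Because `Orchestrator` only sees the abstract interface, a concrete model module can import orchestrator-side types without creating an import cycle, and tests can inject lightweight stand-ins such as `ConstantModel`.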