gogo2/.kiro/steering/structure.md
2025-11-13 15:09:20 +02:00

# Project Structure & Architecture
## Module Organization
### core/ - Core Trading System
Central trading logic and data management.
**Key modules**:
- `orchestrator.py`: Decision coordination, combines CNN/RL predictions
- `data_provider.py`: Real market data fetching (Binance API)
- `data_models.py`: Shared data structures (OHLCV, features, predictions)
- `config.py`: Configuration management
- `trading_executor.py`: Order execution and position management
- `exchanges/`: Exchange-specific implementations (Binance, Bybit, Deribit, MEXC)
**Multi-horizon system**:
- `multi_horizon_prediction_manager.py`: Generates 1m/5m/15m/60m predictions
- `multi_horizon_trainer.py`: Deferred training once prediction outcomes are known
- `prediction_snapshot_storage.py`: Efficient prediction storage
**Training**:
- `extrema_trainer.py`: Trains on market extrema (pivots)
- `training_integration.py`: Training pipeline integration
- `overnight_training_coordinator.py`: Scheduled training sessions
### NN/ - Neural Network Models
Deep learning models for pattern recognition and trading decisions.
**models/**:
- `enhanced_cnn.py`: CNN for pattern recognition (100M params)
- `standardized_cnn.py`: Standardized CNN interface
- `advanced_transformer_trading.py`: Transformer for long-range dependencies
- `dqn_agent.py`: Deep Q-Network for RL trading
- `model_interfaces.py`: Abstract interfaces for all models
**training/**:
- Training pipelines for each model type
- Batch processing and optimization
**utils/**:
- `data_interface.py`: Connects models to real-time data
- Feature engineering and preprocessing
### COBY/ - Data Aggregation System
Multi-exchange order book and OHLCV data collection.
**Structure**:
- `main.py`: Entry point
- `config.py`: COBY-specific configuration
- `models/core.py`: Data models (OrderBookSnapshot, TradeEvent, PriceBuckets)
- `interfaces/`: Abstract interfaces for connectors, processors, storage
- `api/rest_api.py`: FastAPI REST endpoints
- `web/static/`: Dashboard UI (http://localhost:8080)
- `connectors/`: Exchange WebSocket connectors
- `storage/`: TimescaleDB/Redis integration
- `monitoring/`: System monitoring and metrics
### ANNOTATE/ - Manual Annotation UI
Web interface for marking profitable trades on historical data.
**Structure**:
- `web/app.py`: Flask/Dash application
- `web/templates/`: Jinja2 HTML templates
- `core/annotation_manager.py`: Annotation storage and retrieval
- `core/training_simulator.py`: Simulates training with annotations
- `core/data_loader.py`: Historical data loading
- `data/annotations/`: Saved annotations
- `data/test_cases/`: Generated training test cases
### web/ - Main Dashboard
Real-time monitoring and visualization.
**Key files**:
- `clean_dashboard.py`: Main dashboard application
- `cob_realtime_dashboard.py`: COB-specific dashboard
- `component_manager.py`: UI component management
- `layout_manager.py`: Dashboard layout
- `models_training_panel.py`: Training controls
- `prediction_chart.py`: Prediction visualization
### models/ - Model Checkpoints
Trained model weights and checkpoints.
**Organization**:
- `cnn/`: CNN model checkpoints
- `rl/`: RL model checkpoints
- `enhanced_cnn/`: Enhanced CNN variants
- `enhanced_rl/`: Enhanced RL variants
- `best_models/`: Best performing models
- `checkpoints/`: Training checkpoints
### utils/ - Shared Utilities
Common functionality across modules.
**Key utilities**:
- `checkpoint_manager.py`: Model checkpoint save/load
- `cache_manager.py`: Data caching
- `database_manager.py`: SQLite database operations
- `inference_logger.py`: Prediction logging
- `timezone_utils.py`: Timezone handling
- `training_integration.py`: Training pipeline utilities
### data/ - Data Storage
Databases and cached data.
**Contents**:
- `predictions.db`: SQLite prediction database
- `trading_system.db`: Trading metadata
- `cache/`: Cached market data
- `prediction_snapshots/`: Stored predictions for training
- `text_exports/`: Exported data for analysis
### cache/ - Data Caching
High-performance data caching.
**Contents**:
- `trading_data.duckdb`: DuckDB time-series storage
- `parquet_store/`: Parquet files for efficient storage
- `monthly_1s_data/`: Monthly 1-second data cache
- `pivot_bounds/`: Cached pivot calculations
### @checkpoints/ - Checkpoint Archive
Archived model checkpoints organized by type.
**Organization**:
- `cnn/`, `dqn/`, `hybrid/`, `rl/`, `transformer/`: By model type
- `best_models/`: Best performers
- `archive/`: Historical checkpoints
## Architecture Patterns
### Data Flow
```
Exchange APIs → DataProvider → Orchestrator → Models (CNN/RL/Transformer)
                                    ↓
                            Trading Executor → Exchange APIs
```
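A minimal sketch of the decision step in this flow, assuming each model returns an action with a confidence and the orchestrator sums confidence per action. `Prediction`, `momentum_model`, `decide`, and `execute` are hypothetical names for illustration, not the project's API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Prediction:
    """Hypothetical model output: a trade action plus a confidence in [0, 1]."""
    action: str  # "BUY", "SELL", or "HOLD"
    confidence: float


# A "model" here is any callable mapping closing prices to a Prediction.
Model = Callable[[List[float]], Prediction]


def momentum_model(closes: List[float]) -> Prediction:
    """Toy stand-in for a CNN/RL model: follow the last price move."""
    if len(closes) < 2 or closes[-1] == closes[-2]:
        return Prediction("HOLD", 0.0)
    change = (closes[-1] - closes[-2]) / closes[-2]
    action = "BUY" if change > 0 else "SELL"
    return Prediction(action, min(abs(change) * 100, 1.0))


def decide(models: Dict[str, Model], closes: List[float]) -> Prediction:
    """Orchestrator step: sum confidence per action and pick the strongest."""
    # HOLD is listed first so that an all-zero tie resolves to HOLD.
    totals: Dict[str, float] = {"HOLD": 0.0, "BUY": 0.0, "SELL": 0.0}
    for model in models.values():
        prediction = model(closes)
        totals[prediction.action] += prediction.confidence
    best = max(totals, key=totals.get)
    return Prediction(best, totals[best] / max(len(models), 1))


def execute(decision: Prediction, threshold: float = 0.5) -> str:
    """Executor step: only act when confidence clears the threshold."""
    if decision.action == "HOLD" or decision.confidence < threshold:
        return "no order"
    return f"{decision.action} order submitted"
```

Passing the models in as a dictionary keeps the decision step independent of any concrete model class, which matches the interface-based approach described under Circular Dependency Prevention.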
### Training Flow
```
Real Market Data → Feature Engineering → Model Training → Checkpoint Save
                                              ↓
                                    Validation & Metrics
```
### Multi-Horizon Flow
```
Orchestrator → PredictionManager → Generate predictions (1m/5m/15m/60m)
                     ↓
              SnapshotStorage
                     ↓
        Wait for target time (deferred)
                     ↓
        MultiHorizonTrainer → Train models
```
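The deferred step can be sketched as follows: predictions are stored with a target time, and only snapshots whose target time has passed are turned into training samples. This is an illustration under assumptions; `Snapshot`, `SnapshotStore`, and `build_training_samples` are hypothetical names, and an in-memory list stands in for the real snapshot storage.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Tuple


@dataclass
class Snapshot:
    """Hypothetical stored prediction awaiting its outcome."""
    horizon_min: int
    predicted_price: float
    target_time: datetime


class SnapshotStore:
    """In-memory stand-in for prediction snapshot storage."""

    def __init__(self) -> None:
        self._pending: List[Snapshot] = []

    def add_predictions(self, now: datetime, predictions: Dict[int, float]) -> None:
        """Store one snapshot per horizon (in minutes) with its target time."""
        for horizon, price in predictions.items():
            target = now + timedelta(minutes=horizon)
            self._pending.append(Snapshot(horizon, price, target))

    def pop_due(self, now: datetime) -> List[Snapshot]:
        """Return snapshots whose target time has passed and remove them."""
        due = [s for s in self._pending if s.target_time <= now]
        self._pending = [s for s in self._pending if s.target_time > now]
        return due


def build_training_samples(
    due: List[Snapshot], actual_price: float
) -> List[Tuple[int, float]]:
    """Pair each due snapshot with its signed error (actual - predicted)."""
    return [(s.horizon_min, actual_price - s.predicted_price) for s in due]
```

Five minutes after predictions for 1m/5m/15m/60m are stored, `pop_due` returns only the 1m and 5m snapshots; the 15m and 60m ones stay pending until their own target times.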
### COBY Data Flow
```
Exchange WebSockets → Connectors → DataProcessor → AggregationEngine
                                                         ↓
                                                  StorageManager
                                                         ↓
                                                TimescaleDB + Redis
```
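As an illustration of what the aggregation stage might do, the sketch below sums order-book levels into fixed-width price buckets. The fields of the real `PriceBuckets` model are not shown in this document, so the `bucket_levels` function and its `(price, size)` tuple layout are assumptions.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Dict, Iterable, Tuple

Level = Tuple[Decimal, Decimal]  # (price, size)


def bucket_levels(
    levels: Iterable[Level], bucket_size: Decimal
) -> Dict[Decimal, Decimal]:
    """Sum order-book sizes into fixed-width price buckets.

    Each positive price is rounded down to a multiple of bucket_size.
    """
    buckets: Dict[Decimal, Decimal] = defaultdict(Decimal)
    for price, size in levels:
        bucket = (price // bucket_size) * bucket_size
        buckets[bucket] += size
    return dict(buckets)
```

`Decimal` is used instead of `float` so that bucket boundaries are exact; levels from several exchanges can be concatenated into one iterable before bucketing.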
## Dependency Patterns
### Core Dependencies
- `orchestrator.py` depends on: all models, data_provider, trading_executor
- `data_provider.py` depends on: cache_manager, timezone_utils
- Models depend on: data_models, checkpoint_manager
### Dashboard Dependencies
- `clean_dashboard.py` depends on: orchestrator, data_provider, all models
- Uses component_manager and layout_manager for UI
### Circular Dependency Prevention
- Use abstract interfaces (model_interfaces.py)
- Dependency injection for orchestrator
- Lazy imports where needed
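
The three techniques above can be sketched together. This is a minimal illustration, not the contents of `model_interfaces.py`: `ModelInterface`, `MeanSignModel`, and the `Orchestrator` methods shown here are hypothetical, and the standard-library `json` module stands in for a project module that would otherwise create an import cycle.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class ModelInterface(ABC):
    """Hypothetical abstract interface that every model implements."""

    @abstractmethod
    def predict(self, features: Sequence[float]) -> float:
        """Return a score in [-1, 1]; positive means bullish."""


class MeanSignModel(ModelInterface):
    """Toy model used only to make the sketch runnable."""

    def predict(self, features: Sequence[float]) -> float:
        mean = sum(features) / len(features)
        return 1.0 if mean > 0 else -1.0 if mean < 0 else 0.0


class Orchestrator:
    """Receives its models through the constructor (dependency injection)."""

    def __init__(self, models: List[ModelInterface]) -> None:
        self._models = models

    def combined_score(self, features: Sequence[float]) -> float:
        """Average the scores of all injected models."""
        scores = [m.predict(features) for m in self._models]
        return sum(scores) / len(scores)

    def export_scores(self, features: Sequence[float]) -> str:
        # Lazy import: the module is loaded on first call rather than when
        # this file is imported, which is what breaks an import cycle.
        import json

        return json.dumps({"score": self.combined_score(features)})
```

Because `Orchestrator` only knows about `ModelInterface`, model modules never need to import the orchestrator, and the orchestrator never needs to import a concrete model.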
## Configuration Hierarchy
1. **config.yaml**: Main system config (exchanges, symbols, trading params)
2. **models.yml**: Model-specific settings (architecture, training)
3. **.env**: Sensitive credentials (API keys, passwords)
4. Module-specific configs in each subsystem (COBY/config.py, etc.)
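
One way to combine these layers at startup is sketched below. Plain dictionaries stand in for the parsed `config.yaml` and `models.yml` (YAML parsing is omitted), and the `models` key, the `credentials` key, and the `EXCHANGE_API_KEY` variable name are assumptions made for the example.

```python
import os
from typing import Any, Dict, Mapping, Optional


def deep_merge(base: Mapping[str, Any], override: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a new dict where values in override win; nested dicts merge."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), Mapping):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_settings(
    main_cfg: Mapping[str, Any],
    model_cfg: Mapping[str, Any],
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, Any]:
    """Layer model settings over the main config and attach credentials.

    Credentials are read from the environment (populated from .env) so that
    they never live in the YAML files.
    """
    env = os.environ if env is None else env
    settings = deep_merge(main_cfg, {"models": model_cfg})
    settings["credentials"] = {"api_key": env.get("EXCHANGE_API_KEY", "")}
    return settings
```

Accepting `env` as a parameter makes the function easy to test without touching the real process environment.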
## Naming Conventions
### Files
- snake_case for Python files: `data_provider.py`
- Descriptive names: `multi_horizon_prediction_manager.py`
### Classes
- PascalCase: `DataProvider`, `MultiHorizonTrainer`
- Descriptive: `PredictionSnapshotStorage`
### Functions
- snake_case: `get_ohlcv_data()`, `train_model()`
- Verb-noun pattern: `calculate_features()`, `save_checkpoint()`
### Variables
- snake_case: `prediction_data`, `model_output`
- Descriptive: `cnn_confidence_threshold`
## Import Patterns
### Absolute imports preferred
```python
from core.data_provider import DataProvider
from NN.models.enhanced_cnn import EnhancedCNN
```
### Relative imports for same package
```python
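# Sibling module in the same package
from .data_models import OHLCV
# Parent-package import: valid only when this module's package is itself nested inside another package
from ..utils import checkpoint_manager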
```
## Testing Structure
- Unit tests in `tests/` directory
- Integration tests: `test_integration.py`
- Component-specific tests: `test_cnn_only.py`, `test_training.py`
- Use pytest framework
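
A test file following these conventions might look like the sketch below. The file name and `calculate_returns` are hypothetical; the function is defined inline only to keep the example self-contained, whereas a real test would import the code under test from the project.

```python
# tests/test_features.py (hypothetical file name)
from typing import List


def calculate_returns(closes: List[float]) -> List[float]:
    """Period-over-period returns; stands in for project code under test."""
    return [(b - a) / a for a, b in zip(closes, closes[1:])]


def test_calculate_returns_basic() -> None:
    # 100 -> 110 is +10%, 110 -> 55 is -50%
    assert calculate_returns([100.0, 110.0, 55.0]) == [0.1, -0.5]


def test_calculate_returns_needs_two_prices() -> None:
    # A single price has no previous value to compare against
    assert calculate_returns([100.0]) == []
```

With default settings, pytest collects files named `test_*.py` and functions prefixed with `test_`, so running `pytest tests/` would pick up both tests.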
## Documentation
- Module-level docstrings in each file
- README.md in major subsystems (COBY/, NN/, ANNOTATE/)
- Architecture docs in root: `COB_MODEL_ARCHITECTURE_DOCUMENTATION.md`, `MULTI_HORIZON_TRAINING_SYSTEM.md`
- Implementation summaries: `IMPLEMENTATION_SUMMARY.md`, `TRAINING_IMPROVEMENTS_SUMMARY.md`