gogo2/CLEANUP_PLAN.md

# Project Cleanup & Reorganization Plan

## Current Issues
1. **Code Duplication**: Multiple CNN models, RL agents, and training scripts do similar things
2. **Missing Methods**: Core functionality such as `run()` and `start_websocket()` is missing from classes
3. **Unclear Architecture**: No clean separation between components
4. **Hard to Maintain**: Scattered implementations make changes difficult

## New Clean Architecture

```
gogo2/
├── core/                           # Core system components
│   ├── __init__.py
│   ├── data_provider.py           # Multi-timeframe, multi-symbol data
│   ├── orchestrator.py            # Main decision making module
│   └── config.py                  # Central configuration
├── models/                         # AI/ML Models
│   ├── __init__.py
│   ├── cnn/                       # CNN module
│   │   ├── __init__.py
│   │   ├── model.py               # Single CNN implementation
│   │   ├── trainer.py             # CNN training pipeline
│   │   └── predictor.py           # CNN inference with confidence
│   └── rl/                        # RL module
│       ├── __init__.py
│       ├── agent.py               # Single RL agent implementation
│       ├── environment.py         # Trading environment
│       └── trainer.py             # RL training loop
├── trading/                        # Trading execution
│   ├── __init__.py
│   ├── executor.py                # Trade execution
│   ├── portfolio.py               # Position/portfolio management
│   └── metrics.py                 # Performance tracking
├── web/                           # Web interface
│   ├── __init__.py
│   ├── dashboard.py               # Main dashboard
│   └── charts.py                  # Chart components
├── utils/                         # Utilities
│   ├── __init__.py
│   ├── logger.py                  # Centralized logging
│   └── helpers.py                 # Common helpers
├── main.py                        # Single entry point
├── config.yaml                    # Configuration file
└── requirements.txt               # Dependencies
```

## Core Goals

### 1. Data Provider (`core/data_provider.py`)
- **Multi-symbol support**: ETH/USDT, BTC/USDT (configurable)
- **Multi-timeframe**: 1m, 5m, 15m, 1h, 4h, 1d
- **Real-time streaming**: WebSocket integration
- **Historical data**: API integration for backtesting
- **Clean interface**: Simple methods for getting data
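
As a rough illustration of the "clean interface" goal, the provider could expose one method for feeding data in and one for reading it back. This is a minimal sketch, not the planned implementation: the `Candle` fields, the method names `add_candle` and `get_latest`, and the in-memory buffer are all assumptions made for the example.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Candle:
    """One OHLCV bar. The field names are an assumption for this sketch."""
    timestamp: int  # bar open time, seconds since epoch
    open: float
    high: float
    low: float
    close: float
    volume: float


class DataProvider:
    """Keeps the most recent candles per (symbol, timeframe) pair in memory."""

    def __init__(self, symbols, timeframes, max_candles=1000):
        self.symbols = list(symbols)
        self.timeframes = list(timeframes)
        # Bounded buffers so a long-running stream cannot grow without limit.
        self._candles = defaultdict(lambda: deque(maxlen=max_candles))

    def add_candle(self, symbol, timeframe, candle):
        """Called by whatever feeds data in (WebSocket handler, REST backfill)."""
        if symbol not in self.symbols or timeframe not in self.timeframes:
            raise ValueError(f"unsupported pair: {symbol} {timeframe}")
        self._candles[(symbol, timeframe)].append(candle)

    def get_latest(self, symbol, timeframe, n=1):
        """Return up to the last n candles, oldest first."""
        if n <= 0:
            return []
        return list(self._candles[(symbol, timeframe)])[-n:]


provider = DataProvider(["ETH/USDT", "BTC/USDT"], ["1m", "1h"])
provider.add_candle("ETH/USDT", "1m", Candle(0, 100.0, 101.0, 99.5, 100.5, 12.0))
print(provider.get_latest("ETH/USDT", "1m")[0].close)  # prints 100.5
```

Keeping the streaming and historical sources behind `add_candle` means the CNN, RL, and dashboard code only ever call `get_latest`, whichever source filled the buffer.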

### 2. CNN Module (`models/cnn/`)
- **Single model implementation**: Remove duplicates
- **Timeframe-specific predictions**: Separate predictions per timeframe
- **Confidence scoring**: Each prediction includes confidence
- **Training pipeline**: Supervised learning on labeled data (perfect moves)
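
The plan does not say how confidence is computed. One common choice, assumed here purely for illustration, is to take the softmax probability of the winning class from the model's raw outputs. The `Prediction` type, the `ACTIONS` ordering, and `to_prediction` are hypothetical names, and plain Python stands in for the real model so the sketch stays self-contained.

```python
import math
from dataclasses import dataclass

ACTIONS = ("BUY", "SELL", "HOLD")  # assumed output order of the CNN head


@dataclass(frozen=True)
class Prediction:
    timeframe: str
    action: str
    confidence: float  # softmax probability of the chosen action


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_prediction(timeframe, logits):
    """Turn one timeframe's raw scores into an action plus a confidence."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return Prediction(timeframe, ACTIONS[best], probs[best])


# One prediction per timeframe, as the module goals describe.
per_timeframe = {
    tf: to_prediction(tf, logits)
    for tf, logits in {"1m": [2.0, 0.0, 0.0], "1h": [0.1, 1.5, 0.2]}.items()
}
print(per_timeframe["1m"].action, per_timeframe["1h"].action)  # prints BUY SELL
```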

### 3. RL Module (`models/rl/`)
- **Single agent**: Remove duplicate DQN implementations
- **Environment**: Clean trading simulation
- **Learning loop**: Evaluates trading actions and adapts
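
To make "clean trading simulation" concrete, here is a deliberately small environment with a `reset`/`step` interface. It is a sketch under strong assumptions that the plan does not state: long-only positions, no fees or slippage, and a reward equal to the price change while a position is held. The class name `TradingEnv` is hypothetical.

```python
class TradingEnv:
    """Toy long-only environment over a fixed price series."""

    ACTIONS = ("HOLD", "BUY", "SELL")

    def __init__(self, prices):
        if len(prices) < 2:
            raise ValueError("need at least two prices to take a step")
        self.prices = list(prices)
        self.reset()

    def reset(self):
        """Start a new episode at the first price with no open position."""
        self.t = 0
        self.position = 0  # 0 = flat, 1 = long
        return self._observation()

    def _observation(self):
        return (self.prices[self.t], self.position)

    def step(self, action):
        """Apply the action, advance one bar, return (observation, reward, done)."""
        if action == "BUY":
            self.position = 1
        elif action == "SELL":
            self.position = 0
        elif action != "HOLD":
            raise ValueError(f"unknown action: {action}")
        self.t += 1
        # Reward is the price move over the bar, earned only while long.
        reward = self.position * (self.prices[self.t] - self.prices[self.t - 1])
        done = self.t == len(self.prices) - 1
        return self._observation(), reward, done


env = TradingEnv([100.0, 102.0, 101.0])
env.reset()
_, reward, done = env.step("BUY")
print(reward, done)  # prints 2.0 False
```

The learning loop would call `step` repeatedly and feed the returned reward to the agent; a realistic version would also need fees and position sizing.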

### 4. Orchestrator (`core/orchestrator.py`)
- **Decision making**: Combines CNN and RL outputs
- **Final actions**: BUY/SELL/HOLD decisions
- **Confidence weighting**: Uses CNN confidence in decisions
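
One way the combination could work is a weighted vote: each CNN prediction contributes its confidence to its action, the RL agent contributes a fixed weight, and the orchestrator falls back to HOLD when no action clearly dominates. The function name `decide`, the `rl_weight` and `min_share` parameters, and the voting rule itself are assumptions for this sketch, not the planned design.

```python
ACTIONS = ("BUY", "SELL", "HOLD")


def decide(cnn_predictions, rl_action, rl_weight=1.0, min_share=0.5):
    """Blend CNN votes (weighted by confidence) with the RL agent's vote.

    cnn_predictions: iterable of (action, confidence) pairs, one per timeframe.
    Returns (action, share), where share is the winner's fraction of all votes.
    """
    scores = dict.fromkeys(ACTIONS, 0.0)
    for action, confidence in cnn_predictions:
        scores[action] += confidence
    scores[rl_action] += rl_weight

    total = sum(scores.values())
    best = max(ACTIONS, key=scores.__getitem__)
    share = scores[best] / total if total > 0 else 0.0
    # Fall back to HOLD when no action clearly dominates.
    if share < min_share:
        return "HOLD", share
    return best, share


action, share = decide([("BUY", 0.9), ("BUY", 0.8), ("SELL", 0.3)], rl_action="BUY")
print(action)  # prints BUY
```

Raising `min_share` makes the orchestrator more conservative, since disagreement between the CNN timeframes and the RL agent then resolves to HOLD more often.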

### 5. Web Interface (`web/`)
- **Real-time charts**: Live trading visualization
- **Performance dashboard**: Metrics and analytics
- **Simple & clean**: Remove complex chart implementations

## Cleanup Steps

### Phase 1: Core Infrastructure
1. Create new clean directory structure
2. Implement `core/data_provider.py` (consolidate all data functionality)
3. Implement `core/orchestrator.py` (main decision maker)
4. Create `config.yaml` for all settings
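
The settings read from `config.yaml` could be validated once at startup so that every module shares the same checked values. The sketch below assumes the YAML has already been parsed into a plain dict (for instance with PyYAML's `yaml.safe_load`) and that the file has `symbols` and `timeframes` keys; both the key names and the `Config` class are hypothetical.

```python
from dataclasses import dataclass

# Defaults taken from the symbols and timeframes listed in this plan.
DEFAULT_SYMBOLS = ("ETH/USDT", "BTC/USDT")
DEFAULT_TIMEFRAMES = ("1m", "5m", "15m", "1h", "4h", "1d")


@dataclass(frozen=True)
class Config:
    symbols: tuple = DEFAULT_SYMBOLS
    timeframes: tuple = DEFAULT_TIMEFRAMES

    def __post_init__(self):
        # Reject typos early instead of failing deep inside the data provider.
        unknown = set(self.timeframes) - set(DEFAULT_TIMEFRAMES)
        if unknown:
            raise ValueError(f"unknown timeframes: {sorted(unknown)}")
        if not self.symbols:
            raise ValueError("at least one symbol is required")

    @classmethod
    def from_dict(cls, raw):
        """Build a Config from an already-parsed mapping (e.g. loaded YAML)."""
        return cls(
            symbols=tuple(raw.get("symbols", DEFAULT_SYMBOLS)),
            timeframes=tuple(raw.get("timeframes", DEFAULT_TIMEFRAMES)),
        )


config = Config.from_dict({"symbols": ["ETH/USDT"], "timeframes": ["1m", "1h"]})
print(config.symbols, config.timeframes)
```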

### Phase 2: Model Consolidation
1. Create single `models/cnn/model.py` (consolidate all CNN implementations)
2. Create single `models/rl/agent.py` (consolidate DQN implementations)
3. Remove duplicate model files

### Phase 3: Training Simplification
1. Create `models/cnn/trainer.py` (single CNN training script)
2. Create `models/rl/trainer.py` (single RL training script)
3. Remove all duplicate training scripts

### Phase 4: Web Interface
1. Create clean `web/dashboard.py` (consolidate chart functionality)
2. Remove complex/unused chart implementations

### Phase 5: Integration & Testing
1. Create single `main.py` entry point
2. Test that all components work together
3. Remove unused files
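
A single entry point usually means one script that selects a mode from the command line. The sketch below uses `argparse` subcommands; the mode names (`trade`, `train`, `web`) and the `--config` and `--model` options are assumptions chosen to mirror the modules in this plan, and the dispatch is left as a placeholder.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        prog="main.py", description="Single entry point for the trading system"
    )
    parser.add_argument("--config", default="config.yaml", help="path to settings file")
    modes = parser.add_subparsers(dest="mode", required=True)
    modes.add_parser("trade", help="run the orchestrator")
    train = modes.add_parser("train", help="train one of the models")
    train.add_argument("--model", choices=("cnn", "rl"), required=True)
    modes.add_parser("web", help="start the dashboard")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Placeholder dispatch: a real main.py would call into core/, models/ or web/.
    print(f"mode={args.mode} config={args.config}")
    return 0
```

In the real `main.py`, this would be invoked with `raise SystemExit(main())` under the usual `if __name__ == "__main__":` guard, for example as `python main.py --config config.yaml train --model cnn`.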

## Files to Remove (After Consolidation)

### Duplicate Training Scripts
- `train_hybrid.py`
- `train_dqn.py`
- `train_cnn_with_realtime.py`
- `train_with_realtime_ticks.py`
- `train_improved_rl.py`
- `NN/train_enhanced.py`
- `NN/train_rl.py`

### Duplicate Model Files
- `NN/models/cnn_model.py`
- `NN/models/enhanced_cnn.py`
- `NN/models/simple_cnn.py`
- `NN/models/transformer_model.py`
- `NN/models/transformer_model_pytorch.py`
- `NN/models/dqn_agent_enhanced.py`

### Duplicate Main Files
- `trading_main.py`
- `NN/main.py`
- `NN/realtime_main.py`
- `NN/realtime-main.py`

### Unused Utilities
- `launch_training.py`
- `NN/example.py`
- Most logs and backup directories
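
Before deleting anything, a dry run can report which of the files listed above actually exist in the working tree. This is a sketch only: `plan_removal` is a hypothetical helper, it deletes nothing, and the logs and backup directories are left out because the plan does not name them.

```python
from pathlib import Path

# Paths copied from the lists above, relative to the repository root.
REMOVAL_CANDIDATES = [
    "train_hybrid.py",
    "train_dqn.py",
    "train_cnn_with_realtime.py",
    "train_with_realtime_ticks.py",
    "train_improved_rl.py",
    "NN/train_enhanced.py",
    "NN/train_rl.py",
    "NN/models/cnn_model.py",
    "NN/models/enhanced_cnn.py",
    "NN/models/simple_cnn.py",
    "NN/models/transformer_model.py",
    "NN/models/transformer_model_pytorch.py",
    "NN/models/dqn_agent_enhanced.py",
    "trading_main.py",
    "NN/main.py",
    "NN/realtime_main.py",
    "NN/realtime-main.py",
    "launch_training.py",
    "NN/example.py",
]


def plan_removal(root, candidates=REMOVAL_CANDIDATES):
    """Split candidates into files that exist under root and files that do not."""
    root = Path(root)
    present = [p for p in candidates if (root / p).is_file()]
    missing = [p for p in candidates if p not in present]
    return present, missing


present, missing = plan_removal(".")
print(f"{len(present)} to remove, {len(missing)} already gone")
```

Reviewing the `present` list by hand, and removing files through version control rather than from disk directly, keeps the cleanup reversible.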

## Benefits of New Architecture

1. **Single Source of Truth**: One implementation per component
2. **Clear Separation**: CNN, RL, and Orchestrator are distinct
3. **Easy to Extend**: Adding new symbols/timeframes is simple
4. **Maintainable**: Changes are localized to specific modules
5. **Testable**: Each component can be tested independently

## Implementation Priority

1. **HIGH**: Core data provider and orchestrator
2. **HIGH**: Single CNN and RL implementations
3. **MEDIUM**: Web dashboard consolidation
4. **LOW**: Cleanup of unused files

This plan should result in a cleaner, more maintainable codebase focused on the core goal: a multi-modal trading system with CNN predictions and RL decision making.