new realt module

2025-03-25 12:48:58 +02:00
parent 9e81e86b1c
commit 114ced03b7
9 changed files with 1055 additions and 0 deletions
--- a/.vscode/launch.json
+++ b/.vscode/launch.json
@@ -103,6 +103,51 @@
            "env": {
                "PYTHONUNBUFFERED": "1"
            }
+        },
+        {
+            "name": "NN Training Pipeline",
+            "type": "python",
+            "request": "launch",
+            "program": "-m",
+            "args": [
+                "NN.main",
+                "--mode",
+                "train",
+                "--symbol",
+                "BTC/USDT",
+                "--timeframes",
+                "1m", "5m", "1h", "4h",
+                "--epochs",
+                "100",
+                "--batch_size",
+                "64",
+                "--window_size",
+                "30",
+                "--output_size",
+                "3"
+            ],
+            "console": "integratedTerminal",
+            "justMyCode": true,
+            "env": {
+                "PYTHONUNBUFFERED": "1",
+                "TF_CPP_MIN_LOG_LEVEL": "2"
+            },
+            "postDebugTask": "Start TensorBoard"
+        },
+        {
+            "name": "Realtime Charts with NN Inference",
+            "type": "python",
+            "request": "launch",
+            "program": "realtime.py",
+            "console": "integratedTerminal",
+            "justMyCode": true,
+            "env": {
+                "PYTHONUNBUFFERED": "1",
+                "ENABLE_NN_MODELS": "1",
+                "NN_INFERENCE_INTERVAL": "60",
+                "NN_MODEL_TYPE": "cnn",
+                "NN_TIMEFRAME": "1h"
+            }
        }
    ]
 }
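The "Realtime Charts with NN Inference" configuration above passes its settings through environment variables. How `realtime.py` consumes them is not shown in this commit, so the following is only a minimal sketch of one way to read them. `NNSettings`, `load_nn_settings`, the default values, and the interpretation of `NN_INFERENCE_INTERVAL` as seconds are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class NNSettings:
    """Hypothetical container for the NN-related environment settings."""
    enabled: bool
    inference_interval: int  # assumed to be seconds between inference runs
    model_type: str
    timeframe: str


def load_nn_settings(env=None):
    """Read the launch-configuration variables, with assumed fallback defaults."""
    env = os.environ if env is None else env
    return NNSettings(
        enabled=env.get("ENABLE_NN_MODELS", "0") == "1",
        inference_interval=int(env.get("NN_INFERENCE_INTERVAL", "60")),
        model_type=env.get("NN_MODEL_TYPE", "cnn"),
        timeframe=env.get("NN_TIMEFRAME", "1h"),
    )


# Same values as the launch configuration, passed as a plain dict for the demo
settings = load_nn_settings({
    "ENABLE_NN_MODELS": "1",
    "NN_INFERENCE_INTERVAL": "60",
    "NN_MODEL_TYPE": "cnn",
    "NN_TIMEFRAME": "1h",
})
print(settings)
```

Accepting an optional mapping instead of always reading `os.environ` keeps the function easy to test without touching the real process environment.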
--- a/NN/README.md
+++ b/NN/README.md
@@ -0,0 +1,131 @@
+# Neural Network Trading System
+
+A comprehensive neural network trading system that uses deep learning models to analyze cryptocurrency price data and generate trading signals.
+
+## Architecture Overview
+
+This project implements a 500M-parameter neural network system using a Mixture of Experts (MoE) approach. The system consists of:
+
+1. **Data Interface**: Connects to real-time trading data from `realtime.py` and processes it for the neural network models
+2. **CNN Module (100M parameters)**: A deep convolutional neural network for feature extraction from time series data
+3. **Transformer Module**: Processes high-level features and raw data for improved pattern recognition
+4. **Mixture of Experts (MoE)**: Coordinates the different models and combines their predictions
+
+The system is designed to identify buy/sell opportunities in cryptocurrency markets by analyzing patterns in historical price and volume data.
+
+## Components
+
+### Data Interface
+
+- Located in `NN/utils/data_interface.py`
+- Provides seamless access to historical and real-time data from `realtime.py`
+- Preprocesses data for neural network consumption
+- Supports multiple timeframes and features
+
+### CNN Model
+
+- Located in `NN/models/cnn_model.py`
+- Implements a deep convolutional network for time series analysis
+- Uses multiple parallel convolutional layers to detect patterns at different time scales
+- Includes bidirectional LSTM layers for sequence modeling
+- Optimized for financial time series data
+
+### Transformer Model
+
+- Located in `NN/models/transformer_model.py`
+- Uses self-attention mechanism to process time series data
+- Takes both raw data and high-level features from the CNN as input
+- Suited to capturing long-range dependencies in the data
+
+### Orchestrator
+
+- Located in `NN/main.py`
+- Coordinates data flow between the models
+- Implements training and inference pipelines
+- Provides a unified interface for the entire system
+
+## Usage
+
+### Requirements
+
+- TensorFlow 2.x
+- NumPy
+- Pandas
+- Matplotlib
+- scikit-learn
+
+### Training the Model
+
+To train the neural network on historical data:
+
+```bash
+python -m NN.main --mode train --symbol BTC/USDT --timeframes 1h 4h 1d --epochs 100
+```
+
+### Making Predictions
+
+To make one-time predictions:
+
+```bash
+python -m NN.main --mode predict --symbol BTC/USDT --timeframe 1h --model_type cnn
+```
+
+### Running Real-time Analysis
+
+To continuously analyze the market and generate signals:
+
+```bash
+python -m NN.main --mode realtime --symbol BTC/USDT --timeframe 1h --interval 60
+```
+
+## Model Architecture Details
+
+### CNN Architecture
+
+The CNN model uses a multi-scale approach with three parallel convolutional pathways:
+- Short-term patterns: 3x1 kernels
+- Medium-term patterns: 5x1 kernels
+- Long-term patterns: 7x1 kernels
+
+These pathways are merged and processed through deeper convolutional layers, followed by LSTM layers to capture temporal dependencies.
+
+### Transformer Architecture
+
+The transformer model uses:
+- Multi-head self-attention layers to capture relationships between different time points
+- Layer normalization and residual connections for stable training
+- A feed-forward network for final classification/regression
+
+### Mixture of Experts
+
+The MoE model:
+- Combines predictions from CNN and Transformer models
+- Uses a weighted average approach for signal generation
+- Can be extended with additional expert models
+
+## Training Data
+
+The system uses historical OHLCV (Open, High, Low, Close, Volume) data at different timeframes:
+- 1-minute candles for short-term analysis
+- 1-hour candles for medium-term trends
+- 1-day candles for long-term market direction
+
+## Output
+
+The system generates one of three signals:
+- BUY: Indicates a potential buying opportunity
+- HOLD: Suggests maintaining current position
+- SELL: Indicates a potential selling opportunity
+
+## Development
+
+### Adding New Models
+
+To add a new model type:
+1. Create a new class in the `NN/models` directory
+2. Implement the required interface (build_model, train, predict, etc.)
+3. Update the orchestrator to include the new model
+
+### Customizing Parameters
+
+Key parameters can be customized through command-line arguments or by modifying the configuration in `main.py`. 
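The weighted-average combination described in the README's Mixture of Experts section, together with the BUY/HOLD/SELL output, can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the project's implementation: `combine_experts` and `to_signal` are hypothetical helpers, the probabilities are made up, and the class order (index 0 = BUY, 1 = HOLD, 2 = SELL) is assumed from the order in which the README lists the signals.

```python
SIGNALS = ("BUY", "HOLD", "SELL")  # assumed index order, not confirmed by the source


def combine_experts(expert_probs, expert_weights=None):
    """Weighted average of per-expert class probabilities.

    expert_probs: dict mapping expert name -> list of class probabilities.
    expert_weights: optional dict of raw weights; equal weighting if None.
    """
    names = list(expert_probs)
    if expert_weights is None:
        raw = [1.0] * len(names)
    else:
        raw = [expert_weights.get(name, 1.0 / len(names)) for name in names]
    total = sum(raw)
    weights = [w / total for w in raw]  # normalize so the weights sum to 1

    n_classes = len(expert_probs[names[0]])
    return [
        sum(w * expert_probs[name][c] for name, w in zip(names, weights))
        for c in range(n_classes)
    ]


def to_signal(probs):
    """Map the highest combined probability to a signal name."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return SIGNALS[best]


# Made-up expert outputs, combined with a 0.7 / 0.3 weighting
probs = combine_experts(
    {"cnn": [0.6, 0.3, 0.1], "transformer": [0.2, 0.5, 0.3]},
    {"cnn": 0.7, "transformer": 0.3},
)
print(to_signal(probs))  # BUY
```

Because the weights are normalized before averaging, the combined vector remains a valid probability distribution whenever each expert's output is one.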
--- a/NN/__init__.py
+++ b/NN/__init__.py
@@ -0,0 +1,16 @@
+"""
+Neural Network Trading System
+============================
+
+A comprehensive neural network trading system that uses deep learning models
+to analyze cryptocurrency price data and generate trading signals.
+
+The system consists of:
+1. Data Interface: Connects to realtime trading data
+2. CNN Model: Deep convolutional neural network for feature extraction
+3. Transformer Model: Processes high-level features for improved pattern recognition
+4. MoE: Mixture of Experts model that combines multiple neural networks
+"""
+
+__version__ = '0.1.0'
+__author__ = 'Gogo2 Project' 
--- a/NN/data/__init__.py
+++ b/NN/data/__init__.py
@@ -0,0 +1,11 @@
+"""
+Neural Network Data
+=================
+
+This package is used to store datasets and model outputs.
+It does not contain any code, but serves as a storage location for:
+- Training datasets
+- Evaluation results
+- Inference outputs
+- Model checkpoints
+""" 
--- a/NN/example.py
+++ b/NN/example.py
@@ -0,0 +1,261 @@
+#!/usr/bin/env python
+"""
+Example script for the Neural Network Trading System
+This shows basic usage patterns for the system components
+"""
+
+import os
+import sys
+import numpy as np
+import pandas as pd
+import tensorflow as tf
+import matplotlib.pyplot as plt
+from datetime import datetime
+import logging
+
+# Add project root to path
+sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
+
+# Import components
+from NN.utils.data_interface import DataInterface
+from NN.models.cnn_model import CNNModel
+from NN.models.transformer_model import TransformerModel, MixtureOfExpertsModel
+from NN.main import NeuralNetworkOrchestrator
+
+# Configure logging
+logging.basicConfig(
+    level=logging.INFO,
+    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
+)
+
+logger = logging.getLogger('example')
+
+def example_data_interface():
+    """Show how to use the data interface"""
+    logger.info("=== Data Interface Example ===")
+    
+    # Initialize data interface
+    di = DataInterface(symbol="BTC/USDT", timeframes=['1h', '4h', '1d'])
+    
+    # Get historical data
+    df_1h = di.get_historical_data(timeframe='1h', n_candles=100)
+    if df_1h is not None and not df_1h.empty:
+        logger.info(f"Retrieved {len(df_1h)} 1-hour candles")
+        logger.info(f"Most recent candle: {df_1h.iloc[-1]}")
+    
+    # Prepare data for neural network
+    X, y, timestamps = di.prepare_nn_input(timeframes=['1h'], n_candles=500, window_size=20)
+    if X is not None and y is not None:
+        logger.info(f"Prepared input shape: {X.shape}, target shape: {y.shape}")
+    
+    # Generate a dataset
+    dataset = di.generate_training_dataset(
+        timeframes=['1h', '4h'],
+        n_candles=1000,
+        window_size=20
+    )
+    if dataset:
+        logger.info(f"Dataset generated and saved to: {list(dataset.values())}")
+    
+    return (X, y, timestamps) if X is not None else (None, None, None)
+
+def example_cnn_model(X=None, y=None):
+    """Show how to use the CNN model"""
+    logger.info("=== CNN Model Example ===")
+    
+    # If no data provided, create dummy data
+    if X is None or y is None:
+        logger.info("Creating dummy data for CNN example")
+        X = np.random.random((1000, 20, 5))  # 1000 samples, 20 time steps, 5 features
+        y = np.random.randint(0, 2, size=(1000,))  # Binary labels
+    
+    # Split data into training and testing sets
+    from sklearn.model_selection import train_test_split
+    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
+    
+    # Initialize and build the CNN model
+    cnn = CNNModel(input_shape=(20, 5), output_size=1, model_dir='NN/models/saved')
+    cnn.build_model(filters=(32, 64, 128), kernel_sizes=(3, 5, 7), dropout_rate=0.3)
+    
+    # Train the model (very small number of epochs for this example)
+    history = cnn.train(
+        X_train, y_train,
+        batch_size=32,
+        epochs=5,  # Just a few epochs for the example
+        validation_split=0.2
+    )
+    
+    # Evaluate the model
+    metrics = cnn.evaluate(X_test, y_test, plot_results=True)
+    if metrics:
+        logger.info(f"CNN Evaluation metrics: {metrics}")
+    
+    # Make a prediction
+    y_pred, y_proba = cnn.predict(X_test[:1])
+    logger.info(f"CNN Prediction: {y_pred[0]}, Probability: {y_proba[0]:.4f}")
+    
+    return cnn
+
+def example_transformer_model(X=None, y=None, cnn_model=None):
+    """Show how to use the Transformer model"""
+    logger.info("=== Transformer Model Example ===")
+    
+    # If no data provided, create dummy data
+    if X is None or y is None:
+        logger.info("Creating dummy data for Transformer example")
+        X = np.random.random((1000, 20, 5))  # 1000 samples, 20 time steps, 5 features
+        y = np.random.randint(0, 2, size=(1000,))  # Binary labels
+    
+    # Generate high-level features (from CNN model or random if no CNN provided)
+    if cnn_model is not None and hasattr(cnn_model, 'extract_hidden_features'):
+        # Extract features from CNN model
+        X_features = cnn_model.extract_hidden_features(X)
+        logger.info(f"Extracted {X_features.shape[1]} features from CNN model")
+    else:
+        # Generate random features
+        X_features = np.random.random((len(X), 128))
+        logger.info("Generated random features for Transformer model")
+    
+    # Split data into training and testing sets
+    from sklearn.model_selection import train_test_split
+    X_train, X_test, X_feat_train, X_feat_test, y_train, y_test = train_test_split(
+        X, X_features, y, test_size=0.2, random_state=42
+    )
+    
+    # Initialize and build the Transformer model
+    transformer = TransformerModel(
+        ts_input_shape=(20, 5),
+        feature_input_shape=X_features.shape[1],
+        output_size=1,
+        model_dir='NN/models/saved'
+    )
+    transformer.build_model(
+        embed_dim=32,
+        num_heads=2,
+        ff_dim=64,
+        num_transformer_blocks=2,
+        dropout_rate=0.2
+    )
+    
+    # Train the model (very small number of epochs for this example)
+    history = transformer.train(
+        X_train, X_feat_train, y_train,
+        batch_size=32,
+        epochs=5,  # Just a few epochs for the example
+        validation_split=0.2
+    )
+    
+    # Make a prediction
+    y_pred, y_proba = transformer.predict(X_test[:1], X_feat_test[:1])
+    logger.info(f"Transformer Prediction: {y_pred[0]}, Probability: {y_proba[0]:.4f}")
+    
+    return transformer
+
+def example_moe_model(X=None, y=None, cnn_model=None, transformer_model=None):
+    """Show how to use the Mixture of Experts model"""
+    logger.info("=== Mixture of Experts Example ===")
+    
+    # If no data provided, create dummy data
+    if X is None or y is None:
+        logger.info("Creating dummy data for MoE example")
+        X = np.random.random((1000, 20, 5))  # 1000 samples, 20 time steps, 5 features
+        y = np.random.randint(0, 2, size=(1000,))  # Binary labels
+    
+    # If models not provided, create them
+    if cnn_model is None:
+        logger.info("Creating a new CNN model for MoE")
+        cnn_model = CNNModel(input_shape=(20, 5), output_size=1)
+        cnn_model.build_model()
+    
+    if transformer_model is None:
+        logger.info("Creating a new Transformer model for MoE")
+        transformer_model = TransformerModel(ts_input_shape=(20, 5), feature_input_shape=128, output_size=1)
+        transformer_model.build_model()
+    
+    # Initialize MoE model
+    moe = MixtureOfExpertsModel(output_size=1, model_dir='NN/models/saved')
+    
+    # Add expert models
+    moe.add_expert('cnn', cnn_model)
+    moe.add_expert('transformer', transformer_model)
+    
+    # Build the MoE model (this is a simplified implementation - in a real scenario
+    # you would need to handle the interfaces between models more carefully)
+    moe.build_model(
+        ts_input_shape=(20, 5),
+        expert_weights={'cnn': 0.7, 'transformer': 0.3}
+    )
+    
+    # In a real implementation, you would train the MoE model here
+    logger.info("MoE model built - in a real implementation, you would train it here")
+    
+    return moe
+
+def example_orchestrator():
+    """Show how to use the Orchestrator"""
+    logger.info("=== Orchestrator Example ===")
+    
+    # Configure the orchestrator
+    config = {
+        'symbol': 'BTC/USDT',
+        'timeframes': ['1h', '4h'],
+        'window_size': 20,
+        'n_features': 5,
+        'output_size': 3,  # BUY/HOLD/SELL
+        'batch_size': 32,
+        'epochs': 5,  # Small number for example
+        'model_dir': 'NN/models/saved',
+        'data_dir': 'NN/data'
+    }
+    
+    # Initialize the orchestrator
+    orchestrator = NeuralNetworkOrchestrator(config)
+    
+    # Prepare training data
+    X, y, timestamps = orchestrator.prepare_training_data(
+        timeframes=['1h'],
+        n_candles=200
+    )
+    
+    if X is not None and y is not None:
+        logger.info(f"Prepared training data: X shape {X.shape}, y shape {y.shape}")
+        
+        # Train CNN model
+        logger.info("Training CNN model with orchestrator...")
+        history = orchestrator.train_cnn_model(X, y, epochs=2)  # Very small for example
+        
+        # Make a prediction
+        result = orchestrator.run_inference_pipeline(
+            model_type='cnn',
+            timeframe='1h'
+        )
+        
+        if result:
+            logger.info(f"Inference result: {result}")
+    else:
+        logger.warning("Could not prepare training data - this is expected if no real data is available")
+        logger.info("The orchestrator would normally handle training and inference")
+
+def main():
+    """Run all examples"""
+    logger.info("Starting Neural Network Trading System Examples")
+    
+    # Example 1: Data Interface
+    X, y, timestamps = example_data_interface()
+    
+    # Example 2: CNN Model
+    cnn_model = example_cnn_model(X, y)
+    
+    # Example 3: Transformer Model
+    transformer_model = example_transformer_model(X, y, cnn_model)
+    
+    # Example 4: Mixture of Experts
+    moe_model = example_moe_model(X, y, cnn_model, transformer_model)
+    
+    # Example 5: Orchestrator
+    example_orchestrator()
+    
+    logger.info("Examples completed")
+
+if __name__ == "__main__":
+    main() 
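The examples in `NN/example.py` rely on `prepare_nn_input(..., window_size=20)` returning arrays shaped `(samples, window_size, features)`. The data interface itself is not part of this excerpt, so the following is only a rough sketch of how such sliding windows could be built from OHLCV rows. `make_windows` is a hypothetical helper, and the labelling rule (1 if the next close is above the window's last close, else 0) is an assumption chosen for illustration.

```python
def make_windows(candles, window_size):
    """Build sliding windows and binary labels from OHLCV rows.

    candles: list of [open, high, low, close, volume] rows, oldest first.
    Returns (X, y), where X[i] is a window of `window_size` rows and y[i]
    is 1 if the close right after the window is higher than the window's
    last close (assumed labelling rule), else 0.
    """
    close_idx = 3
    X, y = [], []
    for start in range(len(candles) - window_size):
        window = candles[start:start + window_size]
        next_close = candles[start + window_size][close_idx]
        last_close = window[-1][close_idx]
        X.append(window)
        y.append(1 if next_close > last_close else 0)
    return X, y


# Synthetic, steadily rising candles: 25 rows with 5 features each
candles = [[i, i + 1, i - 1, i + 0.5, 100.0] for i in range(25)]
X, y = make_windows(candles, window_size=20)
print(len(X), len(X[0]), len(X[0][0]))  # 5 20 5
```

Each window needs one following candle to label it, which is why 25 candles with a window of 20 yield 5 samples rather than 6.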
--- a/NN/models/__init__.py
+++ b/NN/models/__init__.py
@@ -0,0 +1,14 @@
+"""
+Neural Network Models
+====================
+
+This package contains the neural network models used in the trading system:
+- CNN Model: Deep convolutional neural network for feature extraction
+- Transformer Model: Processes high-level features for improved pattern recognition 
+- MoE: Mixture of Experts model that combines multiple neural networks
+"""
+
+from NN.models.cnn_model import CNNModel
+from NN.models.transformer_model import TransformerModel, TransformerBlock, MixtureOfExpertsModel
+
+__all__ = ['CNNModel', 'TransformerModel', 'TransformerBlock', 'MixtureOfExpertsModel'] 
--- a/NN/models/transformer_model.py
+++ b/NN/models/transformer_model.py
@@ -0,0 +1,553 @@
+import os
+import sys
+import numpy as np
+import pandas as pd
+import tensorflow as tf
+from tensorflow.keras.models import Model
+from tensorflow.keras.layers import (
+    Input, Dense, Dropout, LayerNormalization, MultiHeadAttention, 
+    GlobalAveragePooling1D, Concatenate, Add, Activation, Flatten
+)
+from tensorflow.keras.optimizers import Adam
+from tensorflow.keras.callbacks import (
+    EarlyStopping, ModelCheckpoint, ReduceLROnPlateau, 
+    TensorBoard, CSVLogger
+)
+import matplotlib.pyplot as plt
+import logging
+import time
+import datetime
+
+# Configure logging
+logging.basicConfig(
+    level=logging.INFO,
+    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
+    handlers=[
+        logging.StreamHandler(),
+        logging.FileHandler('nn_transformer_model.log')
+    ]
+)
+
+logger = logging.getLogger('transformer_model')
+
+class TransformerBlock(tf.keras.layers.Layer):
+    """
+    Transformer block with multi-head self-attention and feed-forward network
+    """
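+    def __init__(self, embed_dim, num_heads, ff_dim, rate=0.1, **kwargs):
+        super().__init__(**kwargs)
+        # Keep the constructor arguments so get_config() can serialize the layer
+        self.embed_dim = embed_dim
+        self.num_heads = num_heads
+        self.ff_dim = ff_dim
+        self.rate = rate
+        self.att = MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)
+        self.ffn = tf.keras.Sequential([
+            Dense(ff_dim, activation="relu"),
+            Dense(embed_dim)
+        ])
+        self.layernorm1 = LayerNormalization(epsilon=1e-6)
+        self.layernorm2 = LayerNormalization(epsilon=1e-6)
+        self.dropout1 = Dropout(rate)
+        self.dropout2 = Dropout(rate)
+
+    def get_config(self):
+        # Lets Keras serialize this layer when a model containing it is saved in HDF5 format
+        config = super().get_config()
+        config.update({
+            'embed_dim': self.embed_dim,
+            'num_heads': self.num_heads,
+            'ff_dim': self.ff_dim,
+            'rate': self.rate
+        })
+        return config
+        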
+    def call(self, inputs, training=False):
+        # Self-attention, then dropout, residual connection, and layer normalization
+        attn_output = self.att(inputs, inputs)
+        attn_output = self.dropout1(attn_output, training=training)
+        out1 = self.layernorm1(inputs + attn_output)
+        
+        # Feed-forward network
+        ffn_output = self.ffn(out1)
+        ffn_output = self.dropout2(ffn_output, training=training)
+        
+        # Skip connection and normalization
+        return self.layernorm2(out1 + ffn_output)
+
+class TransformerModel:
+    """
+    Transformer-based model for financial time series analysis.
+    This model processes both raw time series data and high-level features from the CNN model.
+    """
+    
+    def __init__(self, ts_input_shape=(20, 5), feature_input_shape=128, output_size=3, model_dir='NN/models/saved'):
+        """
+        Initialize the Transformer model
+        
+        Args:
+            ts_input_shape: Shape of time series input data (sequence_length, features)
+            feature_input_shape: Shape of high-level feature input (from CNN)
+            output_size: Number of output classes or values
+            model_dir: Directory to save model files
+        """
+        self.ts_input_shape = ts_input_shape
+        self.feature_input_shape = feature_input_shape
+        self.output_size = output_size
+        self.model_dir = model_dir
+        self.model = None
+        self.history = None
+        
+        # Create model directory if it doesn't exist
+        os.makedirs(model_dir, exist_ok=True)
+        
+        logger.info(f"Initialized TransformerModel with time series input shape {ts_input_shape}, "
+                   f"feature input shape {feature_input_shape}, and output size {output_size}")
+    
+    def build_model(self, embed_dim=64, num_heads=4, ff_dim=128, num_transformer_blocks=2, 
+                   dropout_rate=0.2, learning_rate=0.001):
+        """
+        Build the Transformer model architecture
+        
+        Args:
+            embed_dim: Embedding dimension for the transformer
+            num_heads: Number of attention heads
+            ff_dim: Hidden layer size in the feed-forward network
+            num_transformer_blocks: Number of transformer blocks to stack
+            dropout_rate: Dropout rate for regularization
+            learning_rate: Learning rate for the optimizer
+            
+        Returns:
+            Compiled Keras model
+        """
+        # Time series input (price and volume data)
+        ts_inputs = Input(shape=self.ts_input_shape, name='time_series_input')
+        
+        # High-level feature input (from CNN or other sources)
+        feature_inputs = Input(shape=(self.feature_input_shape,), name='feature_input')
+        
+        # Project the raw features to embed_dim so the residual additions in each block match in shape
+        x = Dense(embed_dim, name='input_projection')(ts_inputs)
+        for _ in range(num_transformer_blocks):
+            x = TransformerBlock(embed_dim, num_heads, ff_dim, dropout_rate)(x)
+        
+        # Global pooling to get fixed-size representation
+        x = GlobalAveragePooling1D()(x)
+        
+        # Combine with the high-level features
+        combined = Concatenate()([x, feature_inputs])
+        
+        # Dense layers
+        dense1 = Dense(128, activation='relu')(combined)
+        dropout1 = Dropout(dropout_rate)(dense1)
+        dense2 = Dense(64, activation='relu')(dropout1)
+        dropout2 = Dropout(dropout_rate)(dense2)
+        
+        # Output layer
+        if self.output_size == 1:
+            # Binary classification
+            outputs = Dense(1, activation='sigmoid')(dropout2)
+        elif self.output_size == 3:
+            # For BUY/HOLD/SELL signals (3 classes)
+            outputs = Dense(3, activation='softmax')(dropout2)
+        else:
+            # Regression (linear activation; compiled with MSE loss below)
+            outputs = Dense(self.output_size, activation='linear')(dropout2)
+        
+        # Create and compile the model
+        model = Model(inputs=[ts_inputs, feature_inputs], outputs=outputs)
+        
+        if self.output_size == 1:
+            # Binary classification
+            model.compile(
+                optimizer=Adam(learning_rate=learning_rate),
+                loss='binary_crossentropy',
+                metrics=['accuracy']
+            )
+        elif self.output_size == 3:
+            # Multi-class classification for BUY/HOLD/SELL
+            model.compile(
+                optimizer=Adam(learning_rate=learning_rate),
+                loss='categorical_crossentropy',
+                metrics=['accuracy']
+            )
+        else:
+            # Regression
+            model.compile(
+                optimizer=Adam(learning_rate=learning_rate),
+                loss='mse',
+                metrics=['mae']
+            )
+        
+        self.model = model
+        logger.info(f"Model built with {model.count_params()} parameters")
+        model.summary(print_fn=logger.info)
+        
+        return model
+    
+    def train(self, X_ts, X_features, y, batch_size=32, epochs=100, validation_split=0.2,
+             early_stopping_patience=20, reduce_lr_patience=10, verbose=1):
+        """
+        Train the Transformer model
+        
+        Args:
+            X_ts: Time series input data
+            X_features: High-level feature input data
+            y: Target values
+            batch_size: Batch size for training
+            epochs: Maximum number of epochs
+            validation_split: Fraction of data to use for validation
+            early_stopping_patience: Patience for early stopping
+            reduce_lr_patience: Patience for learning rate reduction
+            verbose: Verbosity level
+            
+        Returns:
+            Training history
+        """
+        if self.model is None:
+            logger.warning("Model not built yet, building with default parameters")
+            self.build_model()
+        
+        # Create a timestamp for this training run
+        timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
+        model_name = f"transformer_model_{timestamp}"
+        
+        # Set up callbacks
+        callbacks = [
+            # Early stopping to prevent overfitting
+            EarlyStopping(
+                monitor='val_loss', 
+                patience=early_stopping_patience,
+                restore_best_weights=True,
+                verbose=1
+            ),
+            
+            # Reduce learning rate when training plateaus
+            ReduceLROnPlateau(
+                monitor='val_loss', 
+                factor=0.5,
+                patience=reduce_lr_patience, 
+                min_lr=1e-6,
+                verbose=1
+            ),
+            
+            # Save the best model
+            ModelCheckpoint(
+                filepath=os.path.join(self.model_dir, f"{model_name}_best.h5"),
+                monitor='val_loss',
+                save_best_only=True,
+                verbose=1
+            ),
+            
+            # TensorBoard logging
+            TensorBoard(
+                log_dir=os.path.join(self.model_dir, 'logs', model_name),
+                histogram_freq=1
+            ),
+            
+            # CSV logging
+            CSVLogger(
+                filename=os.path.join(self.model_dir, f"{model_name}_training.csv"),
+                separator=',', 
+                append=False
+            )
+        ]
+        
+        # Train the model
+        logger.info(f"Starting training with {len(X_ts)} samples, {epochs} max epochs")
+        
+        start_time = time.time()
+        history = self.model.fit(
+            [X_ts, X_features], y,
+            batch_size=batch_size,
+            epochs=epochs,
+            validation_split=validation_split,
+            callbacks=callbacks,
+            verbose=verbose
+        )
+        
+        # Calculate training time
+        training_time = time.time() - start_time
+        logger.info(f"Training completed in {training_time:.2f} seconds")
+        
+        # Save the final model
+        self.model.save(os.path.join(self.model_dir, f"{model_name}_final.h5"))
+        logger.info(f"Model saved to {os.path.join(self.model_dir, model_name + '_final.h5')}")
+        
+        # Save training history
+        hist_df = pd.DataFrame(history.history)
+        hist_df.to_csv(os.path.join(self.model_dir, f"{model_name}_history.csv"), index=False)
+        
+        self.history = history
+        return history
+    
+    def predict(self, X_ts, X_features, threshold=0.5):
+        """
+        Make predictions with the model
+        
+        Args:
+            X_ts: Time series input data
+            X_features: High-level feature input data
+            threshold: Threshold for binary classification
+            
+        Returns:
+            (classes, probabilities) for classification outputs; raw predictions for regression
+        """
+        if self.model is None:
+            logger.error("Model not built or trained yet")
+            return None
+        
+        # Get raw predictions
+        y_pred_proba = self.model.predict([X_ts, X_features])
+        
+        # Format predictions based on output type
+        if self.output_size == 1:
+            # Binary classification
+            y_pred = (y_pred_proba > threshold).astype(int).flatten()
+            return y_pred, y_pred_proba.flatten()
+        elif self.output_size == 3:
+            # Multi-class (BUY/HOLD/SELL)
+            y_pred = np.argmax(y_pred_proba, axis=1)
+            return y_pred, y_pred_proba
+        else:
+            # Regression
+            return y_pred_proba
+    
+    def save_model(self, filepath=None):
+        """
+        Save the model to a file
+        
+        Args:
+            filepath: Path to save the model to
+            
+        Returns:
+            Path to the saved model
+        """
+        if self.model is None:
+            logger.error("Model not built or trained yet")
+            return None
+        
+        if filepath is None:
+            # Create a default filepath
+            timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
+            filepath = os.path.join(self.model_dir, f"transformer_model_{timestamp}.h5")
+        
+        self.model.save(filepath)
+        logger.info(f"Model saved to {filepath}")
+        
+        return filepath
+    
+    def load_model(self, filepath):
+        """
+        Load a model from a file
+        
+        Args:
+            filepath: Path to load the model from
+            
+        Returns:
+            Loaded model
+        """
+        try:
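+            # The custom TransformerBlock layer must be registered for deserialization
+            self.model = tf.keras.models.load_model(filepath, custom_objects={'TransformerBlock': TransformerBlock})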
+            logger.info(f"Model loaded from {filepath}")
+            return self.model
+        except Exception as e:
+            logger.error(f"Error loading model: {str(e)}")
+            return None
+
+class MixtureOfExpertsModel:
+    """
+    Mixture of Experts (MoE) model that combines predictions from multiple models.
+    This implementation focuses on combining CNN and Transformer models for financial analysis.
+    """
+    
+    def __init__(self, output_size=3, model_dir='NN/models/saved'):
+        """
+        Initialize the MoE model
+        
+        Args:
+            output_size: Number of output classes or values
+            model_dir: Directory to save model files
+        """
+        self.output_size = output_size
+        self.model_dir = model_dir
+        self.models = {}  # Dictionary to store expert models
+        self.gating_model = None  # Model to determine which expert to use
+        self.model = None  # Combined MoE model
+        
+        # Create model directory if it doesn't exist
+        os.makedirs(model_dir, exist_ok=True)
+        
+        logger.info(f"Initialized MixtureOfExpertsModel with output size {output_size}")
+    
+    def add_expert(self, name, model):
+        """
+        Add an expert model to the MoE
+        
+        Args:
+            name: Name of the expert
+            model: Expert model instance
+            
+        Returns:
+            None
+        """
+        self.models[name] = model
+        logger.info(f"Added expert model '{name}' to MoE")
+    
+    def build_model(self, ts_input_shape=(20, 5), expert_weights=None, learning_rate=0.001):
+        """
+        Build the MoE model architecture
+        
+        Args:
+            ts_input_shape: Shape of time series input data
+            expert_weights: Dictionary of expert weights (if None, equal weighting)
+            learning_rate: Learning rate for the optimizer
+            
+        Returns:
+            Compiled Keras model
+        """
+        if not self.models:
+            logger.error("No expert models added to MoE")
+            return None
+        
+        # Time series input
+        ts_inputs = Input(shape=ts_input_shape, name='time_series_input')
+        
+        # Get predictions from each expert
+        expert_outputs = []
+        expert_names = []
+        
+        for name, model in self.models.items():
+            if hasattr(model, 'predict') and callable(model.predict):
+                if name == 'cnn':
+                    # The CNN expert consumes the time series input directly.
+                    # Call the underlying Keras model (model.model) rather than the
+                    # wrapper's predict(), which typically returns both predictions
+                    # and probabilities.
+                    expert_outputs.append(model.model(ts_inputs))
+                elif name == 'transformer':
+                    # The transformer expects CNN-derived features as a second input.
+                    # Simplification: a real implementation would extract features from
+                    # the CNN expert; here a Dense projection of the flattened input
+                    # stands in as placeholder features.
+                    dummy_features = Dense(128, activation='relu')(Flatten()(ts_inputs))
+                    expert_outputs.append(model.model([ts_inputs, dummy_features]))
+                else:
+                    logger.warning(f"Unknown model type: {name}, skipping")
+                    continue
+                # Record the name only after its output was added, so that
+                # expert_names stays aligned with expert_outputs
+                expert_names.append(name)
+        
+        if not expert_outputs:
+            logger.error("No valid expert models found")
+            return None
+        
+        # Use expert weighting
+        if expert_weights is None:
+            # Equal weighting
+            weights = [1.0 / len(expert_outputs)] * len(expert_outputs)
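+        else:
+            # User-provided weights; experts missing from the dict get an equal share
+            weights = [expert_weights.get(name, 1.0 / len(expert_outputs)) for name in expert_names]
+            # Normalize weights so they sum to 1
+            total = sum(weights)
+            if total <= 0:
+                logger.error("Expert weights must sum to a positive value")
+                return None
+            weights = [w / total for w in weights]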
+        
+        # Combine expert outputs using weighted average
+        if len(expert_outputs) == 1:
+            # Only one expert, use its output directly
+            combined_output = expert_outputs[0]
+        else:
+            # Multiple experts, compute weighted average
+            weighted_outputs = [output * weight for output, weight in zip(expert_outputs, weights)]
+            combined_output = Add()(weighted_outputs)
+        
+        # Create the MoE model
+        moe_model = Model(inputs=ts_inputs, outputs=combined_output)
+        
+        # Compile the model
+        if self.output_size == 1:
+            # Binary classification
+            moe_model.compile(
+                optimizer=Adam(learning_rate=learning_rate),
+                loss='binary_crossentropy',
+                metrics=['accuracy']
+            )
+        elif self.output_size == 3:
+            # Multi-class classification for BUY/HOLD/SELL
+            moe_model.compile(
+                optimizer=Adam(learning_rate=learning_rate),
+                loss='categorical_crossentropy',
+                metrics=['accuracy']
+            )
+        else:
+            # Regression
+            moe_model.compile(
+                optimizer=Adam(learning_rate=learning_rate),
+                loss='mse',
+                metrics=['mae']
+            )
+        
+        self.model = moe_model
+        logger.info(f"MoE model built with experts: {expert_names}, weights: {weights}")
+        moe_model.summary(print_fn=logger.info)
+        
+        return moe_model
+    
+    def predict(self, X, threshold=0.5):
+        """
+        Make predictions with the MoE model
+        
+        Args:
+            X: Input data
+            threshold: Threshold for binary classification
+            
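+        Returns:
+            Tuple (predicted classes, probabilities) for classification outputs,
+            or the raw predicted values for regression; None if the model is not built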
+        """
+        if self.model is None:
+            logger.error("MoE model not built yet")
+            return None
+        
+        # Get raw predictions
+        y_pred_proba = self.model.predict(X)
+        
+        # Format predictions based on output type
+        if self.output_size == 1:
+            # Binary classification
+            y_pred = (y_pred_proba > threshold).astype(int).flatten()
+            return y_pred, y_pred_proba.flatten()
+        elif self.output_size == 3:
+            # Multi-class (BUY/HOLD/SELL)
+            y_pred = np.argmax(y_pred_proba, axis=1)
+            return y_pred, y_pred_proba
+        else:
+            # Regression
+            return y_pred_proba
+    
+    def save_model(self, filepath=None):
+        """
+        Save the MoE model to a file
+        
+        Args:
+            filepath: Path to save the model to
+            
+        Returns:
+            Path to the saved model
+        """
+        if self.model is None:
+            logger.error("MoE model not built yet")
+            return None
+        
+        if filepath is None:
+            # Create a default filepath
+            timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
+            filepath = os.path.join(self.model_dir, f"moe_model_{timestamp}.h5")
+        
+        self.model.save(filepath)
+        logger.info(f"MoE model saved to {filepath}")
+        
+        return filepath
+    
+    def load_model(self, filepath):
+        """
+        Load an MoE model from a file
+        
+        Args:
+            filepath: Path to load the model from
+            
+        Returns:
+            Loaded model
+        """
+        try:
+            self.model = tf.keras.models.load_model(filepath)
+            logger.info(f"MoE model loaded from {filepath}")
+            return self.model
+        except Exception as e:
+            logger.error(f"Error loading MoE model: {str(e)}")
+            return None
+
+# Placeholder entry point:
+if __name__ == "__main__":
+    # A real system would build and train the models here
+    print("Transformer and MoE models defined, but not implemented here.")
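The weighting step in `build_model` above can be sketched outside Keras. This is a minimal NumPy illustration, not the code from this commit: the helper names `normalize_weights` and `combine_expert_outputs` are hypothetical, and the probability arrays are made-up inputs chosen only to show the arithmetic.

```python
import numpy as np


def normalize_weights(expert_names, expert_weights=None):
    """Return one weight per expert, scaled so the weights sum to 1.

    Hypothetical helper mirroring the weighting logic in build_model.
    """
    n = len(expert_names)
    if n == 0:
        raise ValueError("at least one expert is required")
    if expert_weights is None:
        # Equal weighting when no weights are supplied
        return [1.0 / n] * n
    # Experts missing from the dict fall back to an equal share
    raw = [expert_weights.get(name, 1.0 / n) for name in expert_names]
    total = sum(raw)
    if total <= 0:
        raise ValueError("expert weights must sum to a positive value")
    return [w / total for w in raw]


def combine_expert_outputs(outputs, weights):
    """Weighted average of per-expert arrays that all share one shape."""
    stacked = np.stack(outputs, axis=0)        # (n_experts, batch, classes)
    w = np.asarray(weights).reshape(-1, 1, 1)  # broadcast over batch and classes
    return (stacked * w).sum(axis=0)


if __name__ == "__main__":
    # Made-up class probabilities for a single sample
    cnn_probs = np.array([[0.7, 0.2, 0.1]])
    transformer_probs = np.array([[0.1, 0.3, 0.6]])

    weights = normalize_weights(
        ["cnn", "transformer"], {"cnn": 3.0, "transformer": 1.0}
    )
    combined = combine_expert_outputs([cnn_probs, transformer_probs], weights)
    print(weights)
    print(combined, combined.argmax(axis=1))
```

Because the weights sum to 1 and each expert emits a probability distribution, the combined row is again a distribution, which is why `predict` can apply `argmax` to it directly.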
--- a/NN/requirements.txt
+++ b/NN/requirements.txt
@ -0,0 +1,13 @@
+tensorflow>=2.5.0
+numpy>=1.19.5
+pandas>=1.3.0
+matplotlib>=3.4.2
+scikit-learn>=0.24.2
+tensorflow-addons>=0.13.0
+plotly>=5.1.0
+h5py>=3.1.0
+tqdm>=4.61.1
+pyyaml>=5.4.1
+tensorboard>=2.5.0
+ccxt>=1.50.0
+requests>=2.25.1 
--- a/NN/utils/__init__.py
+++ b/NN/utils/__init__.py
@ -0,0 +1,11 @@
+"""
+Neural Network Utilities
+========================
+
+This package contains utility functions and classes used in the neural network trading system:
+- Data Interface: Connects to realtime trading data and processes it for the neural network models
+"""
+
+from NN.utils.data_interface import DataInterface
+
+__all__ = ['DataInterface']