new realt module

2025-03-25 12:48:58 +02:00
parent 9e81e86b1c
commit 114ced03b7
9 changed files with 1055 additions and 0 deletions
--- a/.vscode/launch.json
+++ b/.vscode/launch.json
@ -103,6 +103,51 @@
            "env": {
                "PYTHONUNBUFFERED": "1"
            }
        },
        {
            "name": "NN Training Pipeline",
            "type": "python",
            "request": "launch",
            "program": "-m",
            "args": [
                "NN.main",
                "--mode",
                "train",
                "--symbol",
                "BTC/USDT",
                "--timeframes",
                "1m", "5m", "1h", "4h",
                "--epochs",
                "100",
                "--batch_size",
                "64",
                "--window_size",
                "30",
                "--output_size",
                "3"
            ],
            "console": "integratedTerminal",
            "justMyCode": true,
            "env": {
                "PYTHONUNBUFFERED": "1",
                "TF_CPP_MIN_LOG_LEVEL": "2"
            },
            "postDebugTask": "Start TensorBoard"
        },
        {
            "name": "Realtime Charts with NN Inference",
            "type": "python",
            "request": "launch",
            "program": "realtime.py",
            "console": "integratedTerminal",
            "justMyCode": true,
            "env": {
                "PYTHONUNBUFFERED": "1",
                "ENABLE_NN_MODELS": "1",
                "NN_INFERENCE_INTERVAL": "60",
                "NN_MODEL_TYPE": "cnn",
                "NN_TIMEFRAME": "1h"
            }
        }
    ]
 }
--- a/NN/README.md
+++ b/NN/README.md
@ -0,0 +1,131 @@
 # Neural Network Trading System
 A comprehensive neural network trading system that uses deep learning models to analyze cryptocurrency price data and generate trading signals.
 ## Architecture Overview
 This project implements a 500M parameter neural network system using a Mixture of Experts (MoE) approach. The system consists of:
 1. **Data Interface**: Connects to real-time trading data from `realtime.py` and processes it for the neural network models
 2. **CNN Module (100M parameters)**: A deep convolutional neural network for feature extraction from time series data
 3. **Transformer Module**: Processes high-level features and raw data for improved pattern recognition
 4. **Mixture of Experts (MoE)**: Coordinates the different models and combines their predictions
 The system is designed to identify buy/sell opportunities in cryptocurrency markets by analyzing patterns in historical price and volume data.
 ## Components
 ### Data Interface
 - Located in `NN/utils/data_interface.py`
 - Provides seamless access to historical and real-time data from `realtime.py`
 - Preprocesses data for neural network consumption
 - Supports multiple timeframes and features
 ### CNN Model
 - Located in `NN/models/cnn_model.py`
 - Implements a deep convolutional network for time series analysis
 - Uses multiple parallel convolutional layers to detect patterns at different time scales
 - Includes bidirectional LSTM layers for sequence modeling
 - Optimized for financial time series data
 ### Transformer Model
 - Located in `NN/models/transformer_model.py`
 - Uses self-attention mechanism to process time series data
 - Takes both raw data and high-level features from the CNN as input
 - Better at capturing long-range dependencies in the data
 ### Orchestrator
 - Located in `NN/main.py`
 - Coordinates data flow between the models
 - Implements training and inference pipelines
 - Provides a unified interface for the entire system
 ## Usage
 ### Requirements
 - TensorFlow 2.x
 - NumPy
 - Pandas
 - Matplotlib
 - scikit-learn
 ### Training the Model
 To train the neural network on historical data:
 ```bash
 python -m NN.main --mode train --symbol BTC/USDT --timeframes 1h 4h 1d --epochs 100
 ```
 ### Making Predictions
 To make one-time predictions:
 ```bash
 python -m NN.main --mode predict --symbol BTC/USDT --timeframe 1h --model_type cnn
 ```
 ### Running Real-time Analysis
 To continuously analyze the market and generate signals:
 ```bash
 python -m NN.main --mode realtime --symbol BTC/USDT --timeframe 1h --interval 60
 ```
 ## Model Architecture Details
 ### CNN Architecture
 The CNN model uses a multi-scale approach with three parallel convolutional pathways:
 - Short-term patterns: 3x1 kernels
 - Medium-term patterns: 5x1 kernels
 - Long-term patterns: 7x1 kernels
 These pathways are merged and processed through deeper convolutional layers, followed by LSTM layers to capture temporal dependencies.
 ### Transformer Architecture
 The transformer model uses:
 - Multi-head self-attention layers to capture relationships between different time points
 - Layer normalization and residual connections for stable training
 - A feed-forward network for final classification/regression
 ### Mixture of Experts
 The MoE model:
 - Combines predictions from CNN and Transformer models
 - Uses a weighted average approach for signal generation
 - Can be extended with additional expert models
 ## Training Data
 The system uses historical OHLCV (Open, High, Low, Close, Volume) data at different timeframes:
 - 1-minute candles for short-term analysis
 - 1-hour candles for medium-term trends
 - 1-day candles for long-term market direction
 ## Output
 The system generates one of three signals:
 - BUY: Indicates a potential buying opportunity
 - HOLD: Suggests maintaining current position
 - SELL: Indicates a potential selling opportunity
 ## Development
 ### Adding New Models
 To add a new model type:
 1. Create a new class in the `NN/models` directory
 2. Implement the required interface (build_model, train, predict, etc.)
 3. Update the orchestrator to include the new model
 ### Customizing Parameters
 Key parameters can be customized through command-line arguments or by modifying the configuration in `main.py`. 
--- a/NN/init.py
+++ b/NN/init.py
@ -0,0 +1,16 @@
 """
 Neural Network Trading System
 ============================
 A comprehensive neural network trading system that uses deep learning models
 to analyze cryptocurrency price data and generate trading signals.
 The system consists of:
 1. Data Interface: Connects to realtime trading data
 2. CNN Model: Deep convolutional neural network for feature extraction
 3. Transformer Model: Processes high-level features for improved pattern recognition
 4. MoE: Mixture of Experts model that combines multiple neural networks
 """
 __version__ = '0.1.0'
 __author__ = 'Gogo2 Project' 
--- a/NN/data/init.py
+++ b/NN/data/init.py
@ -0,0 +1,11 @@
 """
 Neural Network Data
 =================
 This package is used to store datasets and model outputs.
 It does not contain any code, but serves as a storage location for:
 - Training datasets
 - Evaluation results
 - Inference outputs
 - Model checkpoints
 """ 
--- a/NN/example.py
+++ b/NN/example.py
@ -0,0 +1,261 @@
 #!/usr/bin/env python
 """
 Example script for the Neural Network Trading System
 This shows basic usage patterns for the system components
 """
 import os
 import sys
 import numpy as np
 import pandas as pd
 import tensorflow as tf
 import matplotlib.pyplot as plt
 from datetime import datetime
 import logging
 # Add project root to path
 sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 # Import components
 from NN.utils.data_interface import DataInterface
 from NN.models.cnn_model import CNNModel
 from NN.models.transformer_model import TransformerModel, MixtureOfExpertsModel
 from NN.main import NeuralNetworkOrchestrator
 # Configure logging
 logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
 )
 logger = logging.getLogger('example')
 def example_data_interface():
    """Show how to use the data interface"""
    logger.info("=== Data Interface Example ===")
    # Initialize data interface
    di = DataInterface(symbol="BTC/USDT", timeframes=['1h', '4h', '1d'])
    # Get historical data
    df_1h = di.get_historical_data(timeframe='1h', n_candles=100)
    if df_1h is not None and not df_1h.empty:
        logger.info(f"Retrieved {len(df_1h)} 1-hour candles")
        logger.info(f"Most recent candle: {df_1h.iloc[-1]}")
    # Prepare data for neural network
    X, y, timestamps = di.prepare_nn_input(timeframes=['1h'], n_candles=500, window_size=20)
    if X is not None and y is not None:
        logger.info(f"Prepared input shape: {X.shape}, target shape: {y.shape}")
    # Generate a dataset
    dataset = di.generate_training_dataset(
        timeframes=['1h', '4h'],
        n_candles=1000,
        window_size=20
    )
    if dataset:
        logger.info(f"Dataset generated and saved to: {list(dataset.values())}")
    return X, y, timestamps if X is not None else (None, None, None)
 def example_cnn_model(X=None, y=None):
    """Show how to use the CNN model"""
    logger.info("=== CNN Model Example ===")
    # If no data provided, create dummy data
    if X is None or y is None:
        logger.info("Creating dummy data for CNN example")
        X = np.random.random((1000, 20, 5))  # 1000 samples, 20 time steps, 5 features
        y = np.random.randint(0, 2, size=(1000,))  # Binary labels
    # Split data into training and testing sets
    from sklearn.model_selection import train_test_split
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    # Initialize and build the CNN model
    cnn = CNNModel(input_shape=(20, 5), output_size=1, model_dir='NN/models/saved')
    cnn.build_model(filters=(32, 64, 128), kernel_sizes=(3, 5, 7), dropout_rate=0.3)
    # Train the model (very small number of epochs for this example)
    history = cnn.train(
        X_train, y_train,
        batch_size=32,
        epochs=5,  # Just a few epochs for the example
        validation_split=0.2
    )
    # Evaluate the model
    metrics = cnn.evaluate(X_test, y_test, plot_results=True)
    if metrics:
        logger.info(f"CNN Evaluation metrics: {metrics}")
    # Make a prediction
    y_pred, y_proba = cnn.predict(X_test[:1])
    logger.info(f"CNN Prediction: {y_pred[0]}, Probability: {y_proba[0]:.4f}")
    return cnn
 def example_transformer_model(X=None, y=None, cnn_model=None):
    """Show how to use the Transformer model"""
    logger.info("=== Transformer Model Example ===")
    # If no data provided, create dummy data
    if X is None or y is None:
        logger.info("Creating dummy data for Transformer example")
        X = np.random.random((1000, 20, 5))  # 1000 samples, 20 time steps, 5 features
        y = np.random.randint(0, 2, size=(1000,))  # Binary labels
    # Generate high-level features (from CNN model or random if no CNN provided)
    if cnn_model is not None and hasattr(cnn_model, 'extract_hidden_features'):
        # Extract features from CNN model
        X_features = cnn_model.extract_hidden_features(X)
        logger.info(f"Extracted {X_features.shape[1]} features from CNN model")
    else:
        # Generate random features
        X_features = np.random.random((len(X), 128))
        logger.info("Generated random features for Transformer model")
    # Split data into training and testing sets
    from sklearn.model_selection import train_test_split
    X_train, X_test, X_feat_train, X_feat_test, y_train, y_test = train_test_split(
        X, X_features, y, test_size=0.2, random_state=42
    )
    # Initialize and build the Transformer model
    transformer = TransformerModel(
        ts_input_shape=(20, 5),
        feature_input_shape=X_features.shape[1],
        output_size=1,
        model_dir='NN/models/saved'
    )
    transformer.build_model(
        embed_dim=32,
        num_heads=2,
        ff_dim=64,
        num_transformer_blocks=2,
        dropout_rate=0.2
    )
    # Train the model (very small number of epochs for this example)
    history = transformer.train(
        X_train, X_feat_train, y_train,
        batch_size=32,
        epochs=5,  # Just a few epochs for the example
        validation_split=0.2
    )
    # Make a prediction
    y_pred, y_proba = transformer.predict(X_test[:1], X_feat_test[:1])
    logger.info(f"Transformer Prediction: {y_pred[0]}, Probability: {y_proba[0]:.4f}")
    return transformer
 def example_moe_model(X=None, y=None, cnn_model=None, transformer_model=None):
    """Show how to use the Mixture of Experts model"""
    logger.info("=== Mixture of Experts Example ===")
    # If no data provided, create dummy data
    if X is None or y is None:
        logger.info("Creating dummy data for MoE example")
        X = np.random.random((1000, 20, 5))  # 1000 samples, 20 time steps, 5 features
        y = np.random.randint(0, 2, size=(1000,))  # Binary labels
    # If models not provided, create them
    if cnn_model is None:
        logger.info("Creating a new CNN model for MoE")
        cnn_model = CNNModel(input_shape=(20, 5), output_size=1)
        cnn_model.build_model()
    if transformer_model is None:
        logger.info("Creating a new Transformer model for MoE")
        transformer_model = TransformerModel(ts_input_shape=(20, 5), feature_input_shape=128, output_size=1)
        transformer_model.build_model()
    # Initialize MoE model
    moe = MixtureOfExpertsModel(output_size=1, model_dir='NN/models/saved')
    # Add expert models
    moe.add_expert('cnn', cnn_model)
    moe.add_expert('transformer', transformer_model)
    # Build the MoE model (this is a simplified implementation - in a real scenario
    # you would need to handle the interfaces between models more carefully)
    moe.build_model(
        ts_input_shape=(20, 5),
        expert_weights={'cnn': 0.7, 'transformer': 0.3}
    )
    # In a real implementation, you would train the MoE model here
    logger.info("MoE model built - in a real implementation, you would train it here")
    return moe
 def example_orchestrator():
    """Show how to use the Orchestrator"""
    logger.info("=== Orchestrator Example ===")
    # Configure the orchestrator
    config = {
        'symbol': 'BTC/USDT',
        'timeframes': ['1h', '4h'],
        'window_size': 20,
        'n_features': 5,
        'output_size': 3,  # BUY/HOLD/SELL
        'batch_size': 32,
        'epochs': 5,  # Small number for example
        'model_dir': 'NN/models/saved',
        'data_dir': 'NN/data'
    }
    # Initialize the orchestrator
    orchestrator = NeuralNetworkOrchestrator(config)
    # Prepare training data
    X, y, timestamps = orchestrator.prepare_training_data(
        timeframes=['1h'],
        n_candles=200
    )
    if X is not None and y is not None:
        logger.info(f"Prepared training data: X shape {X.shape}, y shape {y.shape}")
        # Train CNN model
        logger.info("Training CNN model with orchestrator...")
        history = orchestrator.train_cnn_model(X, y, epochs=2)  # Very small for example
        # Make a prediction
        result = orchestrator.run_inference_pipeline(
            model_type='cnn',
            timeframe='1h'
        )
        if result:
            logger.info(f"Inference result: {result}")
    else:
        logger.warning("Could not prepare training data - this is expected if no real data is available")
        logger.info("The orchestrator would normally handle training and inference")
 def main():
    """Run all examples"""
    logger.info("Starting Neural Network Trading System Examples")
    # Example 1: Data Interface
    X, y, timestamps = example_data_interface()
    # Example 2: CNN Model
    cnn_model = example_cnn_model(X, y)
    # Example 3: Transformer Model
    transformer_model = example_transformer_model(X, y, cnn_model)
    # Example 4: Mixture of Experts
    moe_model = example_moe_model(X, y, cnn_model, transformer_model)
    # Example 5: Orchestrator
    example_orchestrator()
    logger.info("Examples completed")
 if __name__ == "__main__":
    main() 
--- a/NN/models/init.py
+++ b/NN/models/init.py
@ -0,0 +1,14 @@
 """
 Neural Network Models
 ====================
 This package contains the neural network models used in the trading system:
 - CNN Model: Deep convolutional neural network for feature extraction
 - Transformer Model: Processes high-level features for improved pattern recognition 
 - MoE: Mixture of Experts model that combines multiple neural networks
 """
 from NN.models.cnn_model import CNNModel
 from NN.models.transformer_model import TransformerModel, TransformerBlock, MixtureOfExpertsModel
 __all__ = ['CNNModel', 'TransformerModel', 'TransformerBlock', 'MixtureOfExpertsModel'] 
--- a/NN/models/transformer_model.py
+++ b/NN/models/transformer_model.py
@ -0,0 +1,553 @@
 import os
 import sys
 import numpy as np
 import pandas as pd
 import tensorflow as tf
 from tensorflow.keras.models import Model
 from tensorflow.keras.layers import (
    Input, Dense, Dropout, LayerNormalization, MultiHeadAttention, 
    GlobalAveragePooling1D, Concatenate, Add, Activation, Flatten
 )
 from tensorflow.keras.optimizers import Adam
 from tensorflow.keras.callbacks import (
    EarlyStopping, ModelCheckpoint, ReduceLROnPlateau, 
    TensorBoard, CSVLogger
 )
 import matplotlib.pyplot as plt
 import logging
 import time
 import datetime
 # Configure logging
 logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.StreamHandler(),
        logging.FileHandler('nn_transformer_model.log')
    ]
 )
 logger = logging.getLogger('transformer_model')
 class TransformerBlock(tf.keras.layers.Layer):
    """
    Transformer block with multi-head self-attention and feed-forward network
    """
    def __init__(self, embed_dim, num_heads, ff_dim, rate=0.1):
        super(TransformerBlock, self).__init__()
        self.att = MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)
        self.ffn = tf.keras.Sequential([
            Dense(ff_dim, activation="relu"),
            Dense(embed_dim)
        ])
        self.layernorm1 = LayerNormalization(epsilon=1e-6)
        self.layernorm2 = LayerNormalization(epsilon=1e-6)
        self.dropout1 = Dropout(rate)
        self.dropout2 = Dropout(rate)
    def call(self, inputs, training=False):
        # Normalization and attention
        attn_output = self.att(inputs, inputs)
        attn_output = self.dropout1(attn_output, training=training)
        out1 = self.layernorm1(inputs + attn_output)
        # Feed-forward network
        ffn_output = self.ffn(out1)
        ffn_output = self.dropout2(ffn_output, training=training)
        # Skip connection and normalization
        return self.layernorm2(out1 + ffn_output)
 class TransformerModel:
    """
    Transformer-based model for financial time series analysis.
    This model processes both raw time series data and high-level features from the CNN model.
    """
    def __init__(self, ts_input_shape=(20, 5), feature_input_shape=128, output_size=3, model_dir='NN/models/saved'):
        """
        Initialize the Transformer model
        Args:
            ts_input_shape: Shape of time series input data (sequence_length, features)
            feature_input_shape: Shape of high-level feature input (from CNN)
            output_size: Number of output classes or values
            model_dir: Directory to save model files
        """
        self.ts_input_shape = ts_input_shape
        self.feature_input_shape = feature_input_shape
        self.output_size = output_size
        self.model_dir = model_dir
        self.model = None
        self.history = None
        # Create model directory if it doesn't exist
        os.makedirs(model_dir, exist_ok=True)
        logger.info(f"Initialized TransformerModel with time series input shape {ts_input_shape}, "
                   f"feature input shape {feature_input_shape}, and output size {output_size}")
    def build_model(self, embed_dim=64, num_heads=4, ff_dim=128, num_transformer_blocks=2, 
                   dropout_rate=0.2, learning_rate=0.001):
        """
        Build the Transformer model architecture
        Args:
            embed_dim: Embedding dimension for the transformer
            num_heads: Number of attention heads
            ff_dim: Hidden layer size in the feed-forward network
            num_transformer_blocks: Number of transformer blocks to stack
            dropout_rate: Dropout rate for regularization
            learning_rate: Learning rate for the optimizer
        Returns:
            Compiled Keras model
        """
        # Time series input (price and volume data)
        ts_inputs = Input(shape=self.ts_input_shape, name='time_series_input')
        # High-level feature input (from CNN or other sources)
        feature_inputs = Input(shape=(self.feature_input_shape,), name='feature_input')
        # Process time series with transformer blocks
        x = ts_inputs
        for _ in range(num_transformer_blocks):
            x = TransformerBlock(embed_dim, num_heads, ff_dim, dropout_rate)(x)
        # Global pooling to get fixed-size representation
        x = GlobalAveragePooling1D()(x)
        # Combine with the high-level features
        combined = Concatenate()([x, feature_inputs])
        # Dense layers
        dense1 = Dense(128, activation='relu')(combined)
        dropout1 = Dropout(dropout_rate)(dense1)
        dense2 = Dense(64, activation='relu')(dropout1)
        dropout2 = Dropout(dropout_rate)(dense2)
        # Output layer
        if self.output_size == 1:
            # Binary classification
            outputs = Dense(1, activation='sigmoid')(dropout2)
        elif self.output_size == 3:
            # For BUY/HOLD/SELL signals (3 classes)
            outputs = Dense(3, activation='softmax')(dropout2)
        else:
            # Regression or multi-class classification
            outputs = Dense(self.output_size, activation='linear')(dropout2)
        # Create and compile the model
        model = Model(inputs=[ts_inputs, feature_inputs], outputs=outputs)
        if self.output_size == 1:
            # Binary classification
            model.compile(
                optimizer=Adam(learning_rate=learning_rate),
                loss='binary_crossentropy',
                metrics=['accuracy']
            )
        elif self.output_size == 3:
            # Multi-class classification for BUY/HOLD/SELL
            model.compile(
                optimizer=Adam(learning_rate=learning_rate),
                loss='categorical_crossentropy',
                metrics=['accuracy']
            )
        else:
            # Regression
            model.compile(
                optimizer=Adam(learning_rate=learning_rate),
                loss='mse',
                metrics=['mae']
            )
        self.model = model
        logger.info(f"Model built with {model.count_params()} parameters")
        model.summary(print_fn=logger.info)
        return model
    def train(self, X_ts, X_features, y, batch_size=32, epochs=100, validation_split=0.2,
             early_stopping_patience=20, reduce_lr_patience=10, verbose=1):
        """
        Train the Transformer model
        Args:
            X_ts: Time series input data
            X_features: High-level feature input data
            y: Target values
            batch_size: Batch size for training
            epochs: Maximum number of epochs
            validation_split: Fraction of data to use for validation
            early_stopping_patience: Patience for early stopping
            reduce_lr_patience: Patience for learning rate reduction
            verbose: Verbosity level
        Returns:
            Training history
        """
        if self.model is None:
            logger.warning("Model not built yet, building with default parameters")
            self.build_model()
        # Create a timestamp for this training run
        timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
        model_name = f"transformer_model_{timestamp}"
        # Set up callbacks
        callbacks = [
            # Early stopping to prevent overfitting
            EarlyStopping(
                monitor='val_loss', 
                patience=early_stopping_patience,
                restore_best_weights=True,
                verbose=1
            ),
            # Reduce learning rate when training plateaus
            ReduceLROnPlateau(
                monitor='val_loss', 
                factor=0.5,
                patience=reduce_lr_patience, 
                min_lr=1e-6,
                verbose=1
            ),
            # Save the best model
            ModelCheckpoint(
                filepath=os.path.join(self.model_dir, f"{model_name}_best.h5"),
                monitor='val_loss',
                save_best_only=True,
                verbose=1
            ),
            # TensorBoard logging
            TensorBoard(
                log_dir=os.path.join(self.model_dir, 'logs', model_name),
                histogram_freq=1
            ),
            # CSV logging
            CSVLogger(
                filename=os.path.join(self.model_dir, f"{model_name}_training.csv"),
                separator=',', 
                append=False
            )
        ]
        # Train the model
        logger.info(f"Starting training with {len(X_ts)} samples, {epochs} max epochs")
        start_time = time.time()
        history = self.model.fit(
            [X_ts, X_features], y,
            batch_size=batch_size,
            epochs=epochs,
            validation_split=validation_split,
            callbacks=callbacks,
            verbose=verbose
        )
        # Calculate training time
        training_time = time.time() - start_time
        logger.info(f"Training completed in {training_time:.2f} seconds")
        # Save the final model
        self.model.save(os.path.join(self.model_dir, f"{model_name}_final.h5"))
        logger.info(f"Model saved to {os.path.join(self.model_dir, model_name + '_final.h5')}")
        # Save training history
        hist_df = pd.DataFrame(history.history)
        hist_df.to_csv(os.path.join(self.model_dir, f"{model_name}_history.csv"), index=False)
        self.history = history
        return history
    def predict(self, X_ts, X_features, threshold=0.5):
        """
        Make predictions with the model
        Args:
            X_ts: Time series input data
            X_features: High-level feature input data
            threshold: Threshold for binary classification
        Returns:
            Predicted values or classes
        """
        if self.model is None:
            logger.error("Model not built or trained yet")
            return None
        # Get raw predictions
        y_pred_proba = self.model.predict([X_ts, X_features])
        # Format predictions based on output type
        if self.output_size == 1:
            # Binary classification
            y_pred = (y_pred_proba > threshold).astype(int).flatten()
            return y_pred, y_pred_proba.flatten()
        elif self.output_size == 3:
            # Multi-class (BUY/HOLD/SELL)
            y_pred = np.argmax(y_pred_proba, axis=1)
            return y_pred, y_pred_proba
        else:
            # Regression
            return y_pred_proba
    def save_model(self, filepath=None):
        """
        Save the model to a file
        Args:
            filepath: Path to save the model to
        Returns:
            Path to the saved model
        """
        if self.model is None:
            logger.error("Model not built or trained yet")
            return None
        if filepath is None:
            # Create a default filepath
            timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
            filepath = os.path.join(self.model_dir, f"transformer_model_{timestamp}.h5")
        self.model.save(filepath)
        logger.info(f"Model saved to {filepath}")
        return filepath
    def load_model(self, filepath):
        """
        Load a model from a file
        Args:
            filepath: Path to load the model from
        Returns:
            Loaded model
        """
        try:
            self.model = tf.keras.models.load_model(filepath)
            logger.info(f"Model loaded from {filepath}")
            return self.model
        except Exception as e:
            logger.error(f"Error loading model: {str(e)}")
            return None
 class MixtureOfExpertsModel:
    """
    Mixture of Experts (MoE) model that combines predictions from multiple models.
    This implementation focuses on combining CNN and Transformer models for financial analysis.
    """
    def __init__(self, output_size=3, model_dir='NN/models/saved'):
        """
        Initialize the MoE model
        Args:
            output_size: Number of output classes or values
            model_dir: Directory to save model files
        """
        self.output_size = output_size
        self.model_dir = model_dir
        self.models = {}  # Dictionary to store expert models
        self.gating_model = None  # Model to determine which expert to use
        self.model = None  # Combined MoE model
        # Create model directory if it doesn't exist
        os.makedirs(model_dir, exist_ok=True)
        logger.info(f"Initialized MixtureOfExpertsModel with output size {output_size}")
    def add_expert(self, name, model):
        """
        Add an expert model to the MoE
        Args:
            name: Name of the expert
            model: Expert model instance
        Returns:
            None
        """
        self.models[name] = model
        logger.info(f"Added expert model '{name}' to MoE")
    def build_model(self, ts_input_shape=(20, 5), expert_weights=None, learning_rate=0.001):
        """
        Build the MoE model architecture
        Args:
            ts_input_shape: Shape of time series input data
            expert_weights: Dictionary of expert weights (if None, equal weighting)
            learning_rate: Learning rate for the optimizer
        Returns:
            Compiled Keras model
        """
        if not self.models:
            logger.error("No expert models added to MoE")
            return None
        # Time series input
        ts_inputs = Input(shape=ts_input_shape, name='time_series_input')
        # Get predictions from each expert
        expert_outputs = []
        expert_names = []
        for name, model in self.models.items():
            if hasattr(model, 'predict') and callable(model.predict):
                expert_names.append(name)
                if name == 'cnn':
                    # For CNN, we directly use the time series input
                    # We need to extract the raw prediction function from the model's predict method
                    # which typically returns both predictions and probabilities
                    expert_outputs.append(model.model(ts_inputs))
                elif name == 'transformer':
                    # For transformer, we need features from the CNN as well
                    # This is a simplification - in a real implementation, we would need to
                    # extract features from the CNN model and pass them to the transformer
                    # Here we just create dummy features
                    dummy_features = Dense(128, activation='relu')(Flatten()(ts_inputs))
                    expert_outputs.append(model.model([ts_inputs, dummy_features]))
                else:
                    logger.warning(f"Unknown model type: {name}, skipping")
        if not expert_outputs:
            logger.error("No valid expert models found")
            return None
        # Use expert weighting
        if expert_weights is None:
            # Equal weighting
            weights = [1.0 / len(expert_outputs)] * len(expert_outputs)
        else:
            # User-provided weights
            weights = [expert_weights.get(name, 1.0 / len(expert_outputs)) for name in expert_names]
            # Normalize weights
            weights = [w / sum(weights) for w in weights]
        # Combine expert outputs using weighted average
        if len(expert_outputs) == 1:
            # Only one expert, use its output directly
            combined_output = expert_outputs[0]
        else:
            # Multiple experts, compute weighted average
            weighted_outputs = [output * weight for output, weight in zip(expert_outputs, weights)]
            combined_output = Add()(weighted_outputs)
        # Create the MoE model
        moe_model = Model(inputs=ts_inputs, outputs=combined_output)
        # Compile the model
        if self.output_size == 1:
            # Binary classification
            moe_model.compile(
                optimizer=Adam(learning_rate=learning_rate),
                loss='binary_crossentropy',
                metrics=['accuracy']
            )
        elif self.output_size == 3:
            # Multi-class classification for BUY/HOLD/SELL
            moe_model.compile(
                optimizer=Adam(learning_rate=learning_rate),
                loss='categorical_crossentropy',
                metrics=['accuracy']
            )
        else:
            # Regression
            moe_model.compile(
                optimizer=Adam(learning_rate=learning_rate),
                loss='mse',
                metrics=['mae']
            )
        self.model = moe_model
        logger.info(f"MoE model built with experts: {expert_names}, weights: {weights}")
        moe_model.summary(print_fn=logger.info)
        return moe_model
    def predict(self, X, threshold=0.5):
        """
        Make predictions with the MoE model
        Args:
            X: Input data
            threshold: Threshold for binary classification
        Returns:
            Predicted values or classes
        """
        if self.model is None:
            logger.error("MoE model not built yet")
            return None
        # Get raw predictions
        y_pred_proba = self.model.predict(X)
        # Format predictions based on output type
        if self.output_size == 1:
            # Binary classification
            y_pred = (y_pred_proba > threshold).astype(int).flatten()
            return y_pred, y_pred_proba.flatten()
        elif self.output_size == 3:
            # Multi-class (BUY/HOLD/SELL)
            y_pred = np.argmax(y_pred_proba, axis=1)
            return y_pred, y_pred_proba
        else:
            # Regression
            return y_pred_proba
    def save_model(self, filepath=None):
        """
        Save the MoE model to a file
        Args:
            filepath: Path to save the model to
        Returns:
            Path to the saved model
        """
        if self.model is None:
            logger.error("MoE model not built yet")
            return None
        if filepath is None:
            # Create a default filepath
            timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
            filepath = os.path.join(self.model_dir, f"moe_model_{timestamp}.h5")
        self.model.save(filepath)
        logger.info(f"MoE model saved to {filepath}")
        return filepath
    def load_model(self, filepath):
        """
        Load an MoE model from a file
        Args:
            filepath: Path to load the model from
        Returns:
            Loaded model
        """
        try:
            self.model = tf.keras.models.load_model(filepath)
            logger.info(f"MoE model loaded from {filepath}")
            return self.model
        except Exception as e:
            logger.error(f"Error loading MoE model: {str(e)}")
            return None
 # Example usage:
 if __name__ == "__main__":
    # This would be a complete implementation in a real system
    print("Transformer and MoE models defined, but not implemented here.") 
--- a/NN/requirements.txt
+++ b/NN/requirements.txt
@ -0,0 +1,13 @@
 tensorflow>=2.5.0
 numpy>=1.19.5
 pandas>=1.3.0
 matplotlib>=3.4.2
 scikit-learn>=0.24.2
 tensorflow-addons>=0.13.0
 plotly>=5.1.0
 h5py>=3.1.0
 tqdm>=4.61.1
 pyyaml>=5.4.1
 tensorboard>=2.5.0
 ccxt>=1.50.0
 requests>=2.25.1 
--- a/NN/utils/init.py
+++ b/NN/utils/init.py
@ -0,0 +1,11 @@
 """
 Neural Network Utilities
 ======================
 This package contains utility functions and classes used in the neural network trading system:
 - Data Interface: Connects to realtime trading data and processes it for the neural network models
 """
 from NN.utils.data_interface import DataInterface
 __all__ = ['DataInterface']