# Data Stream Monitor

A comprehensive system for capturing and streaming all model input data in console-friendly text format, suitable for snapshots, training, and replay functionality.

## Overview

The Data Stream Monitor captures real-time data flows through the trading system and outputs them in two formats:

- **Detailed**: Human-readable format with clear sections
- **Compact**: JSON format for programmatic processing

## Data Streams Captured

### Market Data

- **OHLCV Data**: Multi-timeframe candlestick data (1m, 5m, 15m)
- **Tick Data**: Real-time trade ticks with price, volume, and side
- **COB Data**: Consolidated Order Book snapshots with imbalance and spread metrics

### Model Data

- **Technical Indicators**: RSI, MACD, Bollinger Bands, etc.
- **Model States**: Current state vectors for each model (DQN, CNN, RL)
- **Predictions**: Recent predictions from all active models
- **Training Experiences**: State-action-reward tuples from RL training

## Quick Start

### 1. Start the Dashboard

```bash
source venv/bin/activate
python run_clean_dashboard.py
```

### 2. Start Data Streaming

```bash
python data_stream_control.py start
```

### 3. Control Streaming

```bash
# Check status
python data_stream_control.py status

# Switch to compact format
python data_stream_control.py compact

# Save current snapshot
python data_stream_control.py snapshot

# Stop streaming
python data_stream_control.py stop
```

## Output Formats

### Detailed Format

```
================================================================================
DATA STREAM SAMPLE - 14:30:15
================================================================================
OHLCV (1m): ETH/USDT | O:4335.67 H:4338.92 L:4334.21 C:4336.67 V:125.8
TICK: ETH/USDT | Price:4336.67 Vol:0.0456 Side:buy
COB: ETH/USDT | Imbalance:0.234 Spread:2.3bps Mid:4336.67
DQN State: 15 features | Price:4336.67
DQN Prediction: BUY (conf:0.78)
Training Exp: Action:1 Reward:0.0234 Done:False
================================================================================
```

### Compact Format

```json
{"timestamp":"2024-01-15T14:30:15","ohlcv_count":5,"ticks_count":12,"cob_count":8,"predictions_count":3,"experiences_count":7,"price":4336.67,"volume":125.8,"imbalance":0.234,"spread_bps":2.3}
```

## Files

### Core Components

- `data_stream_monitor.py` - Main streaming engine
- `data_stream_control.py` - Command-line control interface
- `demo_data_stream.py` - Usage examples and demo

### Integration Points

- `run_clean_dashboard.py` - Auto-initializes streaming
- `core/orchestrator.py` - Provides prediction data
- `NN/training/enhanced_realtime_training.py` - Provides training data

## Configuration

The streaming system is configurable via the `stream_config` dictionary:

```python
stream_config = {
    'console_output': True,          # Enable/disable console output
    'compact_format': False,         # Use compact JSON format
    'include_timestamps': True,      # Include timestamps in output
    'filter_symbols': ['ETH/USDT'],  # Symbols to monitor
    'sampling_rate': 1.0             # Sampling rate in seconds
}
```

## Use Cases

### Training Data Collection

- Capture real market conditions during training
- Build datasets for offline model validation
- Replay specific market scenarios

### Debugging and Monitoring

- Monitor model input data in real-time
- Debug prediction inconsistencies
- Validate data pipeline integrity

### Snapshot and Replay

- Save complete system state for later analysis
- Replay specific time periods
- Compare model behavior across different market conditions

## Technical Details

### Data Collection

- **Thread-safe**: Uses separate thread for data collection
- **Memory-efficient**: Configurable buffer sizes with automatic cleanup
- **Error-resilient**: Continues streaming even if individual data sources fail

### Integration

- **Non-intrusive**: Doesn't affect main trading system performance
- **Optional**: Can be disabled without affecting core functionality
- **Extensible**: Easy to add new data streams

### Performance

- **Low overhead**: Minimal CPU and memory usage
- **Configurable sampling**: Adjust sampling rate based on needs
- **Efficient storage**: Circular buffers prevent memory leaks

## Command Reference

| Command | Description |
|---------|-------------|
| `start` | Start data streaming |
| `stop` | Stop data streaming |
| `status` | Show current status and buffer sizes |
| `snapshot` | Save current data snapshot to file |
| `compact` | Switch to compact JSON format |
| `detailed` | Switch to detailed human-readable format |

## Troubleshooting

### Streaming Not Starting

- Ensure dashboard is running first
- Check that venv is activated
- Verify `data_stream_monitor.py` is in project root

### No Data Output

- Check streaming status with `python data_stream_control.py status`
- Verify market data is available (check dashboard logs)
- Ensure models are active and making predictions

### Performance Issues

- Reduce sampling rate in `stream_config`
- Switch to compact format for less output
- Decrease buffer sizes if memory is limited

## Future Enhancements

- **File output**: Save streaming data to rotating log files
- **WebSocket output**: Stream data to external consumers
- **Compression**: Automatic compression for long-term storage
- **Filtering**: Advanced filtering based on market conditions
- **Metrics**: Built-in performance metrics and statistics
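
## Appendix: Buffer and Compact-Format Sketch

The circular buffers and the compact JSON format described above can be illustrated with a minimal sketch. This is not the code in `data_stream_monitor.py`: the class `StreamBuffers`, the function `to_compact`, the buffer size, and the dictionary keys of the buffered items (`close`, `volume`, `imbalance`, `spread_bps`) are all assumptions made for illustration. Only the keys of the emitted record follow the compact format shown earlier.

```python
import json
from collections import deque
from datetime import datetime


class StreamBuffers:
    """Hypothetical container: one bounded buffer per data stream.

    A deque with maxlen drops its oldest item when full, so memory
    use stays bounded no matter how long streaming runs.
    """

    def __init__(self, maxlen=100):  # maxlen is an assumed default
        self.ohlcv = deque(maxlen=maxlen)
        self.ticks = deque(maxlen=maxlen)
        self.cob = deque(maxlen=maxlen)
        self.predictions = deque(maxlen=maxlen)
        self.experiences = deque(maxlen=maxlen)


def to_compact(buffers, timestamp):
    """Summarize the current buffer contents as one compact JSON line."""
    # Use the most recent bar and order book snapshot, if any exist.
    last_bar = buffers.ohlcv[-1] if buffers.ohlcv else {}
    last_cob = buffers.cob[-1] if buffers.cob else {}
    record = {
        "timestamp": timestamp.isoformat(timespec="seconds"),
        "ohlcv_count": len(buffers.ohlcv),
        "ticks_count": len(buffers.ticks),
        "cob_count": len(buffers.cob),
        "predictions_count": len(buffers.predictions),
        "experiences_count": len(buffers.experiences),
        "price": last_bar.get("close"),
        "volume": last_bar.get("volume"),
        "imbalance": last_cob.get("imbalance"),
        "spread_bps": last_cob.get("spread_bps"),
    }
    # Compact separators remove the spaces json.dumps adds by default.
    return json.dumps(record, separators=(",", ":"))


# Usage: five ticks go into a buffer of size 3, so only the last 3 remain.
buffers = StreamBuffers(maxlen=3)
for i in range(5):
    buffers.ticks.append({"price": 4336.67 + i, "volume": 0.0456, "side": "buy"})
buffers.ohlcv.append({"close": 4336.67, "volume": 125.8})
buffers.cob.append({"imbalance": 0.234, "spread_bps": 2.3})

line = to_compact(buffers, datetime(2024, 1, 15, 14, 30, 15))
print(line)
```

Using `deque(maxlen=...)` keeps the eviction logic out of the collector loop. A real collector running in a separate thread would most likely also need a lock around reads that span several buffers, which this sketch omits.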