# Market Data Infrastructure Docker Setup

This directory contains Docker Compose configurations and scripts for deploying the TimescaleDB and Redis infrastructure used by the multi-exchange data aggregation system.

## 🏗️ Architecture

- **TimescaleDB**: Time-series database optimized for high-frequency market data
- **Redis**: High-performance caching layer for real-time data
- **Network**: Isolated Docker network for secure communication

## 📋 Prerequisites

- Docker Engine 20.10+
- Docker Compose 2.0+
- At least 4GB RAM available for containers
- 50GB+ disk space for data storage

## 🚀 Quick Start

1. **Copy environment file**:
   ```bash
   cp .env.example .env
   ```

2. **Edit configuration** (update passwords and settings):
   ```bash
   nano .env
   ```

3. **Deploy infrastructure**:
   ```bash
   chmod +x deploy.sh
   ./deploy.sh
   ```

4. **Verify deployment**:
   ```bash
   docker-compose -f timescaledb-compose.yml ps
   ```

## 📁 File Structure

```
docker/
├── timescaledb-compose.yml    # Main Docker Compose configuration
├── init-scripts/              # Database initialization scripts
│   └── 01-init-timescaledb.sql
├── redis.conf                 # Redis configuration
├── .env                       # Environment variables
├── deploy.sh                  # Deployment script
├── backup.sh                  # Backup script
├── restore.sh                 # Restore script
└── README.md                  # This file
```

## ⚙️ Configuration

### Environment Variables

Key variables in `.env`:

```bash
# Database credentials
POSTGRES_PASSWORD=your_secure_password
POSTGRES_USER=market_user
POSTGRES_DB=market_data

# Redis settings
REDIS_PASSWORD=your_redis_password

# Performance tuning
POSTGRES_SHARED_BUFFERS=256MB
POSTGRES_EFFECTIVE_CACHE_SIZE=1GB
REDIS_MAXMEMORY=2gb
```

### TimescaleDB Configuration

The database is pre-configured with:

- Optimized PostgreSQL settings for time-series data
- TimescaleDB extension enabled
- Hypertables for automatic partitioning
- Retention policies (90 days for raw data)
- Continuous aggregates for common queries
- Indexes for query performance

### Redis Configuration

Redis is configured for:

- High-frequency data caching
- Memory optimization (2GB limit)
- Persistence with AOF and RDB
- Order book data structures

## 🔌 Connection Details

After deployment, connect using:

### TimescaleDB

```
Host: 192.168.0.10
Port: 5432
Database: market_data
Username: market_user
Password: (from .env file)
```

### Redis

```
Host: 192.168.0.10
Port: 6379
Password: (from .env file)
```

## 🗄️ Database Schema

The system creates the following tables:

- `order_book_snapshots`: Real-time order book data
- `trade_events`: Individual trade events
- `heatmap_data`: Aggregated price bucket data
- `ohlcv_data`: OHLCV candlestick data
- `exchange_status`: Exchange connection monitoring
- `system_metrics`: System performance metrics

## 💾 Backup & Restore

### Create Backup

```bash
chmod +x backup.sh
./backup.sh
```

Backups are stored in `./backups/` with a timestamp in the filename.

### Restore from Backup

```bash
chmod +x restore.sh
./restore.sh market_data_backup_YYYYMMDD_HHMMSS.tar.gz
```

### Automated Backups

Set up a cron job for regular backups:

```bash
# Daily backup at 2 AM
0 2 * * * /path/to/docker/backup.sh
```

## 📊 Monitoring

### Health Checks

Check service health:

```bash
# TimescaleDB
docker exec market_data_timescaledb pg_isready -U market_user -d market_data

# Redis
docker exec market_data_redis redis-cli -a your_password ping
```

### View Logs

```bash
# All services
docker-compose -f timescaledb-compose.yml logs -f

# Specific service
docker-compose -f timescaledb-compose.yml logs -f timescaledb
```

### Database Queries

Connect to TimescaleDB:

```bash
docker exec -it market_data_timescaledb psql -U market_user -d market_data
```

Example queries:

```sql
-- Check table sizes
SELECT
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE schemaname = 'market_data';

-- Recent order book data
SELECT * FROM market_data.order_book_snapshots
ORDER BY timestamp DESC LIMIT 10;

-- Exchange status
SELECT * FROM market_data.exchange_status
ORDER BY timestamp DESC LIMIT 10;
```

## 🔧 Maintenance

### Update Images

```bash
docker-compose -f timescaledb-compose.yml pull
docker-compose -f timescaledb-compose.yml up -d
```

### Clean Up Old Data

```bash
# TimescaleDB has automatic retention policies
# Manual cleanup if needed:
docker exec market_data_timescaledb psql -U market_user -d market_data -c "
  SELECT drop_chunks('market_data.order_book_snapshots', INTERVAL '30 days');
"
```

### Scale Resources

Edit `timescaledb-compose.yml` to adjust:

- Memory limits
- CPU limits
- Shared buffers
- Connection limits

## 🚨 Troubleshooting

### Common Issues

1. **Port conflicts**: Change the ports in the compose file if 5432 or 6379 are already in use
2. **Memory issues**: Reduce `shared_buffers` and Redis `maxmemory`
3. **Disk space**: Monitor `/var/lib/docker/volumes/` usage
4. **Connection refused**: Check firewall settings and container status

### Performance Tuning

1. **TimescaleDB**:
   - Adjust `shared_buffers` based on available RAM
   - Tune `effective_cache_size` to 75% of system RAM
   - Monitor query performance with `pg_stat_statements`

2. **Redis**:
   - Adjust `maxmemory` based on data volume
   - Monitor memory usage with `INFO memory`
   - Use an eviction policy that matches the workload

### Recovery Procedures

1. **Container failure**: `docker-compose -f timescaledb-compose.yml restart <service>`
2. **Data corruption**: Restore from the latest backup
3. **Network issues**: Check the Docker network configuration
4. **Performance degradation**: Review logs and system metrics

## 🔐 Security

- Change default passwords in `.env`
- Use strong passwords (20+ characters)
- Restrict network access to trusted IPs
- Apply security updates regularly
- Monitor access logs
- Enable SSL/TLS for production

## 📞 Support

For issues related to:

- TimescaleDB: Check [TimescaleDB docs](https://docs.timescale.com/)
- Redis: Check [Redis docs](https://redis.io/documentation)
- Docker: Check [Docker docs](https://docs.docker.com/)

## 🔄 Updates

This infrastructure supports:

- Rolling updates with zero downtime
- Blue-green deployments
- Automated failover
- Data migration scripts
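
## 🧮 Example: Generating `.env` Values

The Environment Variables, Performance Tuning, and Security sections all describe values that end up in `.env`: passwords of 20+ characters and memory settings derived from system RAM. The sketch below is one possible way to produce starting values, not a script shipped with this setup. The helper names (`gen_password`, `pct_of_mb`, `suggest_env`) are hypothetical. The 75% figure for `effective_cache_size` comes from the tuning notes in this README, while the 25% figure for `shared_buffers` is an assumption based on general PostgreSQL guidance rather than anything this setup prescribes.

```bash
#!/usr/bin/env bash
# Sketch only: prints candidate .env lines for review.
# All function names here are hypothetical helpers, not part of this repository.

# Print a random hex password of the requested length (default 32 characters).
gen_password() {
  local length="${1:-32}"
  local nbytes=$(( (length + 1) / 2 ))   # two hex characters per random byte
  od -An -N"$nbytes" -tx1 /dev/urandom | tr -d ' \n' | head -c "$length"
}

# Print PCT percent of TOTAL_MB, rounded down, as a PostgreSQL size string.
pct_of_mb() {
  local total_mb="$1"
  local pct="$2"
  echo "$(( total_mb * pct / 100 ))MB"
}

# Print suggested .env lines for a host with TOTAL_MB of RAM.
suggest_env() {
  local total_mb="$1"
  echo "POSTGRES_PASSWORD=$(gen_password 32)"
  echo "REDIS_PASSWORD=$(gen_password 32)"
  # Assumption: 25% of RAM as a starting point for shared_buffers.
  echo "POSTGRES_SHARED_BUFFERS=$(pct_of_mb "$total_mb" 25)"
  # 75% of RAM, following the Performance Tuning section above.
  echo "POSTGRES_EFFECTIVE_CACHE_SIZE=$(pct_of_mb "$total_mb" 75)"
}
```

After sourcing the file, `suggest_env 4096` prints four lines for a host with 4GB of RAM. Review the output and copy the values into `.env` by hand rather than appending blindly, since the file may already define the same keys. If the containers share the host with other services, pass the memory budget for the database rather than the full system RAM.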