# Docker Model Runner Integration

This guide shows how to integrate Docker Model Runner with your existing Docker stack for AI-powered trading applications.

## 📁 Files Overview

| File | Purpose |
|------|---------|
| `docker-compose.yml` | Main compose file with model runner services |
| `docker-compose.model-runner.yml` | Standalone model runner configuration |
| `model-runner.env` | Environment variables for configuration |
| `integrate_model_runner.sh` | Integration script for existing stacks |
| `docker-compose.integration-example.yml` | Example integration with trading services |

## 🚀 Quick Start

### Option 1: Use with Existing Stack

```bash
# Run integration script
./integrate_model_runner.sh

# Start services
docker-compose up -d

# Test API
curl http://localhost:11434/api/tags
```

### Option 2: Standalone Model Runner

```bash
# Use dedicated compose file
docker-compose -f docker-compose.model-runner.yml up -d

# Test with specific profile
docker-compose -f docker-compose.model-runner.yml --profile llama-cpp up -d
```

## 🔧 Configuration

### Environment Variables (`model-runner.env`)

```bash
# AMD GPU Configuration
HSA_OVERRIDE_GFX_VERSION=11.0.0   # AMD GPU version override
GPU_LAYERS=35                     # Layers to offload to GPU
THREADS=8                         # CPU threads
BATCH_SIZE=512                    # Batch processing size
CONTEXT_SIZE=4096                 # Context window size

# API Configuration
MODEL_RUNNER_PORT=11434           # Main API port
LLAMA_CPP_PORT=8000               # Llama.cpp server port
METRICS_PORT=9090                 # Metrics endpoint
```
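
Depending on how the compose files consume these variables (via `env_file:` entries or shell substitution), you may need to pass the file to Compose explicitly, e.g. `docker-compose --env-file model-runner.env up -d`.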

### Ports Exposed

| Port | Service | Purpose |
|------|---------|---------|
| 11434 | Docker Model Runner | Ollama-compatible API |
| 8083 | Docker Model Runner | Alternative API port |
| 8000 | Llama.cpp Server | Advanced llama.cpp features |
| 9090 | Metrics | Prometheus metrics |
| 8050 | Trading Dashboard | Example dashboard |
| 9091 | Model Monitor | Performance monitoring |

## 🛠️ Usage Examples

### Basic Model Operations

```bash
# List available models
curl http://localhost:11434/api/tags

# Pull a model
docker-compose exec docker-model-runner /app/model-runner pull ai/smollm2:135M-Q4_K_M

# Run a model
docker-compose exec docker-model-runner /app/model-runner run ai/smollm2:135M-Q4_K_M "Hello!"

# Pull Hugging Face model
docker-compose exec docker-model-runner /app/model-runner pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF
```

### API Usage

```bash
# Generate text (Ollama-compatible endpoint)
curl -X POST http://localhost:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ai/smollm2:135M-Q4_K_M",
    "prompt": "Analyze market trends",
    "temperature": 0.7,
    "max_tokens": 100
  }'

# Chat completion
curl -X POST http://localhost:11434/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ai/smollm2:135M-Q4_K_M",
    "messages": [{"role": "user", "content": "What is your analysis?"}]
  }'
```

### Integration with Your Services

```python
# Example: Python integration
import requests


class AIModelClient:
    def __init__(self, base_url="http://localhost:11434"):
        self.base_url = base_url

    def generate(self, prompt, model="ai/smollm2:135M-Q4_K_M"):
        """Single-prompt completion via the /api/generate endpoint."""
        response = requests.post(
            f"{self.base_url}/api/generate",
            json={"model": model, "prompt": prompt},
            timeout=60,  # inference can be slow; don't hang forever
        )
        response.raise_for_status()
        return response.json()

    def chat(self, messages, model="ai/smollm2:135M-Q4_K_M"):
        """Multi-turn conversation via the /api/chat endpoint."""
        response = requests.post(
            f"{self.base_url}/api/chat",
            json={"model": model, "messages": messages},
            timeout=60,
        )
        response.raise_for_status()
        return response.json()


# Usage
client = AIModelClient()
analysis = client.generate("Analyze BTC/USDT market")
```

## 🔗 Service Integration

### With Existing Trading Dashboard

```yaml
# Add to your existing docker-compose.yml
services:
  your-trading-service:
    # ... your existing config
    environment:
      - MODEL_RUNNER_URL=http://docker-model-runner:11434
    depends_on:
      - docker-model-runner
    networks:
      - model-runner-network
```

### Internal Networking

Services communicate using Docker networks (see the connectivity check below):

- `http://docker-model-runner:11434` - Internal API calls
- `http://llama-cpp-server:8000` - Advanced features
- `http://model-manager:8001` - Management API
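
From another container on the same network, these hostnames resolve directly. A minimal connectivity check, assuming the `MODEL_RUNNER_URL` variable from the integration example above:

```python
import os

import requests

# MODEL_RUNNER_URL is set in the compose file; fall back to the internal hostname.
base_url = os.environ.get("MODEL_RUNNER_URL", "http://docker-model-runner:11434")

# /api/tags doubles as a cheap liveness probe and lists the available models.
print(requests.get(f"{base_url}/api/tags", timeout=5).json())
```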

## 📊 Monitoring and Health Checks

### Health Endpoints

```bash
# Main service health
curl http://localhost:11434/api/tags

# Metrics endpoint
curl http://localhost:9090/metrics

# Model monitor (if enabled)
curl http://localhost:9091/health
curl http://localhost:9091/models
curl http://localhost:9091/performance
```
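
If a dependent service should wait for the model runner before serving traffic, polling the tags endpoint is enough. A minimal sketch (the retry count and interval are arbitrary choices):

```python
import time

import requests


def wait_for_model_runner(url="http://localhost:11434/api/tags", retries=30, delay=2.0):
    """Block until the model runner responds, or raise after the last retry."""
    for _ in range(retries):
        try:
            if requests.get(url, timeout=5).status_code == 200:
                return
        except requests.RequestException:
            pass  # not up yet; keep polling
        time.sleep(delay)
    raise RuntimeError(f"model runner not reachable at {url}")


wait_for_model_runner()
```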

### Logs

```bash
# View all logs
docker-compose logs -f

# Specific service logs
docker-compose logs -f docker-model-runner
docker-compose logs -f llama-cpp-server
```

## ⚡ Performance Tuning

### GPU Optimization

```bash
# Adjust GPU layers based on VRAM
GPU_LAYERS=35   # For 8GB VRAM
GPU_LAYERS=50   # For 12GB VRAM
GPU_LAYERS=65   # For 16GB+ VRAM

# CPU threading
THREADS=8       # Match CPU cores
BATCH_SIZE=512  # Increase for better throughput
```

### Memory Management

```bash
# Context size affects memory usage
CONTEXT_SIZE=4096  # Standard context
CONTEXT_SIZE=8192  # Larger context (more memory)
CONTEXT_SIZE=2048  # Smaller context (less memory)
```
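
As a rough rule of thumb, the KV cache grows linearly with context length, so doubling `CONTEXT_SIZE` roughly doubles that component of memory use; the memory needed for the model weights themselves is unaffected.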

## 🧪 Testing and Validation

### Run Integration Tests

```bash
# Test basic connectivity
docker-compose exec docker-model-runner curl -f http://localhost:11434/api/tags

# Test model loading
docker-compose exec docker-model-runner /app/model-runner run ai/smollm2:135M-Q4_K_M "test"

# Test parallel requests
for i in {1..5}; do
  curl -X POST http://localhost:11434/api/generate \
    -H "Content-Type: application/json" \
    -d '{"model": "ai/smollm2:135M-Q4_K_M", "prompt": "test '$i'"}' &
done
wait  # block until all background requests complete
```

### Benchmarking

```bash
# Simple benchmark
time curl -X POST http://localhost:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{"model": "ai/smollm2:135M-Q4_K_M", "prompt": "Write a detailed analysis of market trends"}'
```
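
For something slightly more informative than `time`, the sketch below issues a few sequential requests and reports latency statistics (the request count and prompt are arbitrary):

```python
import statistics
import time

import requests

URL = "http://localhost:11434/api/generate"
PAYLOAD = {
    "model": "ai/smollm2:135M-Q4_K_M",
    "prompt": "Write a detailed analysis of market trends",
}

latencies = []
for _ in range(5):
    start = time.perf_counter()
    requests.post(URL, json=PAYLOAD, timeout=120).raise_for_status()
    latencies.append(time.perf_counter() - start)

print(f"mean={statistics.mean(latencies):.2f}s min={min(latencies):.2f}s max={max(latencies):.2f}s")
```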

## 🛡️ Security Considerations

### Network Security

```yaml
# Restrict network access
services:
  docker-model-runner:
    networks:
      - internal-network
    # No external ports for internal-only services

networks:
  internal-network:
    internal: true
```

### API Security

```bash
# Use API keys (if supported)
MODEL_RUNNER_API_KEY=your-secret-key

# Enable authentication
MODEL_RUNNER_AUTH_ENABLED=true
```
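
How clients present the key depends on the model runner build; a common convention is a bearer token, so a client call might look like the sketch below (hypothetical, assuming bearer auth is what `MODEL_RUNNER_AUTH_ENABLED` turns on):

```python
import os

import requests

# Hypothetical: send the configured API key as a bearer token on every request.
headers = {"Authorization": f"Bearer {os.environ['MODEL_RUNNER_API_KEY']}"}
response = requests.get("http://localhost:11434/api/tags", headers=headers, timeout=5)
response.raise_for_status()
```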

## 📈 Scaling and Production

### Multiple GPU Support

```yaml
# Use multiple GPUs
environment:
  - CUDA_VISIBLE_DEVICES=0,1  # Use GPU 0 and 1
  - GPU_LAYERS=35             # Layers per GPU
```

### Load Balancing

```yaml
# Multiple model runner instances
services:
  model-runner-1:
    # ... config
    deploy:
      placement:
        constraints:
          - node.labels.gpu==true

  model-runner-2:
    # ... config
    deploy:
      placement:
        constraints:
          - node.labels.gpu==true
```

## 🔧 Troubleshooting

### Common Issues

1. **GPU not detected**
   ```bash
   # Check NVIDIA drivers
   nvidia-smi

   # Check Docker GPU support
   docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
   ```

2. **Port conflicts**
   ```bash
   # Check port usage
   netstat -tulpn | grep :11434

   # Change ports in model-runner.env
   MODEL_RUNNER_PORT=11435
   ```

3. **Model loading failures**
   ```bash
   # Check available disk space
   df -h

   # Check model file permissions
   ls -la models/
   ```

### Debug Commands

```bash
# Full service logs
docker-compose logs

# Container resource usage
docker stats

# Model runner debug info
docker-compose exec docker-model-runner /app/model-runner --help

# Test internal connectivity
docker-compose exec trading-dashboard curl http://docker-model-runner:11434/api/tags
```

## 📚 Advanced Features

### Custom Model Loading

```bash
# Load custom GGUF model
docker-compose exec docker-model-runner /app/model-runner pull /models/custom-model.gguf

# Use specific model file
docker-compose exec docker-model-runner /app/model-runner run /models/my-model.gguf "prompt"
```

### Batch Processing

```bash
# Process multiple prompts
curl -X POST http://localhost:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ai/smollm2:135M-Q4_K_M",
    "prompt": ["prompt1", "prompt2", "prompt3"],
    "batch_size": 3
  }'
```

### Streaming Responses

```bash
# Enable streaming
curl -X POST http://localhost:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ai/smollm2:135M-Q4_K_M",
    "prompt": "long analysis request",
    "stream": true
  }'
```
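
With `"stream": true`, Ollama-style endpoints return newline-delimited JSON chunks. A minimal consumer, assuming that format:

```python
import json

import requests

payload = {
    "model": "ai/smollm2:135M-Q4_K_M",
    "prompt": "long analysis request",
    "stream": True,
}

# stream=True tells requests to yield the body incrementally instead of buffering it.
with requests.post("http://localhost:11434/api/generate", json=payload, stream=True, timeout=300) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)
        print(chunk.get("response", ""), end="", flush=True)
        if chunk.get("done"):
            break
print()
```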

This integration provides a complete AI model-serving environment that plugs into your existing trading infrastructure, with parallel request handling and GPU acceleration built in.