
Using Existing ROCm Container for Development

Current Status

You already have ROCm PyTorch working on the host!

PyTorch: 2.5.1+rocm6.2
CUDA available: True
Device: AMD Radeon Graphics (Strix Halo)
Memory: 47.0 GB
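
This summary can be re-checked at any time from the host venv with a few standard PyTorch calls (a minimal sketch, nothing project-specific):

# check_rocm.py - reprint the status summary above
import torch

print(f"PyTorch: {torch.__version__}")                 # e.g. 2.5.1+rocm6.2
print(f"CUDA available: {torch.cuda.is_available()}")  # True when the ROCm/HIP backend sees the GPU
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"Device: {props.name}")
    print(f"Memory: {props.total_memory / 1024**3:.1f} GB")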

Recommendation: Use Host Environment

Since your host venv already has ROCm support working, this is the simplest option:

cd /mnt/shared/DEV/repos/d-popov.com/gogo2
source venv/bin/activate
python ANNOTATE/web/app.py

Benefits:

  • Already configured
  • No container overhead
  • Direct file access
  • GPU works perfectly

Alternative: Use Existing Container

You have these containers running (a quick docker ps check follows the list):

  • amd-strix-halo-llama-rocm - ROCm 7rc (port 8080)
  • amd-strix-halo-llama-vulkan-radv - Vulkan RADV (port 8081)
  • amd-strix-halo-llama-vulkan-amdvlk - Vulkan AMDVLK (port 8082)
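
To confirm they are up and see their host port mappings, the standard Docker CLI is enough:

docker ps --filter name=amd-strix-halo --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"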

Option 1: Quick Attach Script

./scripts/attach-to-rocm-container.sh

This script will (see the sketch after this list):

  1. Check whether the project is accessible in the container
  2. Offer to copy the project if needed
  3. Check for Python and install it if needed
  4. Check for PyTorch and install it if needed
  5. Attach you to a bash shell
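
For reference, a simplified outline of what such a helper can look like (an illustrative sketch, not the actual contents of scripts/attach-to-rocm-container.sh; the container name and paths are reused from the examples in this document):

#!/usr/bin/env bash
# Illustrative attach-helper sketch (not the real script)
set -euo pipefail

CONTAINER=amd-strix-halo-llama-rocm
HOST_PROJECT=/mnt/shared/DEV/repos/d-popov.com/gogo2
TARGET=/workspace/gogo2

# 1. Is the project already accessible in the container?
if ! docker exec "$CONTAINER" test -d "$TARGET"; then
    # 2. Offer to copy it in
    read -r -p "Project not found in container. Copy it now? [y/N] " answer
    if [[ "$answer" == [yY]* ]]; then
        docker exec "$CONTAINER" mkdir -p /workspace
        docker cp "$HOST_PROJECT" "$CONTAINER":/workspace/
    fi
fi

# 3. Python present? (Fedora-based container, matching section B below)
if ! docker exec "$CONTAINER" python3 --version >/dev/null 2>&1; then
    docker exec "$CONTAINER" dnf install -y python3.12 python3-pip python3-devel
    docker exec "$CONTAINER" ln -sf /usr/bin/python3.12 /usr/bin/python3
fi

# 4. PyTorch present?
docker exec "$CONTAINER" python3 -c "import torch" >/dev/null 2>&1 || \
    docker exec "$CONTAINER" pip3 install torch --index-url https://download.pytorch.org/whl/rocm6.2

# 5. Attach to a bash shell
docker exec -it "$CONTAINER" bash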

Option 2: Manual Setup

A. Copy Project to Container

# Create workspace in container
docker exec amd-strix-halo-llama-rocm mkdir -p /workspace

# Copy project
docker cp /mnt/shared/DEV/repos/d-popov.com/gogo2 amd-strix-halo-llama-rocm:/workspace/

# Enter container
docker exec -it amd-strix-halo-llama-rocm bash
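
A quick check that the copy landed where expected before doing anything else inside the container:

docker exec amd-strix-halo-llama-rocm ls /workspace/gogo2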

B. Install Python (if needed)

Inside container:

# Fedora-based container
dnf install -y python3.12 python3-pip python3-devel git

# Create symlinks
ln -sf /usr/bin/python3.12 /usr/bin/python3
ln -sf /usr/bin/python3.12 /usr/bin/python
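
Then verify that the interpreter and pip resolve correctly after the symlinks:

python3 --version
python --version    # should report the same 3.12 interpreter
pip3 --version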

C. Install Dependencies

Inside container:

cd /workspace/gogo2

# Install PyTorch with ROCm
pip3 install torch --index-url https://download.pytorch.org/whl/rocm6.2

# Install project dependencies
pip3 install -r requirements.txt
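
After the install, a one-liner (standard PyTorch calls only) confirms the ROCm build can see the GPU inside the container:

python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"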

D. Run Application

# Run ANNOTATE dashboard
python3 ANNOTATE/web/app.py

# Or run training
python3 training_runner.py --mode realtime --duration 4

Option 3: Mount Project on Container Restart

Add a volume mount to your docker-compose configuration:

services:
  amd-strix-halo-llama-rocm:
    volumes:
      - /mnt/shared/DEV/repos/d-popov.com/gogo2:/workspace/gogo2:rw

Then restart:

docker-compose down
docker-compose up -d
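
Once the container is back up, the mount can be verified end to end; a file created on the host should appear inside the container immediately:

touch /mnt/shared/DEV/repos/d-popov.com/gogo2/.mount-test
docker exec amd-strix-halo-llama-rocm ls /workspace/gogo2/.mount-test
rm /mnt/shared/DEV/repos/d-popov.com/gogo2/.mount-test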

Port Conflicts

Your ROCm container uses port 8080, which conflicts with the COBY API.

Solutions:

  1. Use the host environment (no conflict)
  2. Change the ANNOTATE port inside the container:
    python3 ANNOTATE/web/app.py --port 8051

  3. Expose a different host port when starting the container (see the compose sketch after this list)
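
For solution 3, assuming the container is managed by the same docker-compose file as above and the conflicting service listens on 8080 inside the container, mapping a different host port looks roughly like this (host port 18080 is just an example):

services:
  amd-strix-halo-llama-rocm:
    ports:
      - "18080:8080"   # host 18080 -> container 8080, leaving 8080 free for the COBY API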

Comparison

| Aspect      | Host (venv)     | Container                |
|-------------|-----------------|--------------------------|
| Setup       | Already done    | ⚠️ Needs Python install  |
| GPU         | Working         | Should work              |
| Files       | Direct access   | ⚠️ Need to copy/mount    |
| Performance | Native          | ⚠️ Small overhead        |
| Isolation   | ⚠️ Shares host  | Isolated                 |
| Simplicity  | Just works      | ⚠️ Extra steps           |

Quick Commands

cd /mnt/shared/DEV/repos/d-popov.com/gogo2
source venv/bin/activate
python ANNOTATE/web/app.py

Container Development

# Method 1: Use helper script
./scripts/attach-to-rocm-container.sh

# Method 2: Manual attach
docker exec -it amd-strix-halo-llama-rocm bash
cd /workspace/gogo2
python3 ANNOTATE/web/app.py

Check GPU in Container

docker exec amd-strix-halo-llama-rocm rocm-smi
docker exec amd-strix-halo-llama-rocm python3 -c "import torch; print(torch.cuda.is_available())"

Summary

For your use case (avoiding heavy downloads):

Use the host environment - your venv already has everything working!

Only use a container if you need:

  • Complete isolation from host
  • Specific ROCm version testing
  • Multiple parallel environments

Last Updated: 2025-11-12
Status: Host venv with ROCm 6.2 is ready to use