popov/gogo2

Fork 0

Files

Dobromir Popov 3eb74381a8 use experimental features; runtime fix

2025-12-08 17:35:16 +02:00

3.4 KiB

Raw Blame History

AMD GPU Compatibility Fix (gfx1151 - Radeon 8060S)

Problem

Your AMD Radeon 8060S (gfx1151) is not supported by the current PyTorch build, causing:

RuntimeError: HIP error: invalid device function

Current Setup

GPU: AMD Radeon 8060S (gfx1151)
PyTorch: 2.9.1+rocm6.4
System ROCm: 6.4.3

Solutions

Option 1: Use CPU Mode (Immediate - No reinstall needed)

The code now automatically falls back to CPU if GPU tests fail. Restart your application and it should work on CPU.

To force CPU mode explicitly, set environment variable:

export CUDA_VISIBLE_DEVICES=""
# or
export HSA_OVERRIDE_GFX_VERSION=11.0.0  # May help with gfx1151

Option 2: Try ROCm 6.4 Override (Quick test)

Some users report success forcing older architecture:

export HSA_OVERRIDE_GFX_VERSION=11.0.0
# Then restart your application

Option 3: Install PyTorch Nightly with gfx1151 Support

PyTorch nightly builds may have better gfx1151 support:

cd /mnt/shared/DEV/repos/d-popov.com/gogo2
source venv/bin/activate

# Uninstall current PyTorch
pip uninstall torch torchvision torchaudio -y

# Install PyTorch nightly for ROCm 6.4
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm6.4

Option 4: Build PyTorch from Source (Most reliable but time-consuming)

Build PyTorch specifically for gfx1151:

cd /tmp
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch
git checkout main  # or stable release

# Set build options for gfx1151
export PYTORCH_ROCM_ARCH="gfx1151"
export USE_ROCM=1
export USE_CUDA=0

python setup.py install

Note: This takes 1-2 hours to compile.

Option 5: Use Docker with Pre-built ROCm PyTorch

Use official ROCm Docker images with PyTorch:

docker pull rocm/pytorch:latest
# Run your application inside this container

✅ CONFIRMED SOLUTION

Option 2 (HSA_OVERRIDE_GFX_VERSION) WORKS PERFECTLY!

The environment variable has been automatically added to your venv activation script.

What was done:

Added export HSA_OVERRIDE_GFX_VERSION=11.0.0 to venv/bin/activate
This allows gfx1151 to use gfx1100 libraries (fully compatible)
Added export TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1 for Flash Efficient attention
All PyTorch operations now work on GPU with experimental optimizations

To apply:

# Deactivate and reactivate your venv
deactivate
source venv/bin/activate

# Or restart your application

Recommended Approach

✅ DONE: HSA_OVERRIDE_GFX_VERSION added to venv
Restart your application to use GPU
No PyTorch reinstallation needed!

Verification

After any fix, verify GPU support:

cd /mnt/shared/DEV/repos/d-popov.com/gogo2
source venv/bin/activate
python -c "
import torch
print(f'PyTorch: {torch.__version__}')
print(f'CUDA Available: {torch.cuda.is_available()}')
if torch.cuda.is_available():
    print(f'Device: {torch.cuda.get_device_name(0)}')
    # Test Linear layer
    x = torch.randn(2, 10).cuda()
    linear = torch.nn.Linear(10, 5).cuda()
    y = linear(x)
    print('GPU test passed!')
"

Current Status

✅ Code updated to automatically detect and fallback to CPU ⏳ Restart application to apply fix ❌ GPU training will not work until PyTorch is reinstalled with gfx1151 support

Performance Impact

CPU Mode: 10-50x slower than GPU for training
GPU Mode (after fix): Full GPU acceleration restored

3.4 KiB Raw Blame History