Ollama is an application that lets you download and run offline LLMs.
Generally for Linux:
curl -fsSL https://ollama.com/install.sh | sh
For Arch Linux you can install one of three packages
ollama
for CPU Modelsollama-cuda
for NVIDIA GPUsollama-rocm
for AMD GPUsThen start/enable the daemon
sudo systemctl start ollama.service
sudo systemctl enable ollama.service
Or simply run in a command line for quick testing
ollama serve
To run a model, first look in the Ollama library for the model name you want to use. For example
ollama run deepseek-r1:1.5b
This will download, run and start the prompt in a therminal with the LLM.
In the model, write
/?
for help and/bye
to exit.
Ollama has limited support for AMD discrete GPUs since the ROCm technology only support a few, and very recent models. Nevertheless, it is still possible to offload ollama’s processing into unsopported integrated or discrete GPUs by changing the HSA_OVERRIDE_GFX_VERSION
environment variable when launching ollama.
First we need to figure out the GFX version that applies to your particular GPU. Extracted from the Ollama Arch Wiki Page, first execute.
/opt/rocm/bin/rocminfo | grep amdhsa
This will query the exact GFX version to your system. It will give you either three of four digits.
XXYZ
.XYZ
Remember these digits, as your GFX version is either XX.Y.Z
or X.Y.Z
depending on whether you got three or four digits.
Then find all of the installed rocblas
kernels
find /opt/rocm/lib/rocblas/library -name 'Kernels.so-*'
And find the closest to your actual version following these rules
X
part, it must be strictly equal to the actual version.Y
part, mismatch is allowed, but it must be no greater than the the actual version.Z
part, mismatch is allowed, but it must be no greater than the the actual version.As a test, if running, stop the ollama daemon
sudo systemctl stop ollama.service
Execute the following with your GFX version
HSA_OVERRIDE_GFX_VERSION=X.Y.Z ollama serve
and run a model to test if the GPU is now being used. If it is still not being used, follow the same steps to figure out the correct GFX version until you get the right numbers.
Once you made sure which is the correct environment variable, edit the ollama daemon
sudo systemctl edit --full ollama.service
This will open the daemon configuration file, look for the [Service]
section and add
[Service]
Environment="HSA_OVERRIDE_GFX_VERSION=X.Y.Z"
ExecStart=/usr/bin/ollama serve
Finally, perform a daemon reload
sudo systemctl daemon-reload
and restart the ollama daemon
sudo systemctl restart ollama.service