Axelera Metis M.2 Max Edge AI module

Oh baby, more research required

The new Metis M.2 Max also offers a slimmer profile, advanced thermal management features, and additional security capabilities. It is equipped with up to 16 GB of memory, and versions for both a standard operating temperature range (-20°C to +70°C) and an extended operating temperature range (-40°C to +85°C) will be offered. These enhancements make Metis M.2 Max ideal for applications in industrial manufacturing, retail, security, healthcare, and public safety.
Source: “Axelera Metis M.2 Max Edge AI module doubles LLM and VLM processing speed” - CNX Software
Axelera AI’s Metis M.2 Max is an M.2 module based on an upgraded Metis AI processor unit (AIPU) delivering twice the memory bandwidth of the current Metis

I need to get on with moving my GPU to my LXC container on my Proxmox server.

Apparently, the process is considerably different from passing it through to a VM. Once that is sorted, I need to get my own personal AI going again. The above module would be extremely handy for boosting performance. I only have one M.2 slot left, though I did find an expansion card somewhere.

WARNING: The steps below are currently untested. They are still on my to-do list.


🔎 Step 1: Identify Your GPU in Debian 13 (Proxmox Host)

Run:

lspci -nn | grep -E "VGA|3D|Display"

Example output:

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1080 Ti] [10de:1b06]

For AMD:

01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [1002:73bf]

You can also check kernel recognition:

dmesg | grep -i nvidia
dmesg | grep -i amdgpu
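
To see which kernel driver is currently bound to the card (handy both before and after installing drivers), lspci can report that as well:

lspci -nnk | grep -A3 -E "VGA|3D|Display"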

🛠 Step 2: Install GPU Drivers on the Host

NVIDIA

  1. Update system:
apt update && apt upgrade -y
apt install -y build-essential dkms pve-headers-$(uname -r)
  2. Download the latest driver from NVIDIA’s site. Example:
wget https://us.download.nvidia.com/XFree86/Linux-x86_64/550.90.07/NVIDIA-Linux-x86_64-550.90.07.run
chmod +x NVIDIA-Linux-x86_64-550.90.07.run
./NVIDIA-Linux-x86_64-550.90.07.run --dkms
  3. Verify:
nvidia-smi
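
Since the driver was installed with --dkms, it should also be registered against the running kernel; a quick way to confirm:

dkms status | grep -i nvidia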

AMD

For ROCm compute:

apt install -y firmware-amd-graphics libdrm-amdgpu1

Then follow ROCm installation steps.
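
After a reboot (or reloading amdgpu), the render node should exist on the host, and /dev/kfd should appear too if the kernel’s KFD/compute support recognises the card:

ls -l /dev/dri/renderD* /dev/kfd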


🛠 Step 3: Expose GPU Devices to LXC

  1. Find device nodes:
    • NVIDIA: /dev/nvidia0, /dev/nvidiactl, /dev/nvidia-uvm
    • AMD: /dev/dri/renderD128, /dev/kfd
  2. Edit the LXC config:
nano /etc/pve/lxc/<CTID>.conf

Add lines (NVIDIA example):

lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 511:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
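
One caveat on the numbers above: 195 is the fixed NVIDIA character-device major, but the nvidia-uvm major (511 here) is assigned dynamically when the module loads, so confirm it on your own host before copying that line:

ls -l /dev/nvidia-uvm
grep nvidia /proc/devices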

AMD example:

lxc.cgroup2.devices.allow: c 226:* rwm
lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir
lxc.cgroup2.devices.allow: c 238:* rwm
lxc.mount.entry: /dev/kfd dev/kfd none bind,optional,create=file
  3. Restart the container:
pct restart <CTID>
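
Once the container is back up, a quick sanity check is to confirm the device nodes actually show up inside it (replace <CTID> as before):

pct exec <CTID> -- ls -l /dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm   # NVIDIA
pct exec <CTID> -- ls -l /dev/dri /dev/kfd                             # AMD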

🛠 Step 4: Install User-Space Libraries Inside the Container

  • For NVIDIA CUDA workloads:
apt install -y nvidia-cuda-toolkit
  • For AMD ROCm workloads:
apt install -y rocm-dev

Then test inside the container:

nvidia-smi   # NVIDIA
rocminfo     # AMD
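
One NVIDIA gotcha worth checking here: the user-space libraries inside the container generally need to match the host driver version, otherwise nvidia-smi will complain about a driver/library mismatch. A quick comparison:

nvidia-smi --query-gpu=driver_version --format=csv,noheader                      # on the host
pct exec <CTID> -- nvidia-smi --query-gpu=driver_version --format=csv,noheader   # inside the container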

⚡ Automation Script (Host-Side)

Here’s a script to detect GPU, install drivers, and prepare LXC passthrough:

#!/bin/bash
# proxmox-lxc-gpu.sh
# Automates GPU passthrough setup for LXC containers on Proxmox VE 9

set -e

echo "[*] Detecting GPU..."
GPU=$(lspci -nn | grep -E "VGA|3D|Display" || true)

if [[ $GPU == *"NVIDIA"* ]]; then
  echo "[*] NVIDIA GPU detected"
  apt update && apt install -y build-essential dkms pve-headers-$(uname -r)
  echo "[*] Please manually download and install the NVIDIA driver from nvidia.com"
elif [[ $GPU == *"AMD"* ]]; then
  echo "[*] AMD GPU detected"
  apt update && apt install -y firmware-amd-graphics libdrm-amdgpu1
  echo "[*] For compute workloads, install ROCm runtime inside the container"
else
  echo "[-] No supported GPU detected"
  exit 1
fi

echo "[+] Host GPU drivers installed. Now edit /etc/pve/lxc/<CTID>.conf to add device passthrough entries."

✅ Key Differences from VM Passthrough

  • Do NOT bind GPU to VFIO (that would hide it from the host, breaking LXC passthrough).
  • Drivers must be installed on the host.
  • Containers only need user‑space libraries, not kernel modules.

Since I’m focused on auditability and runtime evidence, it’s worth adding a post-boot check like:

dmesg | grep -i nvidia
ls -l /dev/nvidia*

and logging these into my compliance overlay so I can prove the GPU is properly exposed at runtime.
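
A minimal sketch of that logging, assuming a placeholder path of /var/log/gpu-passthrough-audit.log (swap in wherever the compliance overlay actually collects evidence):

{
  date -Is
  dmesg | grep -i nvidia | tail -n 5
  ls -l /dev/nvidia* 2>/dev/null
} >> /var/log/gpu-passthrough-audit.log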


👉 To do: extend the automation script so it injects the correct lxc.cgroup2 and lxc.mount.entry lines automatically into a given container config (just pass a CTID and GPU type), making the whole setup reproducible and auditable. A rough sketch of that injection is below.
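
A minimal sketch of what that injection could look like (NVIDIA only, CTID as the first argument, and just as untested as the rest of this post; the nvidia-uvm major is read from /proc/devices because it is assigned dynamically):

#!/bin/bash
# append-gpu-lxc.sh <CTID> -- hypothetical helper, NVIDIA only
set -e
CTID="$1"
CONF="/etc/pve/lxc/${CTID}.conf"

# Static NVIDIA device major (195) plus bind mounts for the device nodes
cat >> "$CONF" <<'EOF'
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
EOF

# nvidia-uvm gets a dynamic major, so look it up rather than hard-coding 511
UVM_MAJOR=$(awk '/nvidia-uvm/ {print $1; exit}' /proc/devices)
echo "lxc.cgroup2.devices.allow: c ${UVM_MAJOR}:* rwm" >> "$CONF"

pct restart "$CTID"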

Work in progress

#enoughsaid