Proxmox & Kernel 7 & Nvidia

Proxmox 9.1 GPU Configuration

Yikes is my bottom line at the moment.

There is a significant compatibility issue between Kernel 7 and various Nvidia graphics cards. Mine is rather dated, which is becoming a problem.

GPU Script for Host setup

Please find below the script. It has been refined and updated with every problem discovered while updating the Proxmox host over time. Updates are a manual affair, a lesson I learned the hard way.
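Before the script itself, here is a minimal sketch of the idea behind its kernel-series guard: take a package name, work out which X.Y kernel series it belongs to, and flag anything outside the approved series. The names series_of, unapproved_from_sim and APPROVED are hypothetical and are not taken from the script. The sketch also assumes that a simulated apt run prints one "Inst <package> ..." line per package it would install.

```shell
# Sketch only: hypothetical helper names, not the functions used in the script.
APPROVED="6.17"   # assumption: a single approved series, like the script's default

# Print the X.Y series of a proxmox kernel/header package name, or nothing
# for packages that carry no series (e.g. proxmox-default-kernel).
series_of() {
  local rest
  case "$1" in
    proxmox-kernel-*)  rest="${1#proxmox-kernel-}" ;;
    proxmox-headers-*) rest="${1#proxmox-headers-}" ;;
    *) return 0 ;;
  esac
  # Keep only the leading "major.minor" part of the version string.
  printf '%s\n' "$rest" | sed -n 's/^\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Read simulated apt output on stdin and print every package that would be
# installed from a series other than APPROVED.
unapproved_from_sim() {
  local verb pkg rest series
  while read -r verb pkg rest; do
    [ "$verb" = "Inst" ] || continue
    series="$(series_of "$pkg")"
    if [ -n "$series" ] && [ "$series" != "$APPROVED" ]; then
      printf '%s\n' "$pkg"
    fi
  done
}
```

On a real host this could be fed from a simulated upgrade, for example `apt-get -s full-upgrade | unapproved_from_sim`. Any package name it prints belongs to a series that has not been approved, which is the signal to stop before running the real upgrade.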

#!/bin/bash
# -------------------------------------------------------------------------------------
# setup-gpu-pxe.sh — Proxmox VE NVIDIA Driver Host Installer + LXC GPU Configurator
# -------------------------------------------------------------------------------------
#
# VERSION:  4.0.0
# CREATED:  see git history — github.com/Braedach
# UPDATED:  2026-06-13
# TARGET:   Proxmox VE 9.x (Debian 13 / Trixie)
# CARD:     NVIDIA RTX 3070 (or any NVIDIA GPU)
# AUTHOR:   braedach / Leon
#
# -----------------------------------------------------------------------------
# WHAT THIS SCRIPT DOES
# -----------------------------------------------------------------------------
# Installs and configures NVIDIA drivers on the Proxmox HOST, generates the
# correct LXC container configuration for GPU passthrough, AND guards the host
# against the kernel-series jump that breaks NVIDIA DKMS (see WHY v4.0.0).
#
# The LXC passthrough model:
#   - NVIDIA kernel modules live ONLY on the host (never inside a container)
#   - Containers access the GPU via bind-mounted /dev/nvidia* device nodes
#   - The host creates and owns all device nodes; containers just use them
#   - Any kernel/driver update must be done on the HOST only
#
# -----------------------------------------------------------------------------
# WHY v4.0.0 IS A MAJOR BUMP  (read this — it is the whole point)
# -----------------------------------------------------------------------------
# On 2026-06-12 a routine SECURITY update on pxe (proxmox-kernel-6.17
# 6.17.13-2 -> 6.17.13-13) silently dragged in an ENTIRELY NEW kernel series:
# proxmox-kernel-7.0 (Ubuntu 26.04 "Resolute" base). The bump came through the
# proxmox-default-kernel meta-package (2.0.2 -> 2.1.0), as Proxmox is moving
# toward making 7.0 the default for the 9.2 release.
#
# The NVIDIA 550.163.01 driver (Debian-packaged, trixie-backports) does NOT
# build against Linux 7.0 — Debian's own packaging notes the module is only
# validated up to Linux 6.19. The DKMS autoinstall for 7.0 failed mid-apt,
# which:
#   1. wedged dpkg (three packages left half-configured)
#   2. left a 7.0 kernel that GRUB would sort highest and try to boot by
#      default — a kernel with a broken GPU stack
#
# Recovery required pinning back to 6.17 and neutralising the failed build.
# This is NOT unique to this host: multiple Proxmox users with NVIDIA cards hit
# the same "GPU not detected on 7.0, pin back to 6.17" wall.
#
# v4.0.0 turns that silent break into a LOUD, DELIBERATE decision by adding a
# kernel-series guard. The host stays on an APPROVED series until you have
# CONFIRMED NVIDIA supports the next one — then you approve it on purpose.
#
# -----------------------------------------------------------------------------
# THE GUARD — THREE INDEPENDENT LAYERS
# -----------------------------------------------------------------------------
#   LAYER 1  PREFLIGHT GATE   (--preflight)
#     Simulates 'apt-get full-upgrade', inspects every kernel/header package it
#     would install, and BLOCKS (non-zero exit) if any belong to a series not
#     in APPROVED_KERNEL_SERIES. Run this BEFORE every apt full-upgrade.
#
#   LAYER 2  APT HOLDS        (--apply-guard)
#     apt-mark hold on proxmox-default-kernel / proxmox-default-headers (the
#     series selector) plus any installed kernel/header packages from an
#     unapproved series (e.g. the 7.0 packages). SURGICAL: 6.17 security point
#     updates still flow, because they arrive via proxmox-kernel-6.17 which is
#     NOT held. Held packages simply show as "kept back".
#
#   LAYER 3  BOOT PIN         (--apply-guard, via proxmox-boot-tool)
#     Pins the running kernel as the GRUB default so that even if a bad kernel
#     somehow installs, the host never boots it. Only pins a kernel that has a
#     built NVIDIA DKMS module.
#
#   Layers are independent on purpose: Layer 1 stops the bad kernel arriving,
#   Layer 2 stops it installing, Layer 3 stops it booting.
#
# -----------------------------------------------------------------------------
# APPROVING A NEW KERNEL SERIES (when NVIDIA catches up)
# -----------------------------------------------------------------------------
#   1. CONFIRM a Debian/backports NVIDIA driver builds against the new series.
#   2. Add the series to APPROVED_KERNEL_SERIES below (e.g. "7.0").
#   3. sudo ./setup-gpu-pxe.sh --remove-guard   (lifts the apt holds)
#   4. sudo ./setup-gpu-pxe.sh --preflight      (should now pass)
#   5. apt full-upgrade, verify DKMS, reboot, re-pin with --apply-guard.
#
# -----------------------------------------------------------------------------
# ARCHITECTURE — THREE PILLARS (driver/passthrough side, unchanged from 3.x)
# -----------------------------------------------------------------------------
# PILLAR 1 — DRIVER INSTALLATION (HOST ONLY)
#   - Ensures non-free / non-free-firmware + trixie-backports are enabled
#   - Installs nvidia-kernel-dkms + nvidia-driver + nvidia-persistenced from
#     trixie-backports (the 6.17-compatible Debian-packaged driver)
#   - Installs pve-headers for the running Proxmox kernel
#   - DKMS build is performed by apt; verified/self-healed here
#   - Blacklists nouveau; provides Secure Boot / MOK guidance
#
# PILLAR 2 — DEVICE NODE STABILITY (solves the chronic UVM race)
#   nvidia-setup-nodes.service runs at sysinit (Before=lxc / pve-container),
#   creating /dev/nvidia-uvm* before any container starts, so LXC binds real
#   char devices instead of empty stub files. udev rules + nvidia-persistenced
#   keep nodes and GPU state stable across container restarts.
#
# PILLAR 3 — LXC CONTAINER CONFIGURATION (modern dev* syntax)
#   Uses Proxmox 8.1+ dev* directive (automatic device-type detection and
#   cgroup2 permissions). print_lxc_config is the authoritative dev* list:
#     dev0: /dev/nvidia0,gid=44
#     dev1: /dev/nvidiactl,gid=44
#     dev2: /dev/nvidia-modeset,gid=44
#     dev3: /dev/nvidia-uvm,gid=44
#     dev4: /dev/nvidia-uvm-tools,gid=44
#     dev5: /dev/nvidia-caps/nvidia-cap1,gid=44
#     dev6: /dev/nvidia-caps/nvidia-cap2,gid=44
#   gid=44 is the 'video' group on Debian-based systems.
#
# -----------------------------------------------------------------------------
# WHAT LIVES WHERE
# -----------------------------------------------------------------------------
#   /etc/systemd/system/nvidia-setup-nodes.service   boot-ordering service
#   /usr/local/sbin/nvidia-setup-nodes.sh            node creation script
#   /etc/udev/rules.d/71-nvidia-uvm.rules            udev permissions
#   /etc/modprobe.d/blacklist-nouveau.conf           nouveau blacklist
#   /etc/modules-load.d/nvidia.conf                  module autoload
#
# -----------------------------------------------------------------------------
# USAGE
# -----------------------------------------------------------------------------
#   sudo ./setup-gpu-pxe.sh                  Full install (+ applies guard)
#   sudo ./setup-gpu-pxe.sh --preflight      Gate an upgrade BEFORE apt
#   sudo ./setup-gpu-pxe.sh --apply-guard    Apply holds + pin running kernel
#   sudo ./setup-gpu-pxe.sh --guard-status   Show holds, pin, approved series
#   sudo ./setup-gpu-pxe.sh --remove-guard   Lift holds (new series approved)
#   sudo ./setup-gpu-pxe.sh --check-only     System + GPU health
#   sudo ./setup-gpu-pxe.sh --force-rebuild  Force DKMS rebuild
#   sudo ./setup-gpu-pxe.sh --purge          Remove all NVIDIA components
#   sudo ./setup-gpu-pxe.sh --lxc-config ID  Print LXC config snippet
#   sudo ./setup-gpu-pxe.sh --dry-run        Preview any of the above
#   sudo ./setup-gpu-pxe.sh --help           Show help
#
# -----------------------------------------------------------------------------
# RECOMMENDED KERNEL UPDATE SOP (manual, for production)
# -----------------------------------------------------------------------------
#   1. apt update
#   2. sudo ./setup-gpu-pxe.sh --preflight     <-- HARD GATE; stop if it blocks
#   3. apt full-upgrade                        (use full-upgrade, not upgrade)
#   4. dkms status                             confirm new kernel = installed
#   5. reboot                                  into the new kernel
#   6. sudo ./setup-gpu-pxe.sh --check-only    verify GPU on new kernel
#   7. sudo ./setup-gpu-pxe.sh --apply-guard   re-pin the now-running kernel
#   8. start passthrough LXCs; nvidia-smi inside each
#
# -----------------------------------------------------------------------------
# PREREQUISITES
# -----------------------------------------------------------------------------
#   - Proxmox VE 9.x host, run as root
#   - Internet access to deb.debian.org (non-free + trixie-backports)
#   - An NVIDIA GPU present on the host (verified via lspci)
#   - proxmox-headers-* for the running kernel available in apt
#
# -----------------------------------------------------------------------------
# WARNINGS
# -----------------------------------------------------------------------------
#   - Driver / kernel changes are HOST-level. NEVER install the NVIDIA driver
#     inside an LXC container.
#   - Do NOT lift the guard (--remove-guard) until you have CONFIRMED NVIDIA
#     builds against the new kernel series. Lifting it re-opens the exact
#     failure mode that triggered v4.0.0.
#   - Always confirm 'dkms status' shows the NEW kernel as installed BEFORE
#     rebooting into it after a kernel update.
#   - Use 'apt full-upgrade', never 'apt upgrade', on Proxmox.
#
# -----------------------------------------------------------------------------
# CHANGELOG  (full history: github.com/Braedach)
# -----------------------------------------------------------------------------
# v4.0.0 — 2026-06-13 — Kernel-series guard (MAJOR)
#   CONTEXT: a 6.17 security update silently pulled the 7.0 kernel series via
#            proxmox-default-kernel; NVIDIA 550 cannot build against 7.0; the
#            failed DKMS autoinstall wedged dpkg and left a bootable-but-broken
#            7.0 kernel. See "WHY v4.0.0" above.
#   - NEW: --preflight. Simulates apt full-upgrade and BLOCKS if any kernel or
#          header package from an unapproved series would be installed.
#   - NEW: --apply-guard. apt-mark holds the series selector + unapproved-series
#          kernel/header packages, then pins the running kernel via
#          proxmox-boot-tool (only if it has a built NVIDIA DKMS module).
#   - NEW: --remove-guard. Lifts the holds, with confirmation, for when a new
#          series has been approved.
#   - NEW: --guard-status. Reports approved series, current apt holds, and the
#          pinned boot kernel.
#   - NEW: APPROVED_KERNEL_SERIES constant (default: 6.17). Single source of
#          truth for which series NVIDIA is allowed to follow.
#   - CHANGE: full install now applies the guard at the end (holds + pin).
#   - FIX: --check-only no longer false-alarms "modules NOT loaded". It now
#          warms the GPU (nvidia-smi) first; if the driver is functional but a
#          module is idle-unloaded, it reports informationally instead of red.
#   - FIX: --check-only reports the boot pin and guard status.
#   - DOC: added the manual kernel update SOP; reiterated full-upgrade.
#
# v3.6.0 — 2026-06-12 — Kernel-update hardening + doc reconciliation
#   - DKMS self-heal for the running kernel; explicit nvidia-persistenced
#     install; Pillar 3 docstring reconciled to 7 dev* entries; CUDA_* dead
#     constants removed. (Full detail in git.)
#
# Earlier versions (CUDA-repo era, the SHA1 key rejection, the trixie-backports
# migration, the Pillar 2/3 rework, etc.) are recorded in git history.
#
# -------------------------------------------------------------------------------------

set -euo pipefail
IFS=$'\n\t'

SCRIPT_VERSION="4.0.0"
SCRIPT_NAME="$(basename "$0")"

# -------------------------------------------------------------------------------------
# KERNEL SERIES POLICY  (edit APPROVED_KERNEL_SERIES to approve a new series)
# -------------------------------------------------------------------------------------
# Only kernel series listed here are allowed to be installed/followed. A series
# is the X.Y portion of a proxmox kernel package, e.g. "6.17" or "7.0".
# Add a new series ONLY after confirming NVIDIA builds against it (see header).
APPROVED_KERNEL_SERIES=("6.17")

# -------------------------------------------------------------------------------------
# CONFIGURATION CONSTANTS
# -------------------------------------------------------------------------------------
UVM_RULE_FILE="/etc/udev/rules.d/71-nvidia-uvm.rules"
NOUVEAU_BLACKLIST="/etc/modprobe.d/blacklist-nouveau.conf"
MODULES_LOAD_FILE="/etc/modules-load.d/nvidia.conf"
NODE_SCRIPT="/usr/local/sbin/nvidia-setup-nodes.sh"
NODE_SERVICE="/etc/systemd/system/nvidia-setup-nodes.service"
LXC_VIDEO_GID=44   # 'video' group on Debian systems

# Selector meta-packages that decide which kernel series is the default.
SELECTOR_PACKAGES=(proxmox-default-kernel proxmox-default-headers)

# -------------------------------------------------------------------------------------
# FLAGS (set by argument parsing)
# -------------------------------------------------------------------------------------
DRY_RUN=0
PURGE=0
FORCE_REBUILD=0
CHECK_ONLY=0
LXC_CONFIG_ONLY=0
PREFLIGHT_ONLY=0
APPLY_GUARD_ONLY=0
REMOVE_GUARD_ONLY=0
GUARD_STATUS_ONLY=0
LXC_VMID=""

# -------------------------------------------------------------------------------------
# COLOUR / LOGGING
# -------------------------------------------------------------------------------------
RED='\033[0;31m'; YELLOW='\033[0;33m'; GREEN='\033[0;32m'
CYAN='\033[0;36m'; BOLD='\033[1m'; RESET='\033[0m'

info()    { echo -e "${GREEN}[INFO]${RESET}  $*"; }
warn()    { echo -e "${YELLOW}[WARN]${RESET}  $*" >&2; }
error()   { echo -e "${RED}[ERROR]${RESET} $*" >&2; }
section() { echo -e "\n${BOLD}${CYAN}=== $* ===${RESET}"; }
ok()      { echo -e "  ${GREEN}[ok]${RESET} $*"; }
fail()    { echo -e "  ${RED}[x]${RESET} $*"; }
skip()    { echo -e "  ${YELLOW}[-]${RESET} $*"; }

run() {
  if [[ "${DRY_RUN}" -eq 1 ]]; then
    echo -e "  ${CYAN}[dry-run]${RESET} $*"
  else
    eval "$@"
  fi
}

# -------------------------------------------------------------------------------------
# KERNEL SERIES HELPERS
# -------------------------------------------------------------------------------------
# Extract the X.Y series from a proxmox kernel/header package name.
# proxmox-kernel-6.17.13-13-pve  -> 6.17
# proxmox-kernel-7.0.6-2-pve     -> 7.0
# proxmox-kernel-6.17            -> 6.17   (series meta-package)
# proxmox-default-kernel         -> ""     (selector; handled separately)
kernel_series_from_pkg() {
  local pkg="$1" rest=""
  rest="${pkg#proxmox-kernel-}"
  if [[ "$rest" == "$pkg" ]]; then
    rest="${pkg#proxmox-headers-}"
  fi
  if [[ "$rest" == "$pkg" ]]; then
    echo ""
    return
  fi
  echo "$rest" | grep -oE '^[0-9]+\.[0-9]+' || true
}

is_approved_series() {
  local s="$1" a
  for a in "${APPROVED_KERNEL_SERIES[@]}"; do
    [[ "$s" == "$a" ]] && return 0
  done
  return 1
}

# -------------------------------------------------------------------------------------
# PREREQUISITES
# -------------------------------------------------------------------------------------
check_root() {
  [[ "$(id -u)" -eq 0 ]] || { error "Must run as root. Use: sudo $0 $*"; exit 1; }
}

check_prerequisites() {
  section "Checking Prerequisites"
  local missing=()
  for cmd in systemctl apt apt-mark dpkg dpkg-query wget lspci modprobe dkms mokutil; do
    if command -v "$cmd" &>/dev/null; then
      ok "$cmd found"
    else
      fail "$cmd missing"
      missing+=("$cmd")
    fi
  done
  # mokutil and proxmox-boot-tool are optional
  if [[ ${#missing[@]} -gt 0 ]]; then
    local required_missing=()
    for m in "${missing[@]}"; do
      [[ "$m" != "mokutil" ]] && required_missing+=("$m")
    done
    if [[ ${#required_missing[@]} -gt 0 ]]; then
      error "Required tools missing: ${required_missing[*]}"
      error "Install with: apt install -y ${required_missing[*]}"
      exit 1
    fi
  fi
}

# -------------------------------------------------------------------------------------
# GPU DETECTION
# -------------------------------------------------------------------------------------
detect_nvidia_gpu() {
  section "Detecting NVIDIA GPU"
  local gpu_list
  gpu_list="$(lspci | grep -iE '(vga|3d|display)' || true)"

  if echo "$gpu_list" | grep -qi nvidia; then
    local nvidia_line
    nvidia_line="$(echo "$gpu_list" | grep -i nvidia | head -1)"
    ok "NVIDIA GPU detected: ${nvidia_line}"
    if [[ "$(echo "$gpu_list" | wc -l)" -gt 1 ]]; then
      info "All display controllers detected:"
      while IFS= read -r line; do
        info "  $line"
      done <<< "$gpu_list"
    fi
    return 0
  else
    error "No NVIDIA GPU found via lspci."
    info "lspci output:"
    echo "$gpu_list"
    exit 1
  fi
}

# -------------------------------------------------------------------------------------
# DEBIAN SOURCES — NON-FREE + TRIXIE-BACKPORTS
# -------------------------------------------------------------------------------------
ensure_debian_sources() {
  section "Ensuring Debian Sources (non-free + trixie-backports)"
  local sources_file="/etc/apt/sources.list"
  local needs_update=0

  if [[ "${DRY_RUN}" -eq 1 ]]; then
    info "[dry-run] Would verify non-free and trixie-backports in apt sources"
    return
  fi

  if grep -rq 'non-free-firmware' /etc/apt/sources.list /etc/apt/sources.list.d/ 2>/dev/null; then
    ok "non-free-firmware already in sources"
  else
    warn "non-free-firmware not found in apt sources"
    warn "Adding to ${sources_file} — review if this causes issues"
    if ! grep -q 'deb.debian.org/debian trixie' "${sources_file}" 2>/dev/null; then
      echo "" >> "${sources_file}"
      echo "# Added by setup-gpu-pxe.sh for NVIDIA drivers" >> "${sources_file}"
      echo "deb http://deb.debian.org/debian trixie main contrib non-free non-free-firmware" >> "${sources_file}"
      ok "Added Debian trixie non-free line to ${sources_file}"
    else
      if grep 'deb.debian.org/debian trixie' "${sources_file}" | grep -qv 'non-free'; then
        warn "Found trixie line without non-free — please add non-free non-free-firmware manually:"
        warn "  Edit: ${sources_file}"
        warn "  Change: 'deb http://deb.debian.org/debian trixie main'"
        warn "  To:     'deb http://deb.debian.org/debian trixie main contrib non-free non-free-firmware'"
      else
        ok "non-free appears to be present in trixie sources"
      fi
    fi
    needs_update=1
  fi

  if grep -rq 'trixie-backports' /etc/apt/sources.list /etc/apt/sources.list.d/ 2>/dev/null; then
    ok "trixie-backports already in sources"
  else
    info "Adding trixie-backports to ${sources_file}..."
    echo "" >> "${sources_file}"
    echo "# Added by setup-gpu-pxe.sh — required for NVIDIA drivers on kernel 6.17+" >> "${sources_file}"
    echo "deb http://deb.debian.org/debian trixie-backports main contrib non-free non-free-firmware" >> "${sources_file}"
    ok "trixie-backports added"
    needs_update=1
  fi

  if [[ "$needs_update" -eq 1 ]]; then
    apt_update
  fi
}

# -------------------------------------------------------------------------------------
# APT HELPERS WITH RETRY
# -------------------------------------------------------------------------------------
apt_update() {
  local attempt max_attempts=3 delay=5
  for ((attempt=1; attempt<=max_attempts; attempt++)); do
    if [[ "${DRY_RUN}" -eq 1 ]]; then
      info "[dry-run] apt-get update"
      return 0
    fi
    info "apt-get update (attempt ${attempt}/${max_attempts})..."
    if DEBIAN_FRONTEND=noninteractive apt-get update -qq; then
      return 0
    fi
    warn "apt update failed (attempt ${attempt}). Retrying in ${delay}s..."
    sleep "$delay"
    delay=$((delay * 2))
  done
  error "apt-get update failed after ${max_attempts} attempts."
  return 1
}

download_with_retry() {
  local url="$1" dest="$2"
  local attempt max_attempts=3 delay=5
  for ((attempt=1; attempt<=max_attempts; attempt++)); do
    if [[ "${DRY_RUN}" -eq 1 ]]; then
      info "[dry-run] wget -q -O '${dest}' '${url}'"
      return 0
    fi
    info "Downloading: $(basename "$url") (attempt ${attempt}/${max_attempts})..."
    if wget -q --timeout=60 -O "$dest" "$url"; then
      ok "Downloaded: $(basename "$url")"
      return 0
    fi
    warn "Download failed (attempt ${attempt}). Retrying in ${delay}s..."
    sleep "$delay"
    delay=$((delay * 2))
  done
  error "Failed to download: $url after ${max_attempts} attempts."
  return 1
}

# -------------------------------------------------------------------------------------
# NOUVEAU BLACKLIST
# -------------------------------------------------------------------------------------
write_nouveau_blacklist() {
  cat > "${NOUVEAU_BLACKLIST}" << 'BLEOF'
blacklist nouveau
options nouveau modeset=0
BLEOF
}

blacklist_nouveau() {
  section "Blacklisting Nouveau Driver"
  if [[ -f "$NOUVEAU_BLACKLIST" ]]; then
    ok "nouveau already blacklisted"
    return
  fi
  if [[ "${DRY_RUN}" -eq 1 ]]; then
    info "[dry-run] Would write ${NOUVEAU_BLACKLIST}"
    return
  fi
  write_nouveau_blacklist
  ok "nouveau blacklisted"
  update-initramfs -u -k all 2>/dev/null || true
}

# -------------------------------------------------------------------------------------
# SECURE BOOT CHECK
# -------------------------------------------------------------------------------------
check_secure_boot() {
  section "Checking Secure Boot"
  if ! command -v mokutil &>/dev/null; then
    skip "mokutil not available — skipping Secure Boot check"
    return
  fi
  local sb_state
  sb_state="$(mokutil --sb-state 2>/dev/null || echo "unknown")"
  if echo "$sb_state" | grep -qi "SecureBoot enabled"; then
    warn "Secure Boot is ENABLED."
    warn "NVIDIA DKMS modules must be signed to load."
    warn "If modules fail to load after install, enrol a MOK key:"
    warn "  openssl req -new -x509 -newkey rsa:2048 -keyout /root/mok.key"
    warn "    -out /root/mok.crt -days 3650 -subj /CN=NVIDIA-DKMS-MOK/ -nodes"
    warn "  mokutil --import /root/mok.crt"
    warn "  (reboot, enrol key in MOK manager, then reboot again)"
  else
    ok "Secure Boot: ${sb_state}"
  fi
}

# -------------------------------------------------------------------------------------
# INSTALL NVIDIA DRIVERS FROM TRIXIE-BACKPORTS
# -------------------------------------------------------------------------------------
install_nvidia_backports() {
  section "Installing NVIDIA Drivers (trixie-backports)"
  local pkgs_needed=()

  if dpkg -l pve-headers 2>/dev/null | grep -q '^ii'; then
    ok "pve-headers already installed"
  else
    pkgs_needed+=(pve-headers)
  fi

  if dpkg -l nvidia-kernel-dkms 2>/dev/null | grep -q '^ii'; then
    ok "nvidia-kernel-dkms already installed"
  else
    pkgs_needed+=(nvidia-kernel-dkms)
  fi

  if dpkg -l nvidia-driver 2>/dev/null | grep -q '^ii'; then
    ok "nvidia-driver already installed"
  else
    pkgs_needed+=(nvidia-driver)
  fi

  if dpkg -l nvidia-persistenced 2>/dev/null | grep -q '^ii'; then
    ok "nvidia-persistenced already installed"
  else
    pkgs_needed+=(nvidia-persistenced)
  fi

  if [[ ${#pkgs_needed[@]} -eq 0 ]]; then
    info "All NVIDIA driver packages already installed."
    return
  fi

  info "Installing from trixie-backports: ${pkgs_needed[*]}"
  if [[ "${DRY_RUN}" -eq 1 ]]; then
    info "[dry-run] apt-get install -t trixie-backports -y ${pkgs_needed[*]}"
    return
  fi

  local attempt max_attempts=3 delay=5
  for ((attempt=1; attempt<=max_attempts; attempt++)); do
    info "Installing (attempt ${attempt}/${max_attempts})..."
    if DEBIAN_FRONTEND=noninteractive apt-get install -t trixie-backports -y -qq "${pkgs_needed[@]}"; then
      ok "NVIDIA driver packages installed from trixie-backports"
      return 0
    fi
    warn "Install failed (attempt ${attempt}). Retrying in ${delay}s..."
    apt_update
    sleep "$delay"
    delay=$((delay * 2))
  done
  error "Failed to install NVIDIA driver packages after ${max_attempts} attempts."
  error "Check: apt-get install -t trixie-backports nvidia-kernel-dkms nvidia-driver"
  return 1
}

# -------------------------------------------------------------------------------------
# DKMS STATUS CHECK + RUNNING-KERNEL SELF-HEAL
# -------------------------------------------------------------------------------------
build_dkms_all_kernels() {
  section "Verifying DKMS Build"
  if [[ "${DRY_RUN}" -eq 1 ]]; then
    info "[dry-run] Would verify DKMS status and autoinstall for running kernel if needed"
    return
  fi

  local running_kernel
  running_kernel="$(uname -r)"
  info "Running kernel: ${running_kernel}"

  info "DKMS status:"
  local dkms_out
  dkms_out="$(dkms status 2>/dev/null || true)"
  if [[ -z "$dkms_out" ]]; then
    warn "No DKMS modules registered yet."
  else
    while IFS= read -r line; do
      if echo "$line" | grep -q "installed"; then
        ok "$line"
      else
        warn "$line"
      fi
    done <<< "$dkms_out"
  fi

  # Match only nvidia entries, so another DKMS module built for this kernel
  # cannot be mistaken for the NVIDIA one.
  if echo "$dkms_out" | grep -i nvidia | grep -F "$running_kernel" | grep -q "installed"; then
    ok "nvidia DKMS module installed for running kernel ${running_kernel}"
    return 0
  fi

  warn "No installed nvidia DKMS module for running kernel ${running_kernel}"
  warn "Attempting 'dkms autoinstall' to build it now..."
  if dkms autoinstall 2>&1 | tee /tmp/dkms_autoinstall.log; then
    if dkms status 2>/dev/null | grep -i nvidia | grep -F "$running_kernel" | grep -q "installed"; then
      ok "DKMS module built for running kernel ${running_kernel}"
    else
      warn "autoinstall ran but module still not shown installed for ${running_kernel}"
      warn "If you just updated the kernel, REBOOT into it and re-run --check-only."
      warn "See /tmp/dkms_autoinstall.log for build details."
    fi
  else
    error "dkms autoinstall failed — see /tmp/dkms_autoinstall.log"
    warn "Ensure headers for ${running_kernel} are installed (proxmox-headers-*)."
  fi
}

# -------------------------------------------------------------------------------------
# MODULE AUTOLOAD
# -------------------------------------------------------------------------------------
write_module_autoload() {
  cat > "${MODULES_LOAD_FILE}" << 'MLEOF'
nvidia
nvidia_modeset
nvidia_drm
nvidia_uvm
MLEOF
}

configure_module_autoload() {
  section "Configuring Module Autoload"
  if [[ -f "$MODULES_LOAD_FILE" ]]; then
    ok "Module autoload already configured: $MODULES_LOAD_FILE"
    return
  fi
  if [[ "${DRY_RUN}" -eq 1 ]]; then
    info "[dry-run] Would write ${MODULES_LOAD_FILE}"
    return
  fi
  write_module_autoload
  ok "Module autoload configured"
}

# -------------------------------------------------------------------------------------
# PILLAR 2 — DEVICE NODE STABILITY
# -------------------------------------------------------------------------------------
write_node_creation_script() {
  cat > "${NODE_SCRIPT}" << 'NODEEOF'
#!/bin/bash
# nvidia-setup-nodes.sh — called by nvidia-setup-nodes.service at sysinit
# Ensures all /dev/nvidia* device nodes exist with correct permissions.
# This runs BEFORE any LXC containers start (see Before= in service unit).

set -euo pipefail

log() { echo "[nvidia-setup-nodes] $*"; }

if ! lsmod | grep -q '^nvidia_uvm'; then
  log "Loading nvidia_uvm module..."
  if ! modprobe nvidia_uvm 2>/tmp/nvidia_uvm_modprobe.err; then
    log "ERROR: failed to load nvidia_uvm — see /tmp/nvidia_uvm_modprobe.err"
    cat /tmp/nvidia_uvm_modprobe.err || true
    exit 1
  fi
fi

for i in $(seq 1 10); do
  grep -q 'nvidia-uvm' /proc/devices 2>/dev/null && break
  sleep 1
done

UVM_MAJOR="$(awk '$2=="nvidia-uvm"{print $1}' /proc/devices || true)"

if [[ -z "${UVM_MAJOR}" ]]; then
  log "ERROR: nvidia-uvm major not found in /proc/devices after 10 seconds"
  exit 1
fi

log "nvidia-uvm major: ${UVM_MAJOR}"

if [[ ! -c /dev/nvidia-uvm ]]; then
  [[ -e /dev/nvidia-uvm ]] && rm -f /dev/nvidia-uvm
  mknod -m 0666 /dev/nvidia-uvm c "${UVM_MAJOR}" 0
  log "Created /dev/nvidia-uvm"
fi

if [[ ! -c /dev/nvidia-uvm-tools ]]; then
  [[ -e /dev/nvidia-uvm-tools ]] && rm -f /dev/nvidia-uvm-tools
  mknod -m 0666 /dev/nvidia-uvm-tools c "${UVM_MAJOR}" 1
  log "Created /dev/nvidia-uvm-tools"
fi

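# chmod character devices only: /dev/nvidia-caps is a directory, and a blanket
# 'chmod 0666 /dev/nvidia*' would strip its execute (traverse) bit.
for node in /dev/nvidia*; do
  if [[ -c "$node" ]]; then chmod 0666 "$node" 2>/dev/null || true; fi
done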

log "All nvidia device nodes verified."
NODEEOF
  chmod 755 "${NODE_SCRIPT}"
}

install_node_creation_script() {
  info "Installing node creation script: ${NODE_SCRIPT}"
  if [[ "${DRY_RUN}" -eq 1 ]]; then
    info "[dry-run] Would write ${NODE_SCRIPT}"
    return
  fi
  write_node_creation_script
  ok "Node creation script installed"
}

write_node_service() {
  cat > "${NODE_SERVICE}" << 'SVCEOF'
[Unit]
Description=NVIDIA Device Node Setup (must run before LXC containers)
Documentation=https://github.com/braedach/homelab
After=systemd-modules-load.service
Before=lxc.service
Before=pve-container@.service
# pve-guests.service is the unit that autostarts guests at boot.
Before=pve-guests.service
ConditionPathExists=/proc/devices
DefaultDependencies=no

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/local/sbin/nvidia-setup-nodes.sh
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=sysinit.target
SVCEOF
}

install_node_service() {
  info "Installing systemd service: ${NODE_SERVICE}"
  if [[ "${DRY_RUN}" -eq 1 ]]; then
    info "[dry-run] Would write ${NODE_SERVICE}"
    return
  fi
  write_node_service
  ok "Service unit installed"
}

write_udev_rules() {
  cat > "${UVM_RULE_FILE}" << 'UDEVEOF'
# nvidia-uvm device permissions — managed by setup-gpu-pxe.sh
KERNEL=="nvidia-uvm",       MODE="0666"
KERNEL=="nvidia-uvm-tools", MODE="0666"
KERNEL=="nvidia*",          MODE="0666"
UDEVEOF
}

install_udev_rules() {
  info "Installing udev rules: ${UVM_RULE_FILE}"
  if [[ "${DRY_RUN}" -eq 1 ]]; then
    info "[dry-run] Would write ${UVM_RULE_FILE}"
    return
  fi
  write_udev_rules
  ok "udev rules installed"
  udevadm control --reload-rules 2>/dev/null || true
  udevadm trigger 2>/dev/null || true
}

enable_nvidia_persistenced() {
  section "Enabling nvidia-persistenced"
  if ! systemctl list-unit-files nvidia-persistenced.service &>/dev/null; then
    warn "nvidia-persistenced.service not present — package may not be installed yet."
    warn "It is installed by install_nvidia_backports; re-run after a reboot if missing."
    return
  fi
  if systemctl is-enabled nvidia-persistenced &>/dev/null; then
    ok "nvidia-persistenced already enabled"
  else
    run "systemctl enable nvidia-persistenced 2>/dev/null || true"
    run "systemctl start  nvidia-persistenced 2>/dev/null || true"
    ok "nvidia-persistenced enabled"
  fi
}

setup_device_stability() {
  section "Setting Up Device Node Stability (Pillar 2)"
  install_node_creation_script
  install_node_service
  install_udev_rules

  if [[ "${DRY_RUN}" -eq 0 ]]; then
    systemctl daemon-reload
    systemctl enable nvidia-setup-nodes.service
    systemctl restart nvidia-setup-nodes.service || true
    ok "nvidia-setup-nodes.service enabled and started"
  else
    info "[dry-run] systemctl daemon-reload && systemctl enable nvidia-setup-nodes.service"
  fi

  enable_nvidia_persistenced
}

# -------------------------------------------------------------------------------------
# MODULE LOAD + VERIFICATION
# -------------------------------------------------------------------------------------
warm_up_gpu() {
  # Touch the GPU so any idle-unloaded modules reload before we sample lsmod.
  if command -v nvidia-smi &>/dev/null; then
    nvidia-smi >/dev/null 2>&1 || true
  fi
}

load_and_verify_modules() {
  section "Loading and Verifying NVIDIA Modules"
  if [[ "${DRY_RUN}" -eq 1 ]]; then
    info "[dry-run] Would load and verify nvidia modules"
    return
  fi

  local modules=(nvidia nvidia_modeset nvidia_drm nvidia_uvm)
  local failed=()
  for mod in "${modules[@]}"; do
    if lsmod | grep -q "^${mod}[[:space:]]"; then
      ok "Module loaded: $mod"
    else
      info "Loading module: $mod"
      if modprobe "$mod" 2>/tmp/modprobe_err_${mod}.log; then
        ok "Module loaded: $mod"
      else
        fail "Module failed: $mod (see /tmp/modprobe_err_${mod}.log)"
        failed+=("$mod")
      fi
    fi
  done

  if [[ ${#failed[@]} -gt 0 ]]; then
    warn "Some modules failed to load: ${failed[*]}"
    warn "This is expected if a reboot is needed after DKMS build."
    warn "Reboot and re-run with --check-only to verify."
  fi
}

run_nvidia_smi_check() {
  if [[ "${DRY_RUN}" -eq 1 ]]; then
    info "[dry-run] Would run nvidia-smi"
    return
  fi
  section "nvidia-smi Verification"
  if command -v nvidia-smi &>/dev/null; then
    if nvidia-smi; then
      ok "nvidia-smi successful"
    else
      warn "nvidia-smi returned an error — reboot may be required"
    fi
  else
    warn "nvidia-smi not found — driver may need a reboot to activate"
  fi
}

# -------------------------------------------------------------------------------------
# BOOT PIN HELPERS
# -------------------------------------------------------------------------------------
get_pinned_kernel() {
  if command -v proxmox-boot-tool &>/dev/null; then
    proxmox-boot-tool kernel list 2>/dev/null \
      | awk '/^Pinned kernel:/{getline; gsub(/^[ \t]+/,""); print; exit}'
  fi
}

pin_running_kernel() {
  section "Pinning Boot Kernel (Guard Layer 3)"
  local rk; rk="$(uname -r)"

  if ! command -v proxmox-boot-tool &>/dev/null; then
    warn "proxmox-boot-tool not found — cannot auto-pin."
    warn "Ensure your bootloader defaults to ${rk} manually."
    return
  fi

  # Never pin a kernel that has no built NVIDIA module.
  if ! dkms status 2>/dev/null | grep -i nvidia | grep -F "$rk" | grep -q installed; then
    warn "Running kernel ${rk} has no installed nvidia DKMS module — NOT pinning."
    warn "Resolve the driver build first, then pin manually:"
    warn "  proxmox-boot-tool kernel pin ${rk}"
    return
  fi

  local current_pin; current_pin="$(get_pinned_kernel || true)"
  if [[ "$current_pin" == "$rk" ]]; then
    ok "Boot kernel already pinned to ${rk}"
    return
  fi

  if [[ "${DRY_RUN}" -eq 1 ]]; then
    info "[dry-run] proxmox-boot-tool kernel pin ${rk}"
    return
  fi

  info "Pinning running kernel: ${rk}"
  if proxmox-boot-tool kernel pin "${rk}"; then
    ok "Boot kernel pinned to ${rk}"
  else
    warn "Pin command returned an error; boot kernel is NOT confirmed pinned."
  fi
}

# -------------------------------------------------------------------------------------
# GUARD LAYER 1 — PREFLIGHT GATE
# -------------------------------------------------------------------------------------
# Returns 0 if safe, 2 if an unapproved kernel series would be installed.
do_preflight() {
  section "Preflight — Kernel Series Guard (Layer 1)"
  info "Approved kernel series: ${APPROVED_KERNEL_SERIES[*]}"
  info "Simulating: apt-get full-upgrade ..."

  local sim
  sim="$(LANG=C apt-get -s full-upgrade 2>/dev/null || true)"

  local incoming
  incoming="$(echo "$sim" | awk '/^Inst /{print $2}' || true)"

  local kernel_incoming=() bad=() selector_moving=0
  while IFS= read -r pkg; do
    [[ -z "$pkg" ]] && continue
    case "$pkg" in
      proxmox-default-kernel|proxmox-default-headers) selector_moving=1 ;;
    esac
    if [[ "$pkg" == proxmox-kernel-* || "$pkg" == proxmox-headers-* ]]; then
      kernel_incoming+=("$pkg")
      local s; s="$(kernel_series_from_pkg "$pkg")"
      if [[ -n "$s" ]] && ! is_approved_series "$s"; then
        bad+=("$pkg (series ${s})")
      fi
    fi
  done <<< "$incoming"

  if [[ ${#kernel_incoming[@]} -gt 0 ]]; then
    info "Kernel/header packages this upgrade would install:"
    for p in "${kernel_incoming[@]}"; do info "  $p"; done
  else
    ok "No kernel or header packages in this upgrade."
  fi

  if [[ "$selector_moving" -eq 1 ]]; then
    warn "proxmox-default-kernel/headers would change — the DEFAULT series is moving."
    warn "This is exactly how the 7.0 series jump arrived. Inspect carefully."
  fi

  if [[ ${#bad[@]} -gt 0 ]]; then
    echo ""
    error "=================================================================="
    error " BLOCKED — upgrade would install an UNAPPROVED kernel series"
    error "=================================================================="
    for b in "${bad[@]}"; do error "  $b"; done
    error ""
    error "Approved series: ${APPROVED_KERNEL_SERIES[*]}"
    error "An unapproved series will FAIL the NVIDIA DKMS build, can wedge dpkg,"
    error "and can leave a bootable-but-broken kernel as the GRUB default."
    error ""
    error "DO NOT run 'apt full-upgrade' until you have either:"
    error "  (a) applied the guard:   sudo $SCRIPT_NAME --apply-guard"
    error "      (holds the offending packages so apt keeps them back), OR"
    error "  (b) CONFIRMED NVIDIA supports the new series, then approved it"
    error "      by editing APPROVED_KERNEL_SERIES and running --remove-guard."
    error "=================================================================="
    return 2
  fi

  ok "Preflight PASSED — no unapproved kernel series incoming."
  ok "Safe to proceed with: apt full-upgrade"
  return 0
}

# -------------------------------------------------------------------------------------
# GUARD LAYER 2 — APT HOLDS  (+ triggers Layer 3 pin)
# -------------------------------------------------------------------------------------
build_hold_list() {
  # Echo (one per line) the packages that should be held:
  #   - the selector meta-packages (always)
  #   - any installed kernel/header packages from an unapproved series
  local p
  for p in "${SELECTOR_PACKAGES[@]}"; do echo "$p"; done
  while IFS= read -r pkg; do
    [[ -z "$pkg" ]] && continue
    local s; s="$(kernel_series_from_pkg "$pkg")"
    [[ -z "$s" ]] && continue
    if ! is_approved_series "$s"; then echo "$pkg"; fi
  done < <(dpkg-query -W -f='${Package}\n' 'proxmox-kernel-*' 'proxmox-headers-*' 2>/dev/null | sort -u || true)
}

apply_kernel_guard() {
  section "Applying Kernel Series Guard (Layers 2 + 3)"
  info "Approved kernel series: ${APPROVED_KERNEL_SERIES[*]}"

  local raw uniq=() seen=" "
  raw="$(build_hold_list)"
  while IFS= read -r p; do
    [[ -z "$p" ]] && continue
    if [[ "$seen" != *" $p "* ]]; then uniq+=("$p"); seen+="$p "; fi
  done <<< "$raw"

  if [[ ${#uniq[@]} -eq 0 ]]; then
    warn "No packages resolved for holding (unexpected)."
  else
    info "Packages to hold (apt will keep these back on upgrade):"
    for p in "${uniq[@]}"; do info "  $p"; done
    if [[ "${DRY_RUN}" -eq 1 ]]; then
      info "[dry-run] apt-mark hold ${uniq[*]}"
    else
      if apt-mark hold "${uniq[@]}"; then
        ok "Holds applied. 'proxmox-default-kernel' will now show as kept-back."
      else
        warn "Some holds may have failed (package not installed?). Verify with: apt-mark showhold"
      fi
      ok "6.17 security point-updates STILL flow (proxmox-kernel-6.17 is not held)."
    fi
  fi

  pin_running_kernel
}

# -------------------------------------------------------------------------------------
# GUARD — REMOVE
# -------------------------------------------------------------------------------------
remove_kernel_guard() {
  section "Removing Kernel Series Guard"
  warn "This lifts the apt holds that protect you from an unsupported kernel series."
  warn "Only do this once you have CONFIRMED NVIDIA builds against the new series"
  warn "AND added that series to APPROVED_KERNEL_SERIES."
  echo ""

  local held
  held="$(apt-mark showhold 2>/dev/null | grep -E '^proxmox-(default|kernel|headers)-' || true)"
  if [[ -z "$held" ]]; then
    ok "No matching proxmox kernel/header holds present."
    return
  fi

  info "Currently held:"
  echo "$held" | while read -r p; do info "  $p"; done
  echo ""

  if [[ "${DRY_RUN}" -eq 1 ]]; then
    info "[dry-run] apt-mark unhold ${held//$'\n'/ }"
    return
  fi

  read -r -p "Type 'unhold' to confirm lifting these holds: " confirm
  [[ "$confirm" == "unhold" ]] || { info "Cancelled — holds left in place."; return; }

  # shellcheck disable=SC2086
  if apt-mark unhold $held; then
    ok "Holds removed. Re-run --preflight BEFORE upgrading."
  else
    warn "Some unholds may have failed. Verify with: apt-mark showhold"
  fi
  info "The boot-kernel pin (if set) is unchanged — manage via proxmox-boot-tool."
}

# -------------------------------------------------------------------------------------
# GUARD — STATUS
# -------------------------------------------------------------------------------------
guard_status() {
  section "Kernel Guard Status"

  echo ""
  echo "-- Approved Kernel Series -------------------------------"
  info "${APPROVED_KERNEL_SERIES[*]}"

  echo ""
  echo "-- Running Kernel ---------------------------------------"
  info "$(uname -r)"

  echo ""
  echo "-- apt Holds (Layer 2) ----------------------------------"
  local holds
  holds="$(apt-mark showhold 2>/dev/null | grep -E '^proxmox-(default|kernel|headers)-' || true)"
  if [[ -n "$holds" ]]; then
    echo "$holds" | while read -r p; do ok "$p [held]"; done
  else
    fail "No proxmox kernel/header holds set — guard Layer 2 NOT active."
    warn "Apply with: sudo $SCRIPT_NAME --apply-guard"
  fi

  echo ""
  echo "-- Boot Kernel Pin (Layer 3) ----------------------------"
  if command -v proxmox-boot-tool &>/dev/null; then
    local pinned; pinned="$(get_pinned_kernel || true)"
    if [[ -n "$pinned" && "$pinned" != "None." ]]; then
      ok "Pinned kernel: ${pinned}"
      [[ "$pinned" == "$(uname -r)" ]] || warn "Pinned kernel differs from running kernel."
    else
      fail "No boot kernel pinned — guard Layer 3 NOT active."
      warn "Apply with: sudo $SCRIPT_NAME --apply-guard"
    fi
  else
    skip "proxmox-boot-tool not available — cannot report pin state."
  fi

  echo ""
  echo "-- Installed Proxmox Kernels ----------------------------"
  local pk
  while IFS= read -r pk; do
    [[ -z "$pk" ]] && continue
    local s; s="$(kernel_series_from_pkg "$pk")"
    if [[ -n "$s" ]] && is_approved_series "$s"; then
      ok "$pk  (series ${s}, approved)"
    elif [[ -n "$s" ]]; then
      warn "$pk  (series ${s}, NOT approved — should be held)"
    fi
  done < <(dpkg-query -W -f='${Package}\n' 'proxmox-kernel-[0-9]*' 2>/dev/null | sort -u || true)
}

# -------------------------------------------------------------------------------------
# PILLAR 3 — LXC CONTAINER CONFIGURATION
# -------------------------------------------------------------------------------------
print_lxc_config() {
  local vmid="${1:-<VMID>}"
  echo ""
  echo "======================================================="
  echo "  LXC GPU Passthrough Configuration"
  echo "  Container: ${vmid}"
  echo "  File: /etc/pve/lxc/${vmid}.conf"
  echo "======================================================="
  echo ""
  echo "# NVIDIA GPU passthrough — generated by setup-gpu-pxe.sh v${SCRIPT_VERSION}"
  echo "# Uses Proxmox 8.1+ dev* syntax (handles device type detection automatically)"
  echo "# gid=${LXC_VIDEO_GID} must match the 'video' group inside the container (44 on Debian-based systems)"
  echo "# Verify with: getent group video  (inside the container)"
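  echo "# Drop dev5/dev6 if /dev/nvidia-caps/* is absent on the host;"
  echo "# a container may fail to start when a listed device node is missing."
  echo "#"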
  echo "dev0: /dev/nvidia0,gid=${LXC_VIDEO_GID}"
  echo "dev1: /dev/nvidiactl,gid=${LXC_VIDEO_GID}"
  echo "dev2: /dev/nvidia-modeset,gid=${LXC_VIDEO_GID}"
  echo "dev3: /dev/nvidia-uvm,gid=${LXC_VIDEO_GID}"
  echo "dev4: /dev/nvidia-uvm-tools,gid=${LXC_VIDEO_GID}"
  echo "dev5: /dev/nvidia-caps/nvidia-cap1,gid=${LXC_VIDEO_GID}"
  echo "dev6: /dev/nvidia-caps/nvidia-cap2,gid=${LXC_VIDEO_GID}"
  echo ""
  echo "# Also recommended: container startup delay to allow host nodes to settle"
  echo "# startup: order=2,up=15"
  echo ""
  echo "======================================================="
  echo ""
  echo "NOTES:"
  echo "  1. Do NOT install the full NVIDIA driver package inside the container."
  echo "     Only install userspace libraries (libnvidia-compute-*) if needed."
  echo "  2. The dev* directive handles cgroup2 permissions automatically."
  echo "     You do NOT need separate lxc.cgroup2.devices.allow lines."
  echo "  3. If you have multiple GPUs (e.g. dual RTX), increment the dev* index"
  echo "     and add: dev7: /dev/nvidia1,gid=${LXC_VIDEO_GID}"
  echo "  4. After applying config, restart the container:"
  echo "     pct stop ${vmid} && pct start ${vmid}"
  echo "  5. Verify inside the container:"
  echo "     ls -la /dev/nvidia*"
  echo "     nvidia-smi"
  echo ""
}

# -------------------------------------------------------------------------------------
# PURGE
# -------------------------------------------------------------------------------------
do_purge() {
  section "Purging All NVIDIA Components"
  warn "This will remove ALL NVIDIA packages and configuration."
  read -r -p "Are you sure? (yes/no): " confirm
  [[ "$confirm" == "yes" ]] || { info "Purge cancelled."; exit 0; }

  run "systemctl stop nvidia-setup-nodes.service 2>/dev/null || true"
  run "systemctl disable nvidia-setup-nodes.service 2>/dev/null || true"
  run "systemctl stop nvidia-persistenced 2>/dev/null || true"
  run "systemctl disable nvidia-persistenced 2>/dev/null || true"

  run "rm -f '${NODE_SERVICE}' '${NODE_SCRIPT}' '${UVM_RULE_FILE}' '${MODULES_LOAD_FILE}' '${NOUVEAU_BLACKLIST}'"

  if [[ "${DRY_RUN}" -eq 0 ]]; then
    systemctl daemon-reload
    udevadm control --reload-rules 2>/dev/null || true
  fi

  local nvidia_pkgs
  nvidia_pkgs="$(dpkg -l '*nvidia*' '*libnvidia*' 2>/dev/null | awk '/^ii/{print $2}' || true)"

  if [[ -n "$nvidia_pkgs" ]]; then
    info "Removing packages:"
    echo "$nvidia_pkgs" | while read -r p; do info "  $p"; done
    if [[ "${DRY_RUN}" -eq 0 ]]; then
      # shellcheck disable=SC2086
      DEBIAN_FRONTEND=noninteractive apt-get purge -y $nvidia_pkgs || true
      DEBIAN_FRONTEND=noninteractive apt-get autoremove -y || true
    else
      info "[dry-run] apt-get purge -y ${nvidia_pkgs//$'\n'/ }"
    fi
  fi

  ok "Purge complete. You may want to reboot."
  info "Note: --purge does NOT touch kernel guard holds. Use --remove-guard for those."
}

# -------------------------------------------------------------------------------------
# HEALTH CHECK / --check-only
# -------------------------------------------------------------------------------------
do_check() {
  section "NVIDIA System Health Check"

  # Warm the GPU first so idle-unloaded modules reload before we sample state.
  warm_up_gpu
  local smi_ok=0
  if command -v nvidia-smi &>/dev/null && nvidia-smi &>/dev/null; then
    smi_ok=1
  fi

  echo ""
  echo "-- Running Kernel ---------------------------------------"
  info "$(uname -r)"

  echo ""
  echo "-- Kernel Modules ---------------------------------------"
  for mod in nvidia nvidia_modeset nvidia_drm nvidia_uvm; do
    if lsmod | grep -q "^${mod}[[:space:]]"; then
      ok "$mod loaded"
    elif [[ "$smi_ok" -eq 1 ]]; then
      skip "$mod not currently loaded (driver functional; loads on access)"
    else
      fail "$mod NOT loaded"
    fi
  done

  echo ""
  echo "-- Device Nodes -----------------------------------------"
  for dev in /dev/nvidia0 /dev/nvidiactl /dev/nvidia-modeset /dev/nvidia-uvm /dev/nvidia-uvm-tools; do
    if [[ -c "$dev" ]]; then
      local info_str; info_str="$(ls -la "$dev" 2>/dev/null)"
      ok "$dev  ->  $info_str"
    elif [[ -e "$dev" ]]; then
      fail "$dev EXISTS but is NOT a character device (stub file — timing bug)"
    else
      fail "$dev MISSING"
    fi
  done
  for cap in /dev/nvidia-caps/nvidia-cap1 /dev/nvidia-caps/nvidia-cap2; do
    if [[ -c "$cap" ]]; then
      ok "$cap  ->  $(ls -la "$cap" 2>/dev/null)"
    else
      skip "$cap not found (normal on GPUs without MIG/caps support)"
    fi
  done

  echo ""
  echo "-- systemd Services -------------------------------------"
  for svc in nvidia-setup-nodes.service nvidia-persistenced.service; do
    local state; state="$(systemctl is-active "$svc" 2>/dev/null || echo "inactive")"
    local enabled; enabled="$(systemctl is-enabled "$svc" 2>/dev/null || echo "disabled")"
    if [[ "$state" == "active" && "$enabled" == "enabled" ]]; then
      ok "$svc  [active/enabled]"
    else
      fail "$svc  [${state}/${enabled}]"
    fi
  done

  echo ""
  echo "-- DKMS Status (running kernel = $(uname -r)) -----------"
  if command -v dkms &>/dev/null; then
    local rk; rk="$(uname -r)"
    dkms status | while IFS= read -r line; do
      if echo "$line" | grep -q "installed"; then
        ok "$line"
      else
        fail "$line"
      fi
    done
    if dkms status 2>/dev/null | grep -i nvidia | grep -F "$rk" | grep -q "installed"; then
      ok "nvidia module present for running kernel ${rk}"
    else
      fail "no installed nvidia module for running kernel ${rk} — run a full install or --force-rebuild"
    fi
  else
    skip "dkms not installed"
  fi

  echo ""
  echo "-- nvidia-smi -------------------------------------------"
  if command -v nvidia-smi &>/dev/null; then
    nvidia-smi || warn "nvidia-smi failed"
  else
    fail "nvidia-smi not found"
  fi

  echo ""
  echo "-- Kernel Guard -----------------------------------------"
  local holds pin
  holds="$(apt-mark showhold 2>/dev/null | grep -E '^proxmox-(default|kernel|headers)-' || true)"
  if [[ -n "$holds" ]]; then
    ok "apt holds active (Layer 2):"
    echo "$holds" | while read -r p; do ok "  $p"; done
  else
    fail "No proxmox kernel/header apt holds — guard Layer 2 NOT active (run --apply-guard)"
  fi
  if command -v proxmox-boot-tool &>/dev/null; then
    pin="$(get_pinned_kernel || true)"
    if [[ -n "$pin" && "$pin" != "None." ]]; then
      ok "Boot kernel pinned (Layer 3): ${pin}"
    else
      fail "No boot kernel pinned — guard Layer 3 NOT active (run --apply-guard)"
    fi
  fi

  echo ""
  echo "-- Installed Files --------------------------------------"
  for f in "$NODE_SERVICE" "$NODE_SCRIPT" "$UVM_RULE_FILE" "$NOUVEAU_BLACKLIST" "$MODULES_LOAD_FILE"; do
    if [[ -f "$f" ]]; then
      ok "$f"
    else
      fail "$f MISSING"
    fi
  done

  echo ""
  print_lxc_config "YOUR_VMID"
}

# -------------------------------------------------------------------------------------
# FORCE REBUILD
# -------------------------------------------------------------------------------------
do_force_rebuild() {
  section "Force Rebuilding DKMS Modules"
  if [[ "${DRY_RUN}" -eq 1 ]]; then
    info "[dry-run] dkms autoinstall --force"
    return
  fi
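  info "Forcing DKMS rebuild for the running kernel ($(uname -r))..."
  # Capture the dkms exit status itself (not tee's) so a failed build is reported.
  local rc=0
  dkms autoinstall --force > >(tee /tmp/dkms_rebuild.log) 2>&1 || rc=$?
  if [[ "$rc" -eq 0 ]]; then
    ok "Force rebuild complete. Check /tmp/dkms_rebuild.log for details."
  else
    fail "dkms autoinstall exited with code ${rc}; see /tmp/dkms_rebuild.log"
  fi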
}

# -------------------------------------------------------------------------------------
# HELP
# -------------------------------------------------------------------------------------
show_help() {
  cat <<EOF

${BOLD}setup-gpu-pxe.sh v${SCRIPT_VERSION}${RESET}
Proxmox VE NVIDIA Driver Host Installer + LXC GPU Configurator + Kernel Guard

${BOLD}USAGE${RESET}
  sudo $SCRIPT_NAME [OPTION]

${BOLD}INSTALL / VERIFY${RESET}
  (none)              Full install — drivers, services, nodes, THEN applies guard
  --check-only        System + GPU health (also reports guard status)
  --force-rebuild     Force DKMS module rebuild (after a kernel update)
  --lxc-config [ID]   Print LXC container config snippet (optional VMID)

${BOLD}KERNEL GUARD${RESET}
  --preflight         Simulate full-upgrade; BLOCK if an unapproved series is
                      incoming. Run this BEFORE 'apt full-upgrade'.
  --apply-guard       apt-mark hold the series selector + unapproved-series
                      packages, then pin the running kernel.
  --guard-status      Show approved series, apt holds, and boot pin.
  --remove-guard      Lift the apt holds (after approving a new series).

${BOLD}MAINTENANCE${RESET}
  --purge             Remove all NVIDIA components (does not touch holds)
  --dry-run           Preview any action without making changes
  --help              Show this help

${BOLD}APPROVED KERNEL SERIES${RESET}  (edit APPROVED_KERNEL_SERIES near the top)
  ${APPROVED_KERNEL_SERIES[*]}

${BOLD}KERNEL UPDATE SOP (manual, production)${RESET}
  1. apt update
  2. sudo $SCRIPT_NAME --preflight      <-- hard gate; stop if blocked
  3. apt full-upgrade                   (full-upgrade, NOT upgrade)
  4. dkms status                        confirm new kernel = installed
  5. reboot
  6. sudo $SCRIPT_NAME --check-only
  7. sudo $SCRIPT_NAME --apply-guard    re-pin the now-running kernel
  8. start passthrough LXCs; nvidia-smi inside each

EOF
}

# -------------------------------------------------------------------------------------
# ARGUMENT PARSING
# -------------------------------------------------------------------------------------
parse_args() {
  while [[ $# -gt 0 ]]; do
    case "$1" in
      --dry-run)        DRY_RUN=1 ;;
      --purge)          PURGE=1 ;;
      --force-rebuild)  FORCE_REBUILD=1 ;;
      --check-only)     CHECK_ONLY=1 ;;
      --preflight)      PREFLIGHT_ONLY=1 ;;
      --apply-guard)    APPLY_GUARD_ONLY=1 ;;
      --remove-guard)   REMOVE_GUARD_ONLY=1 ;;
      --guard-status)   GUARD_STATUS_ONLY=1 ;;
      --lxc-config)
        LXC_CONFIG_ONLY=1
        if [[ $# -gt 1 && "$2" =~ ^[0-9]+$ ]]; then
          LXC_VMID="$2"; shift
        fi
        ;;
      --help|-h)        show_help; exit 0 ;;
      *)
        error "Unknown argument: $1"
        show_help
        exit 1
        ;;
    esac
    shift
  done
}

# -------------------------------------------------------------------------------------
# CLEANUP TRAP
# -------------------------------------------------------------------------------------
cleanup() {
  local exit_code=$?
  if [[ $exit_code -ne 0 && $exit_code -ne 2 ]]; then
    warn "Script exited with code ${exit_code}."
    warn "Partial installation may have occurred."
    warn "Check the output above, then re-run or use --purge to reset."
  fi
}
trap cleanup EXIT

# -------------------------------------------------------------------------------------
# MAIN
# -------------------------------------------------------------------------------------
main() {
  parse_args "$@"
  check_root "$@"

  echo ""
  echo -e "${BOLD}${CYAN}setup-gpu-pxe.sh v${SCRIPT_VERSION} — Proxmox VE NVIDIA Driver Installer + Kernel Guard${RESET}"
  echo -e "Target: Proxmox VE 9.x / Debian 13 (Trixie)"
  echo -e "Approved kernel series: ${APPROVED_KERNEL_SERIES[*]}"
  [[ "${DRY_RUN}" -eq 1 ]] && echo -e "${YELLOW}*** DRY RUN MODE — no changes will be made ***${RESET}"
  echo ""

  # Short-circuit modes
  if [[ "${LXC_CONFIG_ONLY}" -eq 1 ]]; then
    print_lxc_config "${LXC_VMID}"; exit 0
  fi
  if [[ "${PREFLIGHT_ONLY}" -eq 1 ]]; then
    do_preflight || exit $?     # exit 2 on block
    exit 0
  fi
  if [[ "${GUARD_STATUS_ONLY}" -eq 1 ]]; then
    guard_status; exit 0
  fi
  if [[ "${APPLY_GUARD_ONLY}" -eq 1 ]]; then
    apply_kernel_guard; exit 0
  fi
  if [[ "${REMOVE_GUARD_ONLY}" -eq 1 ]]; then
    remove_kernel_guard; exit 0
  fi
  if [[ "${CHECK_ONLY}" -eq 1 ]]; then
    do_check; exit 0
  fi
  if [[ "${PURGE}" -eq 1 ]]; then
    do_purge; exit 0
  fi
  if [[ "${FORCE_REBUILD}" -eq 1 ]]; then
    do_force_rebuild; exit 0
  fi

  # Full install sequence
  check_prerequisites
  detect_nvidia_gpu
  check_secure_boot
  blacklist_nouveau
  ensure_debian_sources         # non-free + trixie-backports
  install_nvidia_backports      # nvidia-kernel-dkms + nvidia-driver + persistenced
  build_dkms_all_kernels        # verify DKMS + self-heal running kernel
  configure_module_autoload
  setup_device_stability        # Pillar 2 — boot ordering + persistence
  load_and_verify_modules
  run_nvidia_smi_check
  apply_kernel_guard            # Guard Layers 2 + 3

  section "Installation Complete"
  echo ""
  ok "NVIDIA drivers installed (trixie-backports)"
  ok "nvidia-setup-nodes.service enabled (runs before LXC at boot)"
  ok "nvidia-persistenced installed and enabled"
  ok "Kernel guard applied (apt holds + boot pin)"
  echo ""
  info "Note: a normal re-run VERIFIES the DKMS build and rebuilds only if the"
  info "running kernel lacks a module. Before EVERY 'apt full-upgrade', run:"
  info "  sudo $SCRIPT_NAME --preflight"
  echo ""
  echo -e "${BOLD}NEXT STEPS:${RESET}"
  echo "  1. Reboot the Proxmox host (if this was a fresh driver install)"
  echo "  2. Verify: sudo $SCRIPT_NAME --check-only"
  echo "  3. Get container config: sudo $SCRIPT_NAME --lxc-config <VMID>"
  echo "  4. Apply config to /etc/pve/lxc/<VMID>.conf"
  echo "  5. Restart containers: pct stop <VMID> && pct start <VMID>"
  echo ""
  print_lxc_config "YOUR_VMID"
}

main "$@"
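
If you only want a quick look at what an upgrade is about to drag in, without running the whole script, the preflight idea can be boiled down to a few lines. This is a minimal sketch, not the script's own code: the function name incoming_kernel_series is hypothetical, and it assumes the same "Inst <package> ..." lines that the preflight gate reads from a simulated apt-get run, plus the proxmox-kernel-X.Y / proxmox-headers-X.Y package naming.

```shell
#!/bin/bash
# Sketch only: print the distinct kernel series (e.g. 6.17, 7.0) that a
# simulated upgrade would install. Reads 'apt-get -s full-upgrade' output
# on stdin. The function name is hypothetical, not part of setup-gpu-pxe.sh.
incoming_kernel_series() {
  # apt prints one "Inst <pkg> ..." line per package it would install.
  awk '/^Inst /{print $2}' \
    | sed -nE 's/^proxmox-(kernel|headers)-([0-9]+\.[0-9]+)([.-].*)?$/\2/p' \
    | sort -u
}
```

Typical use would be: LANG=C apt-get -s full-upgrade | incoming_kernel_series. Anything it prints that is not in APPROVED_KERNEL_SERIES deserves a closer look before you upgrade.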

This should work fine, but be advised that I have a dated card whose drivers are not supported on the new 7.x kernel series.
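
To see where your own card stands before committing to a reboot, something like the following may help. It is only a sketch: nvidia_dkms_ready is a hypothetical helper, not part of the script above, and it assumes dkms status prints one line per module and kernel, with the word "installed" when a build is in place.

```shell
#!/bin/bash
# Sketch only: succeed when 'dkms status' output (read from stdin) shows an
# NVIDIA module installed for the given kernel release.
# The function name is hypothetical, not part of setup-gpu-pxe.sh.
nvidia_dkms_ready() {
  local kernel="$1"
  # Keep only NVIDIA lines, then only lines for this kernel, then test state.
  grep -i 'nvidia' | grep -F -- "$kernel" | grep -q 'installed'
}
```

For example: dkms status | nvidia_dkms_ready "$(uname -r)" && echo "driver built for the running kernel". Pass a different release string to check a kernel you have installed but not yet booted.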

#enoughsaid