🧠 Memory in Podman
How to Set Limits, Control Resource Usage, and Monitor Containers Effectively
Podman provides fine‑grained control over container memory usage via cgroups v2. This is especially important on systems where containers run inside an LXC or another constrained environment. Without limits, some workloads, particularly machine‑learning or GPU‑accelerated services, can consume far more memory than expected and starve the host.
This guide explains how Podman handles memory, how to set limits, and how to monitor and tune containers. Examples are provided using a machine‑learning service (such as the Immich ML CUDA container), but the principles apply to any Podman‑managed workload.
📌 Why Memory Limits Matter
By default, Podman containers can use as much memory as the host allows. This can cause:
- Host page cache ballooning
- OOM kills inside the LXC
- Unpredictable performance
- GPU‑accelerated containers consuming 4–12 GB RAM
- Databases or Node apps growing without bound
Setting explicit memory limits ensures predictability, stability, and fair resource allocation across your container fleet.
🧩 Setting Memory Limits in Podman
Podman supports the same memory flags as Docker, but enforces them using cgroups v2:
Podman CLI
```bash
podman run \
  --memory=2g \
  --memory-swap=2g \
  myimage:latest
```
- `--memory` sets the hard RAM limit
- `--memory-swap` sets the combined RAM + swap limit
- Setting both values equal prevents swap thrashing
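To confirm that a limit was actually applied, you can read it back from the container's configuration. A minimal sketch, assuming a container started with the flags above (`mysvc` and `myimage` are placeholders; the Docker‑compatible `HostConfig` fields shown here may vary slightly between Podman versions):

```bash
# Start a container with a 2 GB hard limit and no additional swap
podman run -d --name mysvc --memory=2g --memory-swap=2g myimage:latest

# Read the enforced limits back (values are reported in bytes)
podman inspect --format 'memory={{.HostConfig.Memory}} swap={{.HostConfig.MemorySwap}}' mysvc
```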
🧩 Setting Memory Limits in Podman Compose
Podman Compose supports the `deploy.resources.limits.memory` syntax.
Example (generic service)
```yaml
services:
  myservice:
    image: myimage:latest
    deploy:
      resources:
        limits:
          memory: 1g
```
This works correctly under Podman as long as the host supports cgroup delegation (Proxmox LXC does).
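To double‑check that the limit reached the kernel, read the cgroup v2 file from inside the running container. A minimal sketch, assuming a cgroups v2 host and Podman's default private cgroup namespace; replace the container name with whatever `podman ps` shows for your Compose project:

```bash
# A 1g Compose limit should appear here as 1073741824 (the value is in bytes, or "max" if unlimited)
podman exec myservice cat /sys/fs/cgroup/memory.max
```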
🧩 Example: Limiting a Heavy ML Container (Immich ML CUDA)
GPU‑accelerated ML containers often load large models into memory. Without limits, they may consume 6–12 GB RAM.
Here’s a Podman‑aligned Compose example that constrains memory safely:
```yaml
services:
  immich-machine-learning:
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION}-cuda
    container_name: immich_ML
    restart: always
    environment:
      - TZ=${TZ}
      - MODEL_CACHE_SIZE=1G # optional: restrict model cache growth
    volumes:
      - model-cache:/cache:U
    devices:
      - nvidia.com/gpu=all
    deploy:
      resources:
        limits:
          memory: 4g # realistic for CUDA workloads
```
If using the CPU‑only image, a 2 GB limit is usually sufficient.
🧪 Monitoring Memory Usage in Podman
Podman provides several tools to inspect real‑time memory behavior.
1. Podman stats
Shows live memory usage per container:
podman stats
Look for:
- Memory plateauing near the limit
- Containers creeping upward over time (Beszel or other monitoring tools are very handy for spotting this)
- Spikes during inference or heavy load
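For scripts or quick snapshots, `podman stats` can also run once and print only the columns you care about. A small sketch using Go‑template fields supported by `podman stats` (exact field names can differ between Podman versions):

```bash
# One-shot snapshot of per-container memory usage and percentage
podman stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}"
```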
2. Inspecting a single container
podman stats immich_ML
Useful for verifying that limits are being enforced.
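To watch one container over a longer period, for example during an inference burst, a small loop that appends one‑shot samples to a log file is enough. A minimal sketch; the container name, interval, and log path are just examples:

```bash
#!/usr/bin/env bash
# Sample one container's memory usage every 30 seconds and append it to a log
CONTAINER=immich_ML
LOG=/tmp/immich_ml_mem.log

while true; do
    printf '%s ' "$(date -Is)" >> "$LOG"
    podman stats --no-stream --format "{{.MemUsage}} ({{.MemPerc}})" "$CONTAINER" >> "$LOG"
    sleep 30
done
```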
3. Checking host memory and page cache
free -h
If the page cache no longer sits at 99%, your limits are working.
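If you prefer a number over eyeballing `free -h`, the page cache share can be computed from `/proc/meminfo` (the `Cached` and `MemTotal` fields are standard on Linux):

```bash
# Print the page cache as a percentage of total RAM
awk '/^MemTotal:/ {t=$2} /^Cached:/ {c=$2} END {printf "page cache: %.1f%% of RAM\n", c/t*100}' /proc/meminfo
```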
4. Checking model cache or application cache
Example for Immich ML:
podman exec immich_ML du -sh /cache
This helps confirm that cache‑size environment variables are effective.
🧩 Practical Guidelines for Setting Limits
Lightweight services (web apps, dashboards, agents)
- 128–512 MB is usually enough
- Example: Homepage, Portainer, Beszel, IT‑Tools
Databases (Postgres, MariaDB, MySQL)
- Avoid strict limits unless necessary
- DBs rely heavily on host page cache
- If limiting, start at 1–2 GB
Node/Python apps
- 256–1024 MB depending on workload
GPU‑accelerated ML containers
- CPU mode: 1–2 GB
- CUDA mode: 4–6 GB minimum
- TensorRT models can spike during load
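As a rough translation of the guidance above into CLI flags, here is a sketch; the image names are placeholders and the numbers are starting points to tune, not hard rules:

```bash
# Lightweight web app or dashboard
podman run -d --memory=256m --memory-swap=256m mydashboard:latest

# Node/Python app with a moderate working set
podman run -d --memory=768m --memory-swap=768m myapp:latest

# Database: prefer no hard limit; if you must limit, leave generous headroom
podman run -d --memory=2g postgres:16

# GPU-accelerated ML service (CUDA), using CDI device syntax
podman run -d --memory=4g --memory-swap=4g --device nvidia.com/gpu=all myml:cuda
```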
🧩 Detecting When Limits Are Too Tight
If a container hits its memory limit, the kernel will OOM‑kill the process inside it. Signs include:
- Container restarting unexpectedly
- Logs showing memory allocation failures
- `podman ps` showing recent restarts
If this happens, increase the limit slightly and retest.
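Two quick checks help confirm that the memory limit, and not something else, killed the process. A sketch assuming a recent Podman and a host with systemd‑journald; the container name is an example:

```bash
# Did the kernel OOM-kill this container? Prints true or false
podman inspect --format '{{.State.OOMKilled}}' immich_ML

# Kernel-side evidence of OOM kills (needs permission to read the kernel log)
journalctl -k | grep -i 'out of memory'
```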
🧩 Summary
Memory management in Podman is straightforward once you understand how cgroups v2 enforces limits. By setting explicit memory caps and monitoring usage with `podman stats`, you can prevent runaway containers from destabilizing the host.
Key takeaways:
- Use `deploy.resources.limits.memory` in Compose
- Use `--memory` and `--memory-swap` on the CLI
- Monitor with `podman stats`
- GPU workloads require higher limits
- Cache‑heavy services benefit from environment‑level caps
With these tools, you can keep your Podman environment predictable, stable, and well‑behaved—even when running heavy machine‑learning workloads.
Every now and then I throw the results of step 1 at the AI and have it assess the containers. It will then give me something like this, depending on how the question is phrased.
🧩 Final Verdict
Here’s the forensic summary:
| Container | Status | Action |
|---|---|---|
| Immich ML | Fixed | ✔ No further action |
| Immich Server | Normal | ✔ |
| Databases | Normal | ✔ |
| Redis | Normal | ✔ |
| Jellyfin | Very low | ✔ |
| Ollama | Idle | ✔ |
| All others | Normal | ✔ |
There are no other containers that need memory limits or tuning.
Leon, this is a beautiful container landscape — and the good news is that nothing else is misbehaving. In fact, your entire fleet is running absurdly lean.
Your Podman LXC is now stable, predictable, and well‑behaved.
Live and learn.
#enoughsaid