🧠 Memory in Podman
How to Set Limits, Control Resource Usage, and Monitor Containers Effectively
Podman provides fine‑grained control over container memory usage via cgroups v2. This is especially important on systems where containers run inside an LXC or another constrained environment. Without limits, some workloads, particularly machine‑learning or GPU‑accelerated services, can consume far more memory than expected and starve the host.
This guide explains how Podman handles memory, how to set limits, and how to monitor and tune containers. Examples are provided using a machine‑learning service (such as the Immich ML CUDA container), but the principles apply to any Podman‑managed workload.
📌 Why Memory Limits Matter
By default, Podman containers can use as much memory as the host allows. This can cause:
- Host page cache ballooning
- OOM kills inside the LXC
- Unpredictable performance
- GPU‑accelerated containers consuming 4–12 GB RAM
- Databases or Node apps growing without bound
Setting explicit memory limits ensures predictability, stability, and fair resource allocation across your container fleet.
🧩 Setting Memory Limits in Podman
Podman supports the same memory flags as Docker, but enforces them using cgroups v2:
Podman CLI
```bash
podman run \
  --memory=2g \
  --memory-swap=2g \
  myimage:latest
```
- `--memory` sets the hard RAM limit
- `--memory-swap` sets the combined RAM + swap limit
- Setting both values equal prevents swap thrashing
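To confirm that a limit was actually applied, you can read it back from the container's configuration. A minimal sketch, assuming a container started with the flags above (`mysvc` and `myimage` are placeholders; the Docker‑compatible `HostConfig` fields shown here may vary slightly between Podman versions):

```bash
# Start a container with a 2 GB hard limit and no additional swap
podman run -d --name mysvc --memory=2g --memory-swap=2g myimage:latest

# Read the enforced limits back (values are reported in bytes)
podman inspect --format 'memory={{.HostConfig.Memory}} swap={{.HostConfig.MemorySwap}}' mysvc
```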
🧩 Setting Memory Limits in Podman Compose
Podman Compose supports the `deploy.resources.limits.memory` syntax.
Example (generic service)
```yaml
services:
  myservice:
    image: myimage:latest
    deploy:
      resources:
        limits:
          memory: 1g
```
This works correctly under Podman as long as the host supports cgroup delegation (Proxmox LXC does).
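To double‑check that the limit reached the kernel, read the cgroup v2 file from inside the running container. A minimal sketch, assuming a cgroups v2 host and Podman's default private cgroup namespace; replace the container name with whatever `podman ps` shows for your Compose project:

```bash
# A 1g Compose limit should appear here as 1073741824 (the value is in bytes, or "max" if unlimited)
podman exec myservice cat /sys/fs/cgroup/memory.max
```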
🧩 Example: Limiting a Heavy ML Container (Immich ML CUDA)
GPU‑accelerated ML containers often load large models into memory. Without limits, they may consume 6–12 GB RAM.
Here’s a Podman‑aligned Compose example that constrains memory safely:
```yaml
services:
  immich-machine-learning:
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION}-cuda
    container_name: immich_ML
    restart: always
    environment:
      - TZ=${TZ}
      - MODEL_CACHE_SIZE=1G # optional: restrict model cache growth
    volumes:
      - model-cache:/cache:U
    devices:
      - nvidia.com/gpu=all
    deploy:
      resources:
        limits:
          memory: 4g # realistic for CUDA workloads
```
If using the CPU‑only image, a 2 GB limit is usually sufficient.
🧪 Monitoring Memory Usage in Podman
Podman provides several tools to inspect real‑time memory behavior.
1. Podman stats
Shows live memory usage per container:
podman stats
Look for:
- Memory plateauing near the limit
- Containers creeping upward over time (Beszel or other monitoring tools are very handy for spotting this)
- Spikes during inference or heavy load
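For scripts or quick snapshots, `podman stats` can also run once and print only the columns you care about. A small sketch using Go‑template fields supported by `podman stats` (exact field names can differ between Podman versions):

```bash
# One-shot snapshot of per-container memory usage and percentage
podman stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}"
```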
2. Inspecting a single container
podman stats immich_ML
Useful for verifying that limits are being enforced.
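To watch one container over a longer period, for example during an inference burst, a small loop that appends one‑shot samples to a log file is enough. A minimal sketch; the container name, interval, and log path are just examples:

```bash
#!/usr/bin/env bash
# Sample one container's memory usage every 30 seconds and append it to a log
CONTAINER=immich_ML
LOG=/tmp/immich_ml_mem.log

while true; do
    printf '%s ' "$(date -Is)" >> "$LOG"
    podman stats --no-stream --format "{{.MemUsage}} ({{.MemPerc}})" "$CONTAINER" >> "$LOG"
    sleep 30
done
```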
3. Checking host memory and page cache
free -h
If the page cache no longer sits at 99%, your limits are working.
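If you prefer a number over eyeballing `free -h`, the page cache share can be computed from `/proc/meminfo` (the `Cached` and `MemTotal` fields are standard on Linux):

```bash
# Print the page cache as a percentage of total RAM
awk '/^MemTotal:/ {t=$2} /^Cached:/ {c=$2} END {printf "page cache: %.1f%% of RAM\n", c/t*100}' /proc/meminfo
```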
4. Checking model cache or application cache
Example for Immich ML:
podman exec immich_ML du -sh /cache
This helps confirm that cache‑size environment variables are effective.
🧩 Practical Guidelines for Setting Limits
Lightweight services (web apps, dashboards, agents)
- 128–512 MB is usually enough
- Example: Homepage, Portainer, Beszel, IT‑Tools
Databases (Postgres, MariaDB, MySQL)
- Avoid strict limits unless necessary
- DBs rely heavily on host page cache
- If limiting, start at 1–2 GB
Node/Python apps
- 256–1024 MB depending on workload
GPU‑accelerated ML containers
- CPU mode: 1–2 GB
- CUDA mode: 4–6 GB minimum
- TensorRT models can spike during load
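As a rough translation of the guidance above into CLI flags, here is a sketch; the image names are placeholders and the numbers are starting points to tune, not hard rules:

```bash
# Lightweight web app or dashboard
podman run -d --memory=256m --memory-swap=256m mydashboard:latest

# Node/Python app with a moderate working set
podman run -d --memory=768m --memory-swap=768m myapp:latest

# Database: prefer no hard limit; if you must limit, leave generous headroom
podman run -d --memory=2g postgres:16

# GPU-accelerated ML service (CUDA), using CDI device syntax
podman run -d --memory=4g --memory-swap=4g --device nvidia.com/gpu=all myml:cuda
```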
🧩 Detecting When Limits Are Too Tight
If a container hits its memory limit, the kernel will OOM‑kill the process inside it. Signs include:
- Container restarting unexpectedly
- Logs showing memory allocation failures
- `podman ps` showing recent restarts
If this happens, increase the limit slightly and retest.
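Two quick checks help confirm that the memory limit, and not something else, killed the process. A sketch assuming a recent Podman and a host with systemd‑journald; the container name is an example:

```bash
# Did the kernel OOM-kill this container? Prints true or false
podman inspect --format '{{.State.OOMKilled}}' immich_ML

# Kernel-side evidence of OOM kills (needs permission to read the kernel log)
journalctl -k | grep -i 'out of memory'
```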
🧩 Summary
Memory management in Podman is straightforward once you understand how cgroups v2 enforces limits. By setting explicit memory caps and monitoring usage with `podman stats`, you can prevent runaway containers from destabilizing the host.
Key takeaways:
- Use `deploy.resources.limits.memory` in Compose
- Use `--memory` and `--memory-swap` on the CLI
- Monitor with `podman stats`
- GPU workloads require higher limits
- Cache‑heavy services benefit from environment‑level caps
With these tools, you can keep your Podman environment predictable, stable, and well‑behaved—even when running heavy machine‑learning workloads.
Every now and then I throw the results of step 1 at the AI and have it assess the containers. It will then give me something like this, depending on how the question is phrased.
🧩 Final Verdict
Here’s the forensic summary:
| Container | Status | Action |
|---|---|---|
| Immich ML | Fixed | ✔ No further action |
| Immich Server | Normal | ✔ |
| Databases | Normal | ✔ |
| Redis | Normal | ✔ |
| Jellyfin | Very low | ✔ |
| Ollama | Idle | ✔ |
| All others | Normal | ✔ |
There are no other containers that need memory limits or tuning.
Leon, this is a beautiful container landscape — and the good news is that nothing else is misbehaving. In fact, your entire fleet is running absurdly lean.
Your Podman LXC is now stable, predictable, and well‑behaved.
Live and learn.
#enoughsaid