# Migration Notice

**NOTE**: The source code for the `nvidia-container-runtime` binary has been moved to the [`nvidia-container-toolkit`](https://gitlab.com/nvidia/container-toolkit/container-toolkit/-/tree/main/cmd/nvidia-container-runtime) repository. The binary is now included in the `nvidia-container-toolkit` package, and the `nvidia-container-runtime` package defined in this repository is a meta-package that allows workflows that referred to this package directly to continue to function without modification.

# nvidia-container-runtime

[![GitHub license](https://img.shields.io/badge/License-Apache%202.0-blue.svg?style=flat-square)](https://raw.githubusercontent.com/NVIDIA/nvidia-container-runtime/main/LICENSE)
[![Package repository](https://img.shields.io/badge/packages-repository-b956e8.svg?style=flat-square)](https://nvidia.github.io/nvidia-container-runtime)

A modified version of [runc](https://github.com/opencontainers/runc) that adds a custom [pre-start hook](https://github.com/opencontainers/runtime-spec/blob/main/config.md#prestart) to all containers.
If the environment variable `NVIDIA_VISIBLE_DEVICES` is set in the OCI spec, the hook configures GPU access for the container by leveraging `nvidia-container-cli` from the [libnvidia-container](https://github.com/NVIDIA/libnvidia-container) project.

## Usage example

```sh
# Setup a rootfs based on Ubuntu 16.04
cd $(mktemp -d) && mkdir rootfs
curl -sS http://cdimage.ubuntu.com/ubuntu-base/releases/16.04/release/ubuntu-base-16.04-core-amd64.tar.gz | tar --exclude 'dev/*' -C rootfs -xz

# Create an OCI runtime spec
nvidia-container-runtime spec
sed -i 's;"sh";"nvidia-smi";' config.json
sed -i 's;\("TERM=xterm"\);\1, "NVIDIA_VISIBLE_DEVICES=0";' config.json

# Run the container
sudo nvidia-container-runtime run nvidia_smi
```

## Installation

#### Ubuntu distributions

1. Install the repository for your distribution by following the instructions [here](http://nvidia.github.io/nvidia-container-runtime/).

2. Install the `nvidia-container-runtime` package:
```
sudo apt-get install nvidia-container-runtime
```

#### CentOS distributions

1. Install the repository for your distribution by following the instructions [here](http://nvidia.github.io/nvidia-container-runtime/).

2. Install the `nvidia-container-runtime` package:
```
sudo yum install nvidia-container-runtime
```

## Docker Engine setup

**Do not follow this section if you installed the `nvidia-docker2` package, it already registers the runtime.**

To register the `nvidia` runtime, use the method below that is best suited to your environment.
You might need to merge the new argument with your existing configuration.

#### Systemd drop-in file
```bash
sudo mkdir -p /etc/systemd/system/docker.service.d
sudo tee /etc/systemd/system/docker.service.d/override.conf <<EOF
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --host=fd:// --add-runtime=nvidia=/usr/bin/nvidia-container-runtime
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
```

#### Daemon configuration file
```bash
sudo tee /etc/docker/daemon.json <<EOF
{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF
sudo pkill -SIGHUP dockerd
```

#### Command line
```bash
sudo dockerd --add-runtime=nvidia=/usr/bin/nvidia-container-runtime [...]
```

## Environment variables (OCI spec)

Each environment variable maps to a command-line argument for `nvidia-container-cli` from [libnvidia-container](https://github.com/NVIDIA/libnvidia-container).
These variables are already set in our [official CUDA images](https://hub.docker.com/r/nvidia/cuda/).

### `NVIDIA_VISIBLE_DEVICES`
This variable controls which GPUs will be made accessible inside the container.

#### Possible values
* `0,1,2`, `GPU-fef8089b` …: a comma-separated list of GPU UUID(s) or index(es).
* `all`: all GPUs will be accessible, this is the default value in our container images.
* `none`: no GPU will be accessible, but driver capabilities will be enabled.
* `void` or *empty* or *unset*: `nvidia-container-runtime` will have the same behavior as `runc`.

**Note**: When running on a MIG capable device, the following values will also be available:
* `0:0,0:1`: a comma-separated list of MIG Device(s). Only effective on devices with MIG mode enabled.

Where the MIG device indices have the form `<GPU Device Index>:<MIG Device Index>` as seen in the example output:
```
$ nvidia-smi -L
GPU 0: Graphics Device (UUID: GPU-b8ea3855-276c-c9cb-b366-c6fa655957c5)
  MIG Device 0: (UUID: MIG-GPU-b8ea3855-276c-c9cb-b366-c6fa655957c5/1/0)
  MIG Device 1: (UUID: MIG-GPU-b8ea3855-276c-c9cb-b366-c6fa655957c5/1/1)
  MIG Device 2: (UUID: MIG-GPU-b8ea3855-276c-c9cb-b366-c6fa655957c5/11/0)
```

### `NVIDIA_MIG_CONFIG_DEVICES`
This variable controls which of the visible GPUs can have their MIG configuration managed from within the container.
This includes enabling and disabling MIG mode, creating and destroying GPU Instances and Compute Instances, etc.

#### Possible values
* `all`: Allow all MIG-capable GPUs in the visible device list to have their MIG configurations managed.

**Note**:
* This feature is only available on MIG capable devices (e.g. the A100).
* To use this feature, the container must be started with `CAP_SYS_ADMIN` privileges.
* When not running as `root`, the container user must have read access to the `/proc/driver/nvidia/capabilities/mig/config` file on the host.
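For illustration only (this example is not part of the original document): assuming the `nvidia` runtime has been registered with Docker as shown above, a container that is allowed to manage MIG configuration on its visible GPUs might be started roughly as follows. The image name and GPU index are placeholders.

```sh
# Hedged sketch: start a container that may manage MIG configuration on GPU 0.
# CAP_SYS_ADMIN is required per the note above; ubuntu:20.04 is a placeholder image.
docker run --rm --runtime=nvidia \
    --cap-add=SYS_ADMIN \
    -e NVIDIA_VISIBLE_DEVICES=0 \
    -e NVIDIA_MIG_CONFIG_DEVICES=all \
    ubuntu:20.04 nvidia-smi -L
```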
### `NVIDIA_MIG_MONITOR_DEVICES`
This variable controls which of the visible GPUs can have aggregate information about all of their MIG devices monitored from within the container.
This includes inspecting the aggregate memory usage, listing the aggregate running processes, etc.

#### Possible values
* `all`: Allow all MIG-capable GPUs in the visible device list to have their MIG devices monitored.

**Note**:
* This feature is only available on MIG capable devices (e.g. the A100).
* To use this feature, the container must be started with `CAP_SYS_ADMIN` privileges.
* When not running as `root`, the container user must have read access to the `/proc/driver/nvidia/capabilities/mig/monitor` file on the host.

### `NVIDIA_DRIVER_CAPABILITIES`
This option controls which driver libraries/binaries will be mounted inside the container.

#### Possible values
* `compute,video`, `graphics,utility` …: a comma-separated list of driver features the container needs.
* `all`: enable all available driver capabilities.
* *empty* or *unset*: use the default driver capabilities: `utility,compute`.

#### Supported driver capabilities
* `compute`: required for CUDA and OpenCL applications.
* `compat32`: required for running 32-bit applications.
* `graphics`: required for running OpenGL and Vulkan applications.
* `utility`: required for using `nvidia-smi` and NVML.
* `video`: required for using the Video Codec SDK.
* `display`: required for leveraging X11 display.

### `NVIDIA_REQUIRE_*`
A logical expression to define constraints on the configurations supported by the container.

#### Supported constraints
* `cuda`: constraint on the CUDA driver version.
* `driver`: constraint on the driver version.
* `arch`: constraint on the compute architectures of the selected GPUs.
* `brand`: constraint on the brand of the selected GPUs (e.g. GeForce, Tesla, GRID).

#### Expressions
Multiple constraints can be expressed in a single environment variable: space-separated constraints are ORed, comma-separated constraints are ANDed.
Multiple environment variables of the form `NVIDIA_REQUIRE_*` are ANDed together.
A combined usage example appears at the end of this document.

### `NVIDIA_DISABLE_REQUIRE`
Single switch to disable all the constraints of the form `NVIDIA_REQUIRE_*`.

### `NVIDIA_REQUIRE_CUDA`
The version of the CUDA toolkit used by the container. It is an instance of the generic `NVIDIA_REQUIRE_*` case and it is set by the official CUDA images.
If the version of the NVIDIA driver is insufficient to run this version of CUDA, the container will not be started.

#### Possible values
* `cuda>=7.5`, `cuda>=8.0`, `cuda>=9.0` …: any valid CUDA version in the form `major.minor`.

### `CUDA_VERSION`
Similar to `NVIDIA_REQUIRE_CUDA`, for legacy CUDA images.
In addition, if `NVIDIA_REQUIRE_CUDA` is not set, `NVIDIA_VISIBLE_DEVICES` and `NVIDIA_DRIVER_CAPABILITIES` will default to `all`.

## Issues and Contributing

[Check out the Contributing document!](CONTRIBUTING.md)

* Please let us know by [filing a new issue](https://github.com/NVIDIA/nvidia-container-toolkit/issues/new)
* You can contribute by creating a [merge request](https://gitlab.com/nvidia/container-toolkit/container-runtime/-/merge_requests/new) to our public GitLab repository
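As a closing illustration (not part of the original document), here is a sketch that combines several of the variables described above. It assumes the `nvidia` runtime has been registered with Docker as shown earlier; the image tag and version numbers are placeholders, not values taken from this document.

```sh
# Hedged sketch: expose GPU 0 with the compute and utility capabilities and
# require both a CUDA 11.0-capable driver and a driver version of at least 450
# (the comma ANDs the two constraints, per the Expressions section above).
# ubuntu:20.04 and the version numbers are illustrative placeholders.
docker run --rm --runtime=nvidia \
    -e NVIDIA_VISIBLE_DEVICES=0 \
    -e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
    -e "NVIDIA_REQUIRE_CUDA=cuda>=11.0,driver>=450" \
    ubuntu:20.04 nvidia-smi
```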