k8s-device-plugin | device plugin to enable registration | Continuous Deployment library
kandi X-RAY | k8s-device-plugin Summary
This is a Kubernetes device plugin implementation that enables the registration of AMD GPUs in a container cluster for compute workloads. With the appropriate hardware and this plugin deployed in your Kubernetes cluster, you will be able to run jobs that require AMD GPUs. See the RadeonOpenCompute (ROCm) project for more information.
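As a rough illustration of what "a job that requires an AMD GPU" looks like once the plugin is deployed, the sketch below builds a minimal Pod manifest that requests the amd.com/gpu extended resource. The helper name amd_gpu_pod, the pod name, and the image reference are hypothetical placeholders, not part of the plugin.

```python
import json


def amd_gpu_pod(name: str, image: str, gpus: int = 1) -> dict:
    """Build a minimal Pod manifest that asks the scheduler for AMD GPUs.

    `name` and `image` are caller-supplied placeholders; only the
    `amd.com/gpu` resource key comes from the device plugin itself.
    """
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Extended resources are requested under limits.
                    "resources": {"limits": {"amd.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical image reference, for illustration only.
    manifest = amd_gpu_pod("rocm-test", "example.invalid/rocm-workload:latest")
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and passed to kubectl apply -f, since Kubernetes accepts manifests in JSON as well as YAML.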
Top functions reviewed by kandi - BETA
- Main entry point.
- GetFirmwareVersions gets the firmware version information for the given card.
- Count GPU devices.
- openAMDGPU opens an AMDGPU device.
- GetAMDGPUs returns a map of device IDs.
- FamilyIDtoString converts a family ID to a string.
- parseDebugFSFirmwareInfo parses the firmware details from the debugfs file.
- ParseTopologyProperties parses a topology properties file using a regular expression.
- GetCardFamilyName returns the family name of the card.
- AMDGPU returns true if the card is supported.
Community Discussions
Trending Discussions on k8s-device-plugin
QUESTION
I want to install Kubernetes and Docker 19.03 with NVIDIA GPU support. Before Docker 19.03, Docker's default runtime had to be set to nvidia. That method is no longer supported, and the recommended approach is to pass "--gpus all" on the command line. Is there any way to make "--gpus all" the default setting for Docker? Changing the command Kubernetes uses to invoke Docker would also be acceptable, but I have not found a way to do that. BTW, I don't want to use NVIDIA's k8s-device-plugin because I want to control the GPUs myself. I just need all GPUs to be exposed to the pods.
ANSWER
Answered 2020-Feb-27 at 18:23
According to NVIDIA's documentation, you need to install nvidia-docker 2.0 even though it is no longer the recommended method. After installing it, you can set the NVIDIA runtime as the default. Kubernetes does not currently support the new "--gpus all" option.
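Setting the NVIDIA runtime as the default is normally done in /etc/docker/daemon.json. The sketch below shows one way to merge that setting into an existing configuration without discarding other keys. The function name with_nvidia_default_runtime is hypothetical, and the runtime path /usr/bin/nvidia-container-runtime is an assumption about where nvidia-docker 2.0 installed the binary; check your own system before using it.

```python
import json

# Assumed install location of the NVIDIA runtime binary; verify locally.
NVIDIA_RUNTIME_PATH = "/usr/bin/nvidia-container-runtime"


def with_nvidia_default_runtime(config: dict) -> dict:
    """Return a copy of a daemon.json config with nvidia as default runtime.

    Existing keys and any already-registered runtimes are preserved.
    The input dictionary is not modified.
    """
    updated = dict(config)
    runtimes = dict(updated.get("runtimes", {}))
    # Register the runtime only if it is not already defined.
    runtimes.setdefault(
        "nvidia", {"path": NVIDIA_RUNTIME_PATH, "runtimeArgs": []}
    )
    updated["runtimes"] = runtimes
    updated["default-runtime"] = "nvidia"
    return updated


if __name__ == "__main__":
    # Example input standing in for the current file contents.
    current = {"log-driver": "json-file"}
    print(json.dumps(with_nvidia_default_runtime(current), indent=2))
```

The result would be written back to /etc/docker/daemon.json, after which the Docker daemon has to be restarted for the change to take effect.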
QUESTION
I am trying to set up a small Kubernetes cluster on my Ubuntu 18.04 LTS server. Every step is done, but checking the GPU status fails. The container keeps reporting errors:
1. Issue Description
I followed the steps in the Quick Start, but when I run the test case, it reports an error.
2. Steps to reproduce the issue
Run the shell command:
docker run --security-opt=no-new-privileges --cap-drop=ALL --network=none -it -v /var/lib/kubelet/device-plugins:/var/lib/kubelet/device-plugins nvidia/k8s-device-plugin:1.9
Check the errors:
2020/02/09 00:20:15 Starting to serve on /var/lib/kubelet/device-plugins/nvidia.sock
2020/02/09 00:20:15 Could not register device plugin: rpc error: code = Unimplemented desc = unknown service deviceplugin.Registration
2020/02/09 00:20:15 Could not contact Kubelet, retrying. Did you enable the device plugin feature gate?
2020/02/09 00:20:15 You can check the prerequisites at: https://github.com/NVIDIA/k8s-device-plugin#prerequisites
2020/02/09 00:20:15 You can learn how to set the runtime at: https://github.com/NVIDIA/k8s-device-plugin#quick-start
3. Environment Information
- outputs of nvidia-docker run --rm dlws/cuda nvidia-smi
NVIDIA-SMI 440.48.02 Driver Version: 440.48.02 CUDA Version: 10.2
- contents of /etc/docker/daemon.json
contents:
ANSWER
Answered 2020-Feb-24 at 05:25
I finally found the answer. I hope this post helps others who encounter the same issue:
For Kubernetes 1.15, use k8s-device-plugin:1.11 instead. Version 1.9 is not able to communicate with the kubelet.
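The telltale symptom in the logs above is the gRPC "Unimplemented ... unknown service" error during registration. A likely explanation, which the thread itself does not state, is that the plugin image and the kubelet speak different versions of the device plugin API. The sketch below scans plugin log text for that pattern; the function name find_unimplemented_registration is hypothetical.

```python
import re
from typing import Optional

# Matches the registration failure shown in the logs above and captures
# the gRPC service name that the kubelet did not recognize.
UNIMPLEMENTED_REGISTRATION = re.compile(
    r"code = Unimplemented desc = unknown service (?P<service>\S+\.Registration)"
)


def find_unimplemented_registration(log_text: str) -> Optional[str]:
    """Return the rejected gRPC service name, or None if not present."""
    match = UNIMPLEMENTED_REGISTRATION.search(log_text)
    return match.group("service") if match else None


if __name__ == "__main__":
    sample = (
        "2020/02/09 00:20:15 Could not register device plugin: rpc error: "
        "code = Unimplemented desc = unknown service deviceplugin.Registration"
    )
    service = find_unimplemented_registration(sample)
    if service is not None:
        print(
            f"kubelet rejected service {service}; "
            "try a plugin image tag that matches the cluster version"
        )
```

If the function returns a service name, switching to a plugin tag that matches the Kubernetes release, as the answer suggests, is the first thing to try.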
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported