dcgm-exporter | NVIDIA GPU metrics exporter for Prometheus leveraging DCGM | GPU library

by NVIDIA | Go | Version: 3.1.8-3.1.5 | License: Apache-2.0

kandi X-RAY | dcgm-exporter Summary

dcgm-exporter is a Go library typically used in hardware, GPU, and Prometheus applications. It has no reported bugs or vulnerabilities, a permissive license, and low support. You can download it from GitHub.

This repository contains the DCGM-Exporter project, which exposes GPU metrics to Prometheus using NVIDIA DCGM.

            Support

              dcgm-exporter has a low active ecosystem.
              It has 341 stars and 79 forks. There are 14 watchers for this library.
              There was 1 major release in the last 12 months.
              There are 60 open issues and 38 closed issues. On average, issues are closed in 12 days. There are 3 open pull requests and 0 closed pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of dcgm-exporter is 3.1.8-3.1.5.

            Quality

              dcgm-exporter has 0 bugs and 0 code smells.

            Security

              dcgm-exporter has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              dcgm-exporter code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            License

              dcgm-exporter is licensed under the Apache-2.0 License. This license is Permissive.
              Permissive licenses have the least restrictions, and you can use them in most projects.

            Reuse

              dcgm-exporter releases are available to install and integrate.
              Installation instructions, examples and code snippets are available.

            Top functions reviewed by kandi - BETA

            kandi has reviewed dcgm-exporter and identified the functions below as its top functions. The list is intended to give you quick insight into the functionality dcgm-exporter implements and to help you decide whether it suits your requirements.
            • main is the main entrypoint.
            • Run executes the dcgm command.
            • InitializeSystemInfo initializes the system information.
            • parseDeviceOptions parses DOSA device options.
            • parseDeviceOptionsToken parses device options.
            • extractCounters extracts the counters from a list of records.
            • ToString converts a FieldValue to a string.
            • ToDeviceToPod converts a ListResourcesResponse to a PodInfo map.
            • ToMetric converts a field value to a metrics.Metric.
            • SetupDcgmFieldsWatch creates a set of watchers for the given devices.

            Community Discussions

            QUESTION

            On GKE, dcgm-exporter pod fails to run if the nvidia.com/gpu resource is not allocated
            Asked 2020-Nov-23 at 03:40

            I am trying to query GPU usage metrics of GKE pods.

            Here is what I did to test this:

            1. Created a GKE cluster with two node pools: one has two CPU-only nodes and the other has one node with an NVIDIA Tesla T4 GPU. All nodes are running Container-Optimized OS.
            2. As described in https://cloud.google.com/kubernetes-engine/docs/how-to/gpus#installing_drivers, I ran kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded.yaml.
            3. kubectl create -f dcgm-exporter.yaml
            ...

            ANSWER

            Answered 2020-Nov-23 at 03:40

            It worked with these changes:

            1. Set privileged: true in securityContext.
            2. Add the volume mount "nvidia-install-dir-host".

            Source https://stackoverflow.com/questions/64940013

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install dcgm-exporter

            To gather metrics on a GPU node, start the dcgm-exporter container.
            Note: Consider using the NVIDIA GPU Operator rather than DCGM-Exporter directly. Ensure you have already set up your cluster with the default runtime as NVIDIA.
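
Once the exporter is running, Prometheus scrapes it over HTTP and receives metrics in the Prometheus text exposition format. The following is a minimal Go sketch of how such output could be read; it is not code from the dcgm-exporter repository. The `Sample` type, the `parseMetrics` helper, and the sample text (including its label values) are hypothetical and exist only for illustration. The parser is deliberately simplified: it assumes label values contain no commas, braces, or escaped quotes, and it ignores optional timestamps.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// sampleOutput is invented illustrative data, not captured exporter output.
// The label values are placeholders.
const sampleOutput = `# HELP DCGM_FI_DEV_GPU_TEMP GPU temperature (in C).
# TYPE DCGM_FI_DEV_GPU_TEMP gauge
DCGM_FI_DEV_GPU_TEMP{gpu="0",UUID="GPU-example"} 45
DCGM_FI_DEV_GPU_UTIL{gpu="0",UUID="GPU-example"} 12
`

// Sample is one parsed metric line (hypothetical type for this sketch).
type Sample struct {
	Name   string
	Labels map[string]string
	Value  float64
}

// parseMetrics parses a simplified subset of the Prometheus text format.
// Comment lines (starting with '#') and blank lines are skipped.
func parseMetrics(text string) ([]Sample, error) {
	var samples []Sample
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		s := Sample{Labels: map[string]string{}}
		var rest string
		if open := strings.Index(line, "{"); open >= 0 {
			end := strings.LastIndex(line, "}")
			if end < open {
				return nil, fmt.Errorf("unbalanced braces in %q", line)
			}
			s.Name = line[:open]
			// Split the label set on commas, then each pair on the first '='.
			for _, pair := range strings.Split(line[open+1:end], ",") {
				k, v, ok := strings.Cut(pair, "=")
				if !ok {
					continue
				}
				s.Labels[strings.TrimSpace(k)] = strings.Trim(strings.TrimSpace(v), `"`)
			}
			rest = line[end+1:]
		} else {
			// No label set: the metric name is the first whitespace-separated field.
			s.Name = strings.Fields(line)[0]
			rest = line[len(s.Name):]
		}
		fields := strings.Fields(rest)
		if len(fields) == 0 {
			return nil, fmt.Errorf("missing value in %q", line)
		}
		v, err := strconv.ParseFloat(fields[0], 64)
		if err != nil {
			return nil, fmt.Errorf("bad value in %q: %w", line, err)
		}
		s.Value = v
		samples = append(samples, s)
	}
	return samples, sc.Err()
}

func main() {
	samples, err := parseMetrics(sampleOutput)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, s := range samples {
		fmt.Printf("%s gpu=%s value=%g\n", s.Name, s.Labels["gpu"], s.Value)
	}
}
```

In practice the text would come from an HTTP GET against the exporter's metrics endpoint (assumed here to be http://localhost:9400/metrics; check your deployment for the actual address), and a production client would more likely rely on an established Prometheus parsing library than on a hand-written parser like this one.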

            Support

            Official documentation for DCGM-Exporter can be found on docs.nvidia.com.

            Clone

          • HTTPS: https://github.com/NVIDIA/dcgm-exporter.git
          • CLI: gh repo clone NVIDIA/dcgm-exporter
          • SSH: git@github.com:NVIDIA/dcgm-exporter.git

            Consider Popular GPU Libraries

          • taichi by taichi-dev
          • gpu.js by gpujs
          • hashcat by hashcat
          • cupy by cupy
          • EASTL by electronicarts

            Try Top Libraries by NVIDIA

          • DeepLearningExamples by NVIDIA (Jupyter Notebook)
          • FastPhotoStyle by NVIDIA (Python)
          • vid2vid by NVIDIA (Python)
          • TensorRT by NVIDIA (C++)