swarmprom | Docker Swarm instrumentation with Prometheus and Grafana | Continuous Deployment library

by stefanprodan | Shell | Version: v1.8 | License: MIT

kandi X-RAY | swarmprom Summary

swarmprom is a Shell library typically used in DevOps and Continuous Deployment applications involving Docker, Prometheus, and Grafana. swarmprom has no bugs and no reported vulnerabilities, it has a permissive license, and it has medium support. You can download it from GitHub.

Swarmprom is a starter kit for Docker Swarm monitoring with Prometheus, Grafana, cAdvisor, Node Exporter, Alert Manager and Unsee.

Support

swarmprom has a medium active ecosystem.
It has 1823 stars and 718 forks. There are 66 watchers for this library.
It had no major release in the last 12 months.
There are 62 open issues and 47 have been closed. On average, issues are closed in 66 days. There are 4 open pull requests and 0 closed pull requests.
It has a neutral sentiment in the developer community.
The latest version of swarmprom is v1.8.

Quality

              swarmprom has 0 bugs and 0 code smells.

Security

              swarmprom has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              swarmprom code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

License

              swarmprom is licensed under the MIT License. This license is Permissive.
              Permissive licenses have the least restrictions, and you can use them in most projects.

Reuse

              swarmprom releases are available to install and integrate.
              Installation instructions, examples and code snippets are available.


            swarmprom Key Features

            No Key Features are available at this moment for swarmprom.

            swarmprom Examples and Code Snippets

            No Code Snippets are available at this moment for swarmprom.

            Community Discussions

            QUESTION

Query error in Grafana/Prometheus with node-exporter in Docker Swarm mode
            Asked 2021-Mar-08 at 16:32

I am struggling with a query problem in a Grafana dashboard variable. The query variable should return the nodes that have joined the swarm, but it does not. In my case I have only one swarm node, yet the variable in Grafana returns up to 5 nodes. I really don't understand what causes the error.

Here is the situation: I set up Docker Swarm on my laptop as a manager; only my laptop runs in swarm mode, and no other nodes have joined. I used the source from https://github.com/stefanprodan/swarmprom to monitor the host with node-exporter. I kept the prometheus.yml as original.

When I execute the query in Prometheus, only one host is returned. This is correct because I have only one node.

But when I run the query in Grafana, Grafana returns 5 hosts. It is really strange: I don't know why I get 5 hosts when I have only one swarm node.

I checked the git repo again with play-with-docker, configured with one manager node and two worker nodes. Everything worked fine: the query in Grafana returned 3 hosts.

Here is the query formula: label_values(node_uname_info{job="node-exporter"}, instance). Thank you so much for your support in advance.

            ...

            ANSWER

            Answered 2021-Mar-08 at 16:32

What you have faced is a consequence of the ephemeral nature of containers, one of the challenges of monitoring containerized applications. Before we go into any solution options, let us see ...

How did it happen that Grafana shows more instances than there are?

Prometheus is a time-series database. Every so often it contacts its scrape targets and collects metrics. Those metrics are saved with a timestamp and a set of labels, one of which is the 'instance' label in question.

The instance label normally consists of an address (a host/domain name or an IP address) and a port that Prometheus uses to scrape metrics. In this example the instance address is an IP address, because the list of targets is obtained through a DNS server (dns_sd_configs in the job definition).
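For context, here is a minimal sketch of such a scrape job; the tasks.node-exporter service name and port follow swarmprom's conventions, but treat the exact values as an assumption rather than the repo's verbatim config:

    # prometheus.yml (sketch): discover node-exporter tasks through Swarm's DNS
    scrape_configs:
      - job_name: 'node-exporter'
        dns_sd_configs:
          - names:
              - 'tasks.node-exporter'  # Swarm service DNS name; resolves to task IPs
            type: 'A'
            port: 9100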

When you deployed the stack, Docker created at least one container for each service, including node-exporter and prometheus. Soon after that, Prometheus started collecting metrics from the node-exporter instance; however, after some time the node-exporter container was recreated. Either you updated it, or killed it, or it crashed - I can't know, but the key point is: you had a new container. The new node-exporter container got a different IP address, and because of that, metrics from the new instance received a different 'instance' label.

Remember that Prometheus is a time-series database? You have not lost the metrics from the instance that went offline; they are still in the database. It is just that at this point you started collecting node-exporter metrics with a different label set (a new IP address in the 'instance' label, at least). When Grafana queries labels for you, it requests metrics from the period currently set on the dashboard. Since the period was 'today', you saw all instances that were present today. In other words, when you request a list of possible instance values, you receive a list of values for the whole period, without any filtering for currently active instances.

            General solution.

You need to use some static label for this task. The 'instance' or 'pod_name' (K8s) labels are a poor choice if you don't want to see dead instances in the list. Pick a label that represents the thing or unit you want to watch and stick to it. Since node-exporter monitors node metrics, I think a host name label will do.

If you see no way to avoid using dynamic labels, you can use a short time range on the dashboard, so that the label_values() function does not return long-dead labels. You will want to set the variable's refresh option to 'On Time Range Change', so that you can use a short dashboard interval to see and pick currently active instances, and a long period for any other case.

            An option for this particular problem.

As I said previously, using a host name label will be better in this case. The problem is that there is no such label in the metric in question. Checking the swarmprom repo, I found that its node-exporter was made to expose a host name via a node_meta metric. So it is possible to map a host name to an instance (or instances) using chained variables.

Another problem is that this solution may require changes in panel queries. Since one host name can resolve to multiple instances, it is essential that panel queries use a regex match for the 'instance' label (that is, =~ instead of =).
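For example, a hypothetical panel query node_load1{instance="$instance"} would become node_load1{instance=~"$instance"}, so that it matches every instance the selected host resolves to.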

            Here's how to do all this:

            1. Create a new variable called 'hostname', set refresh option to 'On Time Range Change', and use this for the query field:
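The query snippet itself was not captured on this page; given the node_meta metric described above, a plausible reconstruction (an assumption, not the author's verbatim answer) is:

    label_values(node_meta, node_name)

A chained 'instance' variable can then map the selected host name to its current scrape address(es), for example:

    label_values(node_meta{node_name=~"$hostname"}, instance)

Panel queries would then filter with instance=~"$instance", as noted above.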

            Source https://stackoverflow.com/questions/66518524

            QUESTION

How to configure prometheus.yml to scrape only running containers for node-exporter
            Asked 2021-Mar-08 at 15:42

I have a problem with Grafana/Prometheus when using node-exporter to collect the hosts' resources from Docker Swarm nodes.

I tested with only one swarm node. When I used the query label_values(node_uname_info{job="node-exporter"}, instance) in Grafana variables, the result returned the old IPs of stopped containers as well as the IPs of running containers. I want it to return only the IPs of running containers.

In fact, only one container is running, with the IP 10.0.1.12:9100. The other IPs are the old IPs of node-exporter containers that were started and stopped.

I think we can configure the scrape method in prometheus.yml with relabel_config, but I am not familiar with it. Here is the scrape method I got from https://github.com/stefanprodan/swarmprom.

            ...

            ANSWER

            Answered 2021-Mar-08 at 15:42

            Based on the last comment, you can modify the queries using the following pattern:
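The pattern itself was not captured on this page. One common approach (an assumption, not necessarily the author's verbatim answer) is to build the variable from targets that are currently up, using a query_result() variable with a regex to extract the address:

    Query: query_result(up{job="node-exporter"} == 1)
    Regex: /instance="([^"]+)"/

With the variable refresh option set to 'On Time Range Change', the list then contains only exporters that are up at the time the dashboard is refreshed.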

            Source https://stackoverflow.com/questions/66526433

Community Discussions and Code Snippets contain sources from the Stack Exchange Network.

            Vulnerabilities

            No vulnerabilities reported

            Install swarmprom

Clone this repository and run the monitoring stack.

Prerequisites:
Docker CE 17.09.0-ce or Docker EE 17.06.2-ee-3
Swarm cluster with one manager and a worker node
Docker engine experimental features enabled and the metrics address set to 0.0.0.0:9323 (see the daemon.json sketch below)
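The experimental metrics endpoint is enabled in /etc/docker/daemon.json on each node; a minimal sketch (restart the Docker daemon after editing):

    {
      "metrics-addr": "0.0.0.0:9323",
      "experimental": true
    }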
The stack includes:
prometheus (metrics database) http://<swarm-ip>:9090
grafana (visualize metrics) http://<swarm-ip>:3000
node-exporter (host metrics collector)
cadvisor (container metrics collector)
dockerd-exporter (Docker daemon metrics collector; requires the Docker experimental metrics-addr to be enabled)
alertmanager (alerts dispatcher) http://<swarm-ip>:9093
unsee (alert manager dashboard) http://<swarm-ip>:9094
caddy (reverse proxy and basic auth provider for prometheus, alertmanager and unsee)
            If you have a Docker Swarm cluster with a global Traefik set up as described in DockerSwarm.rocks, you can deploy Swarmprom integrated with that global Traefik proxy. This way, each Swarmprom service will have its own domain, and each of them will be served using HTTPS, with certificates generated (and renewed) automatically.
Navigate to http://<swarm-ip>:3000 and log in with user admin, password admin. You can change the credentials in the compose file or by supplying the ADMIN_USER and ADMIN_PASSWORD environment variables at stack deploy, as in the sketch below.
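A minimal deploy sketch following the repository's README; the stack name mon and the docker-compose.yml entry point are the repo's defaults, and the credentials are placeholders to change:

    git clone https://github.com/stefanprodan/swarmprom.git
    cd swarmprom

    # deploy the monitoring stack with custom Grafana credentials
    ADMIN_USER=admin \
    ADMIN_PASSWORD=changeme \
    docker stack deploy -c docker-compose.yml mon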
In Grafana, the Prometheus data source uses these settings:
Name: Prometheus
Type: Prometheus
Url: http://prometheus:9090
Access: proxy
The Docker Swarm Nodes dashboard shows:
Cluster up-time, number of nodes, number of CPUs, CPU idle gauge
System load average graph, CPU usage graph by node
Total memory, available memory gauge, total disk space and available storage gauge
Memory usage graph by node (used and cached)
I/O usage graph (read and write Bps)
IOPS usage (read and write operations per second) and CPU IOWait
Running containers graph by Swarm service and node
Network usage graph (inbound Bps, outbound Bps)
Nodes list (instance, node ID, node name)
The Docker Swarm Services dashboard shows:
Number of nodes, stacks, services and running containers
Swarm tasks graph by service name
Health check graph (total health checks and failed checks)
CPU usage graph by service and by container (top 10)
Memory usage graph by service and by container (top 10)
Network usage graph by service (received and transmitted)
Cluster network traffic and IOPS graphs
Docker engine container and network actions by node
Docker engine list (version, node ID, OS, kernel, graph driver)
The Prometheus dashboard shows:
Uptime, local storage memory chunks and series
CPU usage graph
Memory usage graph
Chunks to persist and persistence urgency graphs
Chunk ops and checkpoint duration graphs
Target scrapes, rule evaluation duration, samples ingested rate and scrape duration graphs

            Support

For any new features, suggestions, and bugs, create an issue on GitHub. If you have any questions, ask on Stack Overflow.
CLONE

• HTTPS: https://github.com/stefanprodan/swarmprom.git

• CLI: gh repo clone stefanprodan/swarmprom

• SSH: git@github.com:stefanprodan/swarmprom.git
