health | extensible health check library for Go applications | Monitoring library

by dimiro1 | Go | Version: Current | License: MIT

kandi X-RAY | health Summary

health is a Go library typically used in performance management and monitoring applications. It has no reported bugs or vulnerabilities, a permissive license, and low support. You can download it from GitHub.

An easy to use, extensible health check library for Go applications.

Support

health has a low active ecosystem.

It has 420 stars and 42 forks. There are 6 watchers for this library.

It had no major release in the last 6 months.

There are 0 open issues and 12 have been closed. On average issues are closed in 66 days. There are 3 open pull requests and 0 closed requests.

It has a neutral sentiment in the developer community.

The latest version of health is current.

Quality

health has 0 bugs and 0 code smells.

Security

health has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

health code analysis shows 0 unresolved vulnerabilities.

There are 0 security hotspots that need review.

License

health is licensed under the MIT License. This license is Permissive.

Permissive licenses have the least restrictions, and you can use them in most projects.

Reuse

health releases are not available. You will need to build from source code and install.

Installation instructions, examples and code snippets are available.

It has 915 lines of code, 87 functions and 17 files.

It has medium code complexity. Code complexity directly impacts maintainability of the code.

health Key Features

No Key Features are available at this moment for health.

health Examples and Code Snippets

Check worker health .

python

Lines of Code : 40

License : Non-SPDX (Apache License 2.0)

def _check_health(self):
    while True:
      if self._check_health_thread_should_stop.is_set():
        return
      for job in self._cluster_spec.jobs:
        for task_id in range(self._cluster_spec.num_tasks(job)):
          peer = "/job:{}/repl

Start the check health thread .

python

Lines of Code : 35

License : Non-SPDX (Apache License 2.0)

def _start_check_health_thread(self):
    # Use a dummy all-reduce as a barrier to wait for all workers to be up,
    # otherwise the check health may fail immediately.

    # Use array_ops.identity to create the dummy tensor so that we have a new

Checks the health of the given task .

python

Lines of Code : 21

License : Non-SPDX (Apache License 2.0)

def check_collective_ops_peer_health(self, task, timeout_in_ms):
    """Check collective peer health.

    This probes each task to see if they're still alive. Note that restarted
    tasks are considered a different one, and they're considered not h

Community Discussions

Trending Discussions on health

Enable use of images from the local library on Kubernetes

Return multiple possible matches when fuzzy joining two dataframes or vectors in R if they share a word in common

How to create a custom health check for Prisma with @nestjs/terminus?

Kubernetes NodePort is not available on all nodes - Oracle Cloud Infrastructure (OCI)

Django - Full test suite failing when adding a TestCase, but full test suite passes when it is commented out. All TestCase pass when run individually

AWX all jobs stop processing and hang indefinitely -- why

How to configure GKE Autopilot w/Envoy & gRPC-Web

How do you use a composite action that exists in a private repository?

ElasticSearch Accessing Nested Documents in Script - Null Pointer Exception

Spring boot 2.6 actuator info

QUESTION

Enable use of images from the local library on Kubernetes

Asked 2022-Mar-20 at 13:23

I'm following a tutorial https://docs.openfaas.com/tutorials/first-python-function/,

currently, I have the right image

...

ANSWER

Answered 2022-Mar-16 at 08:10

If your image has a latest tag, the Pod's imagePullPolicy will automatically be set to Always, so each time the Pod is created Kubernetes tries to pull the newest image.

Try not tagging the image as latest, or manually set the Pod's imagePullPolicy to Never. If you're using a static manifest to create a Pod, the setting will be like the following:
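The manifest from the original answer is not reproduced on this page. As a rough illustration only, the Python sketch below encodes the defaulting rule described above (as documented by Kubernetes for containers that omit imagePullPolicy) and builds a Pod manifest that sets the policy to Never. The function name, the Pod name `local-image-demo`, the container name `app`, and the image `my-function:latest` are all hypothetical.

```python
import json


def default_image_pull_policy(image: str) -> str:
    """Return the policy Kubernetes applies when imagePullPolicy is omitted."""
    # A digest pins the exact image, so the default is IfNotPresent.
    if "@" in image:
        return "IfNotPresent"
    # Only the last path component can carry a tag; a colon earlier in the
    # reference belongs to a registry host:port such as localhost:5000.
    last_component = image.rsplit("/", 1)[-1]
    _, sep, tag = last_component.partition(":")
    # No tag at all, or an explicit "latest" tag, both default to Always.
    if not sep or tag == "latest":
        return "Always"
    return "IfNotPresent"


# Hypothetical Pod manifest that overrides the default so that only a
# locally present image is used. kubectl accepts JSON as well as YAML.
pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "local-image-demo"},
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "my-function:latest",
                "imagePullPolicy": "Never",
            }
        ]
    },
}

if __name__ == "__main__":
    print(default_image_pull_policy("my-function:latest"))
    print(json.dumps(pod_manifest, indent=2))
```

Note that the policy is set per container, which is why it sits inside the `containers` list rather than at the top of the Pod spec.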

Source https://stackoverflow.com/questions/71493306

QUESTION

Return multiple possible matches when fuzzy joining two dataframes or vectors in R if they share a word in common

Asked 2022-Mar-15 at 18:03

Is there a way of joining two dataframes where a row in the first dataframe is joined with every row in the second dataframe if they share a word in common?

For example:

...

ANSWER

Answered 2022-Mar-15 at 18:03

With fuzzy_join:

Source https://stackoverflow.com/questions/71486862

QUESTION

How to create a custom health check for Prisma with @nestjs/terminus?

Asked 2022-Mar-11 at 22:26

Since @nestjs/terminus doesn't provide a health check for Prisma, I'm trying to create it based on their Mongoose health check.

When I try:

...

ANSWER

Answered 2021-Oct-14 at 14:41

A naive copy of the mongoose implementation isn't going to work because there are differences between the NestJSMongoose type/module and Prisma. In particular, getConnectionToken does not exist inside the Prisma package.

I can't comment on what the best way would be to extend terminus to support Prisma. You might have to dig a bit into the terminus interface for that. However, a simple way to get a health check/ping in Prisma is to use the following query:

Source https://stackoverflow.com/questions/69568781

QUESTION

Kubernetes NodePort is not available on all nodes - Oracle Cloud Infrastructure (OCI)

Asked 2022-Jan-31 at 14:37

I've been trying to get over this but I'm out of ideas for now hence I'm posting the question here.

I'm experimenting with the Oracle Cloud Infrastructure (OCI) and I wanted to create a Kubernetes cluster which exposes some service.

The goal is:

A running managed Kubernetes cluster (OKE)
2 nodes at least
1 service that's accessible for external parties

The infra looks like the following:

A VCN for the whole thing
A private subnet on 10.0.1.0/24
A public subnet on 10.0.0.0/24
NAT gateway for the private subnet
Internet gateway for the public subnet
Service gateway
The corresponding security lists for both subnets which I won't share right now unless somebody asks for it
A containerengine K8S (OKE) cluster in the VCN with public Kubernetes API enabled
A node pool for the K8S cluster with 2 availability domains and with 2 instances right now. The instances are ARM machines with 1 OCPU and 6GB RAM running Oracle-Linux-7.9-aarch64-2021.12.08-0 images.
A namespace in the K8S cluster (call it staging for now)
A deployment which refers to a custom NextJS application serving traffic on port 3000

And now it's the point where I want to expose the service running on port 3000.

I have 2 obvious choices:

Create a LoadBalancer service in K8S, which will spawn a classic Load Balancer in OCI, set up its listener, and set up the backend set referring to the 2 nodes in the cluster; it also adjusts the subnet security lists to make sure traffic can flow
Create a Network Load Balancer in OCI and create a NodePort on K8S and manually configure the NLB to the ~same settings as the classic Load Balancer

The first one works perfectly fine but I want to use this cluster with minimal costs so I decided to experiment with option 2, the NLB since it's way cheaper (zero cost).

Long story short, everything works and I can access the NextJS app on the IP of the NLB most of the time, but sometimes I can't. I decided to look into what's going on, and it turned out the NodePort that I exposed in the cluster isn't working how I'd imagine.

The service behind the NodePort is only accessible on the Node that's running the pod in K8S. Assume NodeA is running the service and NodeB is just there chilling. If I try to hit the service on NodeA, everything is fine. But when I try to do the same on NodeB, I don't get a response at all.

That's my problem and I couldn't figure out what could be the issue.

What I've tried so far:

Switching from ARM machines to AMD ones - no change
Created a bastion host in the public subnet to test which nodes are responding to requests. It turned out that only the node running the pod responds.
Created a regular LoadBalancer in K8S with the same config as the NodePort (in this case OCI will create a classic Load Balancer), that works perfectly
Tried upgrading to Oracle 8.4 images for the K8S nodes, didn't fix it
Ran the Node Doctor on the nodes, everything is fine
Checked the logs of kube-proxy, kube-flannel, core-dns, no error
Since the cluster consists of 2 nodes, I gave it a try and added one more node and the service was not accessible on the new node either
Recreated the cluster from scratch

Edit: Some update. As a temporary solution, I tried using a DaemonSet instead of a regular Deployment so that every node runs at least one instance of the pod. Surprisingly, the node that previously did not respond to requests on that specific port still does not, even though a pod is now running on it.

Edit2: Originally I was running the latest K8S version for the cluster (v1.21.5) and I tried downgrading to v1.20.11 and unfortunately the issue is still present.

Edit3: Checked if the NodePort is open on the node that's not responding and it is, at least kube-proxy is listening on it.

...

ANSWER

Answered 2022-Jan-31 at 12:06

This might not be the ideal fix, but can you try changing the externalTrafficPolicy to Local? With this setting, the health check fails on nodes that don't run the application, so traffic is forwarded only to the node where the application is running. Setting externalTrafficPolicy to Local is also a requirement for preserving the source IP of the connection. Also, can you share the health check config for both the NLB and the LB that you are using? When you change the externalTrafficPolicy, note that the health check for the LB will change, and the same change needs to be applied to the NLB.

Edit: Also note that you need a security list/network security group added to your node subnet/node pool that allows traffic on all protocols from the worker node subnet.

Source https://stackoverflow.com/questions/70893487

QUESTION

Django - Full test suite failing when adding a TestCase, but full test suite passes when it is commented out. All TestCase pass when run individually

Asked 2021-Dec-23 at 10:31

This seems to be an issue discussed here and there on StackOverflow with no real solution. I have a bunch of tests that all pass when run individually. They even pass when run as a full test suite, EXCEPT when I add in my TestCase ExploreFeedTest. ExploreFeedTest passes when run by itself, and it doesn't actually fail when run in the full test suite with python manage.py test; instead it causes another test, HomeTest, to fail. HomeTest passes on its own and passes when ExploreFeedTest is commented out from the __init__.py under the test folder. I hear this is an issue with Django not cleaning up data properly? All my TestCase classes derive from django.test.TestCase, because apparently if you don't use that class Django doesn't tear down the data properly, so I don't really know how to solve this. I'm also running Django 3.2.9, which is supposedly the latest. Does anyone have a solution for this?

ExploreFeedTest.py

...

ANSWER

Answered 2021-Dec-23 at 10:31

I posted the answer on this Stack Overflow question:

Django - Serializer throwing "Invalid pk - object does not exist" when setting ManyToMany attribute where foreign keyed object does exist

I was also using factory boy, which doesn't seem to play nicely with the test suite. The test suite doesn't seem to know how to roll back the DB without getting rid of the data generated by factory boy.

Source https://stackoverflow.com/questions/70105907

QUESTION

AWX all jobs stop processing and hang indefinitely -- why

Asked 2021-Dec-21 at 14:42

Problem

We've had a working Ansible AWX instance running on v5.0.0 for over a year, and suddenly all jobs stopped working -- no output is rendered. They will start "running" but hang indefinitely without printing out any logging.

The AWX instance is running in a docker compose container setup as defined here: https://github.com/ansible/awx/blob/5.0.0/INSTALL.md#docker-compose

Observations

Standard troubleshooting such as restarting of containers, host OS, etc. hasn't helped. No configuration changes in either environment.

Upon debugging an actual playbook command, we observe that the command to run a playbook from the UI is like the below:

ssh-agent sh -c ssh-add /tmp/awx_11021_0fmwm5uz/artifacts/11021/ssh_key_data && rm -f /tmp/awx_11021_0fmwm5uz/artifacts/11021/ssh_key_data && ansible-playbook -vvvvv -u ubuntu --become --ask-vault-pass -i /tmp/awx_11021_0fmwm5uz/tmppo7rcdqn -e @/tmp/awx_11021_0fmwm5uz/env/extravars playbook.yml

That's broken down into three commands in sequence:

ssh-agent sh -c ssh-add /tmp/awx_11021_0fmwm5uz/artifacts/11021/ssh_key_data
rm -f /tmp/awx_11021_0fmwm5uz/artifacts/11021/ssh_key_data
ansible-playbook -vvvvv -u ubuntu --become --ask-vault-pass -i /tmp/awx_11021_0fmwm5uz/tmppo7rcdqn -e @/tmp/awx_11021_0fmwm5uz/env/extravars playbook.yml

You can see in part 3 that -vvvvv is the debugging argument -- however, the hang is happening on command #1, which has nothing to do with Ansible or AWX specifically, so the debugging flag is not going to get us much info.

I tried doing an strace to see what is going on, but for reasons given below, it is pretty difficult to follow what it is actually hanging on. I can provide this output if it might help.

Analysis

So one natural question with command #1 -- what is 'ssh_key_data'?

Well it's what we set up to be the Machine credential in AWX (an SSH key) -- it hasn't changed in a while and it works just fine when used in a direct SSH command. It's also apparently being set up by AWX as a file pipe:

prw------- 1 root root 0 Dec 10 08:29 ssh_key_data

Which starts to explain why it could be potentially hanging (if nothing is being read in from the other side of the pipe).
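The blocking behaviour of a named pipe can be reproduced in isolation. The sketch below is a minimal, hypothetical demonstration and not AWX's code: it creates a throwaway FIFO in a temporary directory (reusing the file name `ssh_key_data` purely for illustration) and uses non-blocking opens so the script itself never hangs. It assumes a POSIX system such as Linux, where `os.mkfifo` is available.

```python
import errno
import os
import tempfile


def demo_fifo():
    """Show how a FIFO behaves when one side of the pipe is missing."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "ssh_key_data")  # illustrative name only
        os.mkfifo(path, 0o600)

        # Writer side with no reader: a non-blocking open fails with ENXIO,
        # whereas a blocking open would simply wait for a reader.
        try:
            fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
            os.close(fd)
            writer_without_reader = "opened"
        except OSError as exc:
            writer_without_reader = errno.errorcode[exc.errno]

        # Reader side with no writer: a non-blocking open succeeds, and a
        # read reports end-of-file. A blocking open() here is the call that
        # would hang until some process opens the pipe for writing.
        reader = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
        try:
            read_without_writer = os.read(reader, 64)

            # Once a writer attaches and sends data, the reader gets it.
            writer = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
            try:
                os.write(writer, b"key material")
                read_with_writer = os.read(reader, 64)
            finally:
                os.close(writer)
        finally:
            os.close(reader)

        return writer_without_reader, read_without_writer, read_with_writer


if __name__ == "__main__":
    print(demo_fifo())
```

A process that opens the FIFO for reading in the default blocking mode, as a command reading a key file presumably would, waits until a writer appears, which matches the symptom of a hang with no output.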

Running a normal ansible-playbook from command line (and supplying the SSH key in a more normal way) works just fine, so we can still deploy, but only via CLI right now -- it's just AWX that is broken.

Conclusions

So the question then becomes "why now"? And "how to debug"? I have checked the health of awx_postgres, and verified that the Machine credential is indeed present in the expected format (in the main_credential table). I have also verified that I can use ssh-agent on the awx_task container without the use of that pipe keyfile. So it really seems to be this piped file that is the problem -- but I haven't been able to glean from any logs where the other side of the pipe (the sender) is supposed to be or why it isn't sending the data.

...

ANSWER

Answered 2021-Dec-13 at 04:21

Had the same issue starting this Friday in the same timeframe as you. Turned out that Crowdstrike (falcon sensor) Agent was the culprit. I'm guessing they pushed a definition update that is breaking or blocking fifo pipes. When we stopped the CS agent, AWX started working correctly again, with no issues. See if you are running a similar security product.

Source https://stackoverflow.com/questions/70320452

QUESTION

How to configure GKE Autopilot w/Envoy & gRPC-Web

Asked 2021-Dec-14 at 20:31

I have an application running on my local machine that uses React -> gRPC-Web -> Envoy -> Go app and everything runs with no problems. I'm trying to deploy this using GKE Autopilot and I just haven't been able to get the configuration right. I'm new to all of GCP/GKE, so I'm looking for help to figure out where I'm going wrong.

I was following this doc initially, even though I only have one gRPC service: https://cloud.google.com/architecture/exposing-grpc-services-on-gke-using-envoy-proxy

From what I've read, GKE Autopilot mode requires using External HTTP(s) load balancing instead of Network Load Balancing as described in the above solution, so I've been trying to get that to work. After a variety of attempts, my current strategy has an Ingress, BackendConfig, Service, and Deployment. The deployment has three containers: my app, an Envoy sidecar to transform the gRPC-Web requests and responses, and a cloud SQL proxy sidecar. I eventually want to be using TLS, but for now, I left that out so it wouldn't complicate things even more.

When I apply all of the configs, the backend service shows one backend in one zone and the health check fails. The health check is set for port 8080 and path /healthz, which is what I think I've specified in the deployment config, but I'm suspicious because when I look at the details for the envoy-sidecar container, it shows the Readiness probe as: http-get HTTP://:0/healthz headers=x-envoy-livenessprobe:healthz. Does ":0" just mean it's using the default address and port for the container, or does it indicate a config problem?

I've been reading various docs and just haven't been able to piece it all together. Is there an example somewhere that shows how this can be done? I've been searching and haven't found one.

My current configs are:

...

ANSWER

Answered 2021-Oct-14 at 22:35

Here is some documentation about Setting up HTTP(S) Load Balancing with Ingress. This tutorial shows how to run a web application behind an external HTTP(S) load balancer by configuring the Ingress resource.

Related to Creating a HTTP Load Balancer on GKE using Ingress, I found two threads where instances created are marked as unhealthy.

In the first one, they mention the need to manually enable a firewall rule that allows the HTTP load balancer IP range to pass the health check.

In the second one, they mention that the Pod’s spec must also include containerPort. Example:

Source https://stackoverflow.com/questions/69560536

QUESTION

How do you use a composite action that exists in a private repository?

Asked 2021-Dec-14 at 04:10

We have a bunch of health checks against third-party services. We want them to run periodically because when they go down it affects our app just like a bug in our code. Knowing that "it's them not us" reduces significant troubleshooting time.

We've set this health check up via GitHub Actions with a scheduled run, but we want a HealthCheck per third-party service. That way, the Slack message on failure will be very specific about what is down. But that is going to create a lot of duplicated yml content.
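To picture the idea of one named check per third-party service, here is a minimal Python sketch. It is not tied to GitHub Actions or to any particular library, and every name in it (`run_checks`, `payments_api`, `email_provider`) is hypothetical. Each check is a plain callable that raises on failure, so a failure report can name exactly which service is down.

```python
from typing import Callable, Dict

# A check is any zero-argument callable that raises an exception on failure.
Check = Callable[[], None]


def run_checks(checks: Dict[str, Check]) -> Dict[str, str]:
    """Run every named check and record UP or DOWN (with the reason)."""
    results: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            check()
        except Exception as exc:  # any failure marks only this service DOWN
            results[name] = f"DOWN: {exc}"
        else:
            results[name] = "UP"
    return results


def payments_ok() -> None:
    # Stand-in for a real probe such as an HTTP request to the service.
    return None


def email_down() -> None:
    # Stand-in for a probe that fails.
    raise ConnectionError("connection refused")


if __name__ == "__main__":
    report = run_checks(
        {"payments_api": payments_ok, "email_provider": email_down}
    )
    for service, status in report.items():
        print(f"{service}: {status}")
```

Keeping the runner separate from the individual checks means adding a service is one more dictionary entry rather than another copy of the surrounding boilerplate.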

I discovered something called github composite actions and it seems to be intended for solving this problem, but I can't find information about whether or not a composite action can live in a private repository.

The documentation of the uses key only mentions public repositories when it mentions repositories at all. Is there a way to make a composite action in a private repository and use it?

I tried making their hello world example, ran it, and it ran correctly. Then I made the action repo private, and the repo using the action's build failed saying:

...

ANSWER

Answered 2021-Dec-14 at 04:10

You have to check out the repository containing your action using a personal access token first, then use a relative path to where you checked it out:

Source https://stackoverflow.com/questions/69034292

QUESTION

ElasticSearch Accessing Nested Documents in Script - Null Pointer Exception

Asked 2021-Dec-07 at 10:49

Gist: Trying to write a custom filter on nested documents using Painless. I want to add error checks for the case where there are no nested documents, to avoid a null_pointer_exception.

I have a mapping as such (simplified and obfuscated)

...

ANSWER

Answered 2021-Dec-07 at 10:49

TL;DR

Elastic flattens objects, such that

Source https://stackoverflow.com/questions/70217177

QUESTION

Spring boot 2.6 actuator info

Asked 2021-Nov-26 at 14:27

I am using actuator, and in the application.properties file I have the following fields

...

ANSWER

Answered 2021-Nov-26 at 14:03

I solved the problem, just add in the application.properties file

Source https://stackoverflow.com/questions/70123650

Community Discussions, Code Snippets contain sources that include Stack Exchange Network

Vulnerabilities

No vulnerabilities reported

Install health

This package can be installed with go get.

Support

For new features, suggestions, and bugs, create an issue on GitHub. If you have questions, search for or ask them on the Stack Overflow community page.