dns | Docker DNS server on steroids to access DNS | DNS library

by qdm12 Go Version: v2.0.0-rc0 License: MIT

X-Ray Key Features Code Snippets(2)Community Discussions(10)Vulnerabilities Install Support

kandi X-RAY | dns Summary

dns is a Go library typically used in Networking, DNS, Docker applications. dns has no bugs, it has no vulnerabilities, it has a Permissive License and it has low support. You can download it from GitHub.

DNS over TLS upstream server connected to DNS over TLS (IPv4 and IPv6) servers with DNSSEC, DNS rebinding protection, built-in Docker healthcheck and fine grain IPs + hostnames blocking. Announcement: *You can now try :v2.0.0-beta with this documentation.

Support

Quality

Security

License

Reuse

Support

dns has a low active ecosystem.

It has 172 star(s) with 33 fork(s). There are 11 watchers for this library.

It had no major release in the last 12 months.

There are 22 open issues and 65 have been closed. On average issues are closed in 102 days. There are 1 open pull requests and 0 closed requests.

It has a neutral sentiment in the developer community.

The latest version of dns is v2.0.0-rc0

Quality

dns has 0 bugs and 0 code smells.

Security

dns has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

dns code analysis shows 0 unresolved vulnerabilities.

There are 0 security hotspots that need review.

License

dns is licensed under the MIT License. This license is Permissive.

Permissive licenses have the least restrictions, and you can use them in most projects.

Reuse

dns releases are available to install and integrate.

Installation instructions, examples and code snippets are available.

Top functions reviewed by kandi - BETA

kandi's functional review helps you automatically verify the functionalities of the libraries and avoid rework.
Currently covering the most popular Java, JavaScript and Python libraries. See a Sample of dns

Get all kandi verified functions for this library.

dns Key Features

No Key Features are available at this moment for dns.

dns Examples and Code Snippets

Log a DNS start event .

java

Lines of Code : 4

License : Permissive (MIT License)

Copy

@Override
    public void dnsStart(Call call, String domainName) {
        logTimedEvent("dnsStart");
    }

End of a DNS record .

java

Lines of Code : 4

License : Permissive (MIT License)

Copy

@Override
    public void dnsEnd(Call call, String domainName, List inetAddressList) {
        logTimedEvent("dnsEnd");
    }

Community Discussions

Trending Discussions on dns

CentOS through a VM - no URLs in mirrorlist

Visual Studio Code "Error while fetching extensions. XHR failed"

Kubernetes NodePort is not available on all nodes - Oracle Cloud Infrastructure (OCI)

Why is ArgoCD confusing GitHub.com with my own public IP?

System.AggregateException thrown when trying to access server on Xamarin.Forms Android

Ping Tasks will not complete

Address already in use for puma-dev

The pgAdmin 4 server could not be contacted: Fatal error

What is the purpose of the "[fingerprint]" option during SSH host authenticity check?

Does .NET Framework have an OS-independent global DNS cache?

QUESTION

CentOS through a VM - no URLs in mirrorlist

Asked 2022-Mar-26 at 21:04

I am trying to run a CentOS 8 server through VirtualBox (6.1.30) (Vagrant), which worked just fine yesterday for me, but today I tried running a sudo yum update. I keep getting this error for some reason:

...

ANSWER

Answered 2022-Mar-26 at 20:59

Check out this article: CentOS Linux EOL

The below commands helped me:

Source https://stackoverflow.com/questions/70926799

QUESTION

Visual Studio Code "Error while fetching extensions. XHR failed"

Asked 2022-Mar-13 at 12:38

This problem started a few weeks ago, when I started using NordVPN on my laptop. When I try to search for an extension and even when trying to download through the marketplace I get this error:

EDIT: Just noticed another thing that might indicate to what's causing the issue. When I open VSCode and go to developer tools I get this error messege (before even doing anything):

"(node:19368) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods instead.(Use Code --trace-deprecation ... to show where the warning was created)"

The only partial solution I found so far was to manually download and install extensions.

I've checked similar question here and in other places online, but I didn't find a way to fix this. So far I've tried:

Flushing my DNS cache and setting it to google's DNS server.
Disabling the VPN on my laptop and restarting VS Code.
Clearing the Extension search results.
Disabling all the extensions currently running.

I'm using a laptop running Windows 10. Any other possible solutions I haven't tried?

...

ANSWER

Answered 2021-Dec-10 at 05:26

December 10,2021.
I'm using vscode with ubuntu 20.04.
I came across the XHR errors from yesterday and could not install any extensions.
Googled a lot but nothing works.
Eventually I downloaded and installed the newest version of VSCode(deb version) and everything is fine now. (I don't know why but maybe you can give it a try! Good Luck!)

Source https://stackoverflow.com/questions/70177216

QUESTION

Kubernetes NodePort is not available on all nodes - Oracle Cloud Infrastructure (OCI)

Asked 2022-Jan-31 at 14:37

I've been trying to get over this but I'm out of ideas for now hence I'm posting the question here.

I'm experimenting with the Oracle Cloud Infrastructure (OCI) and I wanted to create a Kubernetes cluster which exposes some service.

The goal is:

A running managed Kubernetes cluster (OKE)
2 nodes at least
1 service that's accessible for external parties

The infra looks the following:

A VCN for the whole thing
A private subnet on 10.0.1.0/24
A public subnet on 10.0.0.0/24
NAT gateway for the private subnet
Internet gateway for the public subnet
Service gateway
The corresponding security lists for both subnets which I won't share right now unless somebody asks for it
A containerengine K8S (OKE) cluster in the VCN with public Kubernetes API enabled
A node pool for the K8S cluster with 2 availability domains and with 2 instances right now. The instances are ARM machines with 1 OCPU and 6GB RAM running Oracle-Linux-7.9-aarch64-2021.12.08-0 images.
A namespace in the K8S cluster (call it staging for now)
A deployment which refers to a custom NextJS application serving traffic on port 3000

And now it's the point where I want to expose the service running on port 3000.

I have 2 obvious choices:

Create a LoadBalancer service in K8S which will spawn a classic Load Balancer in OCI, set up it's listener and set up the backendset referring to the 2 nodes in the cluster, plus it adjusts the subnet security lists to make sure traffic can flow
Create a Network Load Balancer in OCI and create a NodePort on K8S and manually configure the NLB to the ~same settings as the classic Load Balancer

The first one works perfectly fine but I want to use this cluster with minimal costs so I decided to experiment with option 2, the NLB since it's way cheaper (zero cost).

Long story short, everything works and I can access the NextJS app on the IP of the NLB most of the time but sometimes I couldn't. I decided to look it up what's going on and turned out the NodePort that I exposed in the cluster isn't working how I'd imagine.

The service behind the NodePort is only accessible on the Node that's running the pod in K8S. Assume NodeA is running the service and NodeB is just there chilling. If I try to hit the service on NodeA, everything is fine. But when I try to do the same on NodeB, I don't get a response at all.

That's my problem and I couldn't figure out what could be the issue.

What I've tried so far:

Switching from ARM machines to AMD ones - no change
Created a bastion host in the public subnet to test which nodes are responding to requests. Turned out only the node responds that's running the pod.
Created a regular LoadBalancer in K8S with the same config as the NodePort (in this case OCI will create a classic Load Balancer), that works perfectly
Tried upgrading to Oracle 8.4 images for the K8S nodes, didn't fix it
Ran the Node Doctor on the nodes, everything is fine
Checked the logs of kube-proxy, kube-flannel, core-dns, no error
Since the cluster consists of 2 nodes, I gave it a try and added one more node and the service was not accessible on the new node either
Recreated the cluster from scratch

Edit: Some update. I've tried to use a DaemonSet instead of a regular Deployment for the pod to ensure that as a temporary solution, all nodes are running at least one instance of the pod and surprise. The node that was previously not responding to requests on that specific port, it still does not, even though a pod is running on it.

Edit2: Originally I was running the latest K8S version for the cluster (v1.21.5) and I tried downgrading to v1.20.11 and unfortunately the issue is still present.

Edit3: Checked if the NodePort is open on the node that's not responding and it is, at least kube-proxy is listening on it.

...

ANSWER

Answered 2022-Jan-31 at 12:06

Might not be the ideal fix, but can you try changing the externalTrafficPolicy to Local. This would prevent the health check on the nodes which don't run the application to fail. This way the traffic will only be forwarded to the node where the application is . Setting externalTrafficPolicy to local is also a requirement to preserve source IP of the connection. Also, can you share the health check config for both NLB and LB that you are using. When you change the externalTrafficPolicy, note that the health check for LB would change and the same needs to be applied to NLB.

Edit: Also note that you need a security list/ network security group added to your node subnet/nodepool, which allows traffic on all protocols from the worker node subnet.

Source https://stackoverflow.com/questions/70893487

QUESTION

Why is ArgoCD confusing GitHub.com with my own public IP?

Asked 2022-Jan-10 at 17:37

I have just set up a kubernetes cluster on bare metal using kubeadm, Flannel and MetalLB. Next step for me is to install ArgoCD.

I installed the ArgoCD yaml from the "Getting Started" page and logged in.

When adding my Git repositories ArgoCD gives me very weird error messages: The error message seems to suggest that ArgoCD for some reason is resolving github.com to my public IP address (I am not exposing SSH, therefore connection refused).

I can not find any reason why it would do this. When using https:// instead of SSH I get the same result, but on port 443.

I have put a dummy pod in the same namespace as ArgoCD and made some DNS queries. These queries resolved correctly.

What makes ArgoCD think that github.com resolves to my public IP address?

EDIT:

I have also checked for network policies in the argocd namespace and found no policy that was restricting egress.

I have had this working on clusters in the same network previously and have not changed my router firewall since then.

...

ANSWER

Answered 2022-Jan-08 at 21:04

That looks like argoproj/argo-cd issue 1510, where the initial diagnostic was that the cluster is blocking outbound connections to GitHub. And it suggested to check the egress configuration.

Yet, the issue was resolved with an ingress rule configuration:

need to define in values.yaml.
argo-cd default provide subdomain but in our case it was /argocd

Source https://stackoverflow.com/questions/70600322

QUESTION

System.AggregateException thrown when trying to access server on Xamarin.Forms Android

Asked 2021-Dec-23 at 15:06

I use a HttpClient to communicate with my server as shown below:

...

ANSWER

Answered 2021-Dec-23 at 15:06

This is a long-standing issue #6351 in Xamarin.Android, caused by LetsEncrypt's root expiring and them moving to a new root.

Below is a copy of my post in that issue explaining the situation and the workarounds. See other posts in that thread for details on the workarounds.

Scott Helme has a fantastic write-up of the situation. Go and read that first, then I'll describe how (I think) this applies to xamarin-android.

I'm going to copy the key diagram from that article (source):

The red chain is what used to happen: the IdenTrust DST Root CA X3 is an old root certificate which is trusted pretty much everywhere, including on Android devices from 2.3.6 onwards. This is what LetsEncrypt used to use as their root, and it meant that everyone trusted them. However, this IdenTrust DST Root CA X3 recently expired, which means that a bunch of devices won't trust anything signed by it. LetsEncrypt needed to move to their own root certificate.

The blue chain is the ideal new one -- the ISRG Root X1 is LetsEncrypt's own root certificate, which is included on Android 7.1.1+. Android devices >= 7.1.1 will trust certificates which have been signed by this ISRG Root X1.

However, the problem is that old pre-7.1.1 Android devices don't know about ISRG Root X1, and don't trust this.

The workaround which LetsEncrypt is using is that old Android devices don't check whether the root certificate has expired. They therefore by default serve a chain which includes LetsEncrypt's root ISRG Root X1 certificate (which up-to-date devices trust), but also include a signature from that now-expired IdenTrust DST Root CA X3. This means that old Android devices trust the chain (as they trust the IdenTrust DST Root CA X3, and don't check whether it's expired), and newer devices also trust the chain (as they're able to work out that even though the root of the chain has expired, they still trust that middle ISRG Root X1 certificate as a valid root in its own right, and therefore trust it).

This is the green path, the one which LetsEncrypt currently serves by default.

However, the BoringSSL library used by xamarin-android isn't Android's SSL library. It 1) Doesn't trust the IdenTrust DST Root CA X3 because it's expired, and 2) Isn't smart enough to figure out that it does trust the ISRG Root X1 which is also in the chain. So if you serve the green chain in the image above, it doesn't trust it. Gack.

The options therefore are:

Don't use BoringSSL and do use Android's SSL library. This means that xamarin-android behaves the same as other Android apps, and trusts the expired root. This is done by using AndroidClientHandler as described previously. This should fix Android >= 2.3.6.
Do use BoringSSL but remove the expired IdenTrust DST Root CA X3 from Android's trust store ("Digital Signature Trust Co. - DST Root CA X3" in the settings). This tricks BoringSSL into stopping its chain at the ISRG Root X1, which it trusts (on Android 7.1.1+). However this will only work on Android devices which trust the ISRG Root X1, which is 7.1.1+.
Do use BoringSSL, and change your server to serve a chain which roots in the ISRG Root X1, rather than the expired IdenTrust DST Root CA X3 (the blue chain in the image above), using --preferred-chain "ISRG Root X1". This means that BoringSSL ignores the IdenTrust DST Root CA X3 entirely, and roots to the ISRG Root X1. This again will only work on Android devices which trust the ISRG Root X1, i.e. 7.1.1+.
Do the same as 3, but by manually editing fullchain.pem.
Use another CA such as ZeroSSL, which uses a root which is trusted back to Android 2.2, and which won't expire until 2038.

Source https://stackoverflow.com/questions/70463199

QUESTION

Ping Tasks will not complete

Asked 2021-Nov-30 at 13:11

I am working on a "heartbeat" application that pings hundreds of IP addresses every minute via a loop. The IP addresses are stored in a list of a class Machines. I have a loop that creates a Task (where MachinePingResults is basically a Tuple of an IP and online status) for each IP and calls a ping function using System.Net.NetworkInformation.

The issue I'm having is that after hours (or days) of running, one of the loops of the main program fails to finish the Tasks which is leading to a memory leak. I cannot determine why my Tasks are not finishing (if I look in the Task list during runtime after a few days of running, there are hundreds of tasks that appear as "awaiting"). Most of the time all the tasks finish and are disposed; it is just randomly that they don't finish. For example, the past 24 hours had one issue at about 12 hours in with 148 awaiting tasks that never finished. Due to the nature of not being able to see why the Ping is hanging (since it's internal to .NET), I haven't been able to replicate the issue to debug.

(It appears that the Ping call in .NET can hang and the built-in timeout fail if there is a DNS issue, which is why I built an additional timeout in)

I have a way to cancel the main loop if the pings don't return within 15 seconds using Task.Delay and a CancellationToken. Then in each Ping function I have a Delay in case the Ping call itself hangs that forces the function to complete. Also note I am only pinging IPv4; there is no IPv6 or URL.

Main Loop

...

ANSWER

Answered 2021-Nov-26 at 08:37

There are quite a few gaps in the code posted, but I attempted to replicate and in doing so ended up refactoring a bit.

This version seems pretty robust, with the actual call to SendAsync wrapped in an adapter class.

I accept this doesn't necessarily answer the question directly, but in the absence of being able to replicate your problem exactly, offers an alternative way of structuring the code that may eliminate the problem.

Source https://stackoverflow.com/questions/70097065

QUESTION

Address already in use for puma-dev

Asked 2021-Nov-16 at 11:46

Problem

Whenever I try to run

...

ANSWER

Answered 2021-Nov-16 at 11:46

Why

Well, this is interesting. I did not think of searching for lsof's COMMAND column, before.

Turns out, ControlCe means "Control Center" and beginning with Monterey, macOS does listen ports 5000 & 7000 on default.

Solution

Go to System Preferences > Sharing
Uncheck AirPlay Receiver.
Now, you should be able to restart puma as usual.

Source: https://developer.apple.com/forums/thread/682332

Source https://stackoverflow.com/questions/69718732

QUESTION

The pgAdmin 4 server could not be contacted: Fatal error

Asked 2021-Nov-04 at 12:37

I upgrade PostgreSQL from 13.3 to 13.4 and got a fatal error by pgAdmin 4. I found other similar question that try to fix the problem deleting the folder: "C:\Users\myusername\AppData\Roaming\pgadmin\sessions" and running pgAdmin as admin but nothing happen. Also i completely remove postgres and reinstall it, and i installed pgAdmin with his separate installation, but nothing happen again. This is the error:

...

ANSWER

Answered 2021-Sep-11 at 18:16

This is something that seem to have changed between pgAdmin4 5.1 and 5.7. I've seen this on a machine that had been connected to a WiFi mobile hotspot (but it could happen in other circumstances).

It has something to do with the way the dns library is used on Windows, so this could happen to other applications that use it in the same way.

Essentially, dns.Resolver scans the Windows registry for all network interfaces found under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\

The WiFi mobile hotspot that machine had been connected to had set a DhcpDomain key with value ".home". The dns.Resolver found this value and split it using the dot into multiple labels, one of them being empty. That caused the exception you mention: dns.name.EmptyLabel: A DNS label is empty.

This occurred even when the WiFi network was turned off: those were the last settings that had been in use and dns.Resolver didn't check whether the interface was enabled.

The latest version of pgAdmin seems to be an older version of dnspython (1.16.0), so I'm not sure whether this has been fixed in more recent versions. For now, there seems to be two options:

Delete or change the DhcpDomain subkey if you find it in on of the subkeys of HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\ (there might even be a way to force that value through the Control Panel).
Connect to a different network that doesn't set this value.

Source https://stackoverflow.com/questions/68862998

QUESTION

What is the purpose of the "[fingerprint]" option during SSH host authenticity check?

Asked 2021-Oct-25 at 12:12

when connecting to a git repository using SSH for the first time, it is asked to confirm the authenticity of the host according to its fingerprint:

...

ANSWER

Answered 2021-Oct-22 at 12:43

each ssh server have host ssh keys, which are used for

auth host and later check that you are connecting to the same host
to establish secure connection (exchange credentials in secure way)

So first time you are connecting to any ssh server, you will get public key and fingerprint of this key, and proposition to store fingerprint in "known hosts" file.

fingerprint is a new option just in addition to "yes", so you can provide fingerprint manually if you have received it in other way. https://github.com/openssh/openssh-portable/commit/05b9a466700b44d49492edc2aa415fc2e8913dfe

seems manpages is not updated yet.

Source https://stackoverflow.com/questions/69675690

QUESTION

Does .NET Framework have an OS-independent global DNS cache?

Asked 2021-Oct-15 at 12:00

Introduction

First of all, I've tried all recommendations from C# DNS-related SO threads and other internet articles - messing with ServicePointManager/ServicePoint settings, setting automatic request connection close via HTTP headers, changing connection lease times - nothing helped. It seems like all those settings are intended for fixing DNS issues in long-running processes (like web services). It even makes sense if a process would have it's own DNS cache to minimize DNS queries or OS DNS cache reading. But it's not my case.

The problem

Our production infrastructure uses HA (high availability) DNS for swapping server nodes during maintenance or functional problems. And it's built in a way that in some places we have multiple CNAME-records which in fact point to the same HA A-record like that:

eu.site1.myprodserver.com (CNAME) > eu.ha.myprodserver.com (A)
eu.site2.myprodserver.com (CNAME) > eu.ha.myprodserver.com (A)

The TTL of all these records is 60 seconds. So when the European node is in trouble or maintenance, the A-record switches to the IP address of some other node.

Then we have a monitoring utility which is executed once in 5 minutes and uses both site1 and site2. For it to work properly both names must point to the same DC, because data sync between DCs doesn't happen that fast. Since both CNAMEs are in fact linked to the same A-record with short TTL at a first glance it seems like nothing can go wrong. But it turns out it can.

The utility is written in C# for .NET Framework 4.7.2 and uses HttpClient class for performing requests to both sites. Yeah, it's him again.

We have noticed that when a server node switch occurs the utility often starts acting as if site1 and site2 were in different DCs. The pattern of its behavior in such moments is strictly determined, so it's not like it gets confused somewhere in the middle of the process - it incorrecly resolves one or both of these addresses from the very start.

I've made another much simpler utility which just sends one GET-request to site1 and then started intentionally switching nodes on and off and running this utility to see which DC would serve its request. And the results were very frustrating.

Despite the Windows DNS cache already being updated (checked via ipconfig and Get-DnsClientCache cmdlet) and despite the overall records' TTL of 60 seconds the HttpClient keeps sending requests to the old IP address sometimes for another 15-20 minutes. Even when I've completely shut down the "outdated" application server - the utility kept trying to connect to it, so even connection failures don't wake it up.

It becomes even more frustrating if you start running ipconfig /flushdns in between utility runs. Sometimes after flushdns the utility realizes that the IP has changed. But as soon as you make another flushdns (or this is even not needed - I haven't 100% clearly figured this out) and run the utility again - it goes back to the old address! Unbelievable!

And add even more frustration. If you resolve the IP address from within the same utility using Dns.GetHostEntry method (which uses cache as per this comment) right before calling HttpClient, the resolve result would be correct... But the HttpClient would anyway make a connection to an IP address of seemengly his own independent choice. So HttpClient somehow does not seem to rely on built-in .NET Framework DNS resolving.

So the questions are:

Where does a newly created .NET Framework process take those cached DNS results from?
Even if there is some kind of a mystical global .NET-specific DNS cache, then why does it absolutely ignore TTL?
How is it possible at all that it reverts to the outdated old IP address after it has already once "understood" that the address has changed?

P.S. I have worked this all around by implementing a custom HttpClientHandler which performs DNS queries on each hostname's first usage thus it's independent from external DNS caches (except for caching at intermediate DNS servers which also affects things to some extent). But that was a little tricky in terms of TLS certificates validation and the final solution does not seem to be production ready - but we use it for monitoring only so for us it's OK. If anyone is interested in this, I'll show the class code which somewhat resembles this answer's example.

Update 2021-10-08

The utility works from behind a corporate proxy. In fact there are multiple proxies for load balancing. So I am now also in process of verifying this:

If the DNS resolving is performed by the proxies and they don't respect the TTL or if they cache (keep alive) TCP connections by hostnames - this would explain the whole problem
If it's possible that different proxies handle HTTP requests on different runs of the utility - this would answer the most frustrating question #3

Update 2021-10-15

The answer to "Does .NET Framework has an OS-independent global DNS cache?" is NO. HttpClient class or .NET Framework in general had nothing to do with all of this. Posted my investigation results as an accepted answer.

...

ANSWER

Answered 2021-Oct-14 at 21:32

HttpClient, please forgive me! It was not your fault!

Well, this investigation was huge. And I'll have to split the answer into two parts since there turned out to be two unconnected problems.

1. The proxy server problem

As I said, the utility was being tested from behind a corporate proxy. In case if you haven't known (like I haven't till the latest days) when using a proxy server it's not your machine performing DNS queries - it's the proxy server doing this for you.

I've made some measurements to understand for how long does the utility keep connecting to the wrong DC after the DNS record switch. And the answer was the fantastic exact 30 minutes. This experiment has also clearly shown that local Windows DNS cache has nothing to do with it: those 30 minutes were starting exactly at the point when the proxy server was waking up (was finally starting to send HTTP requests to the right DC).

The exact number of 30 minutes has helped one of our administrators to finally figure out that the proxy servers have a configuration parameter of minimal DNS TTL which is set to 1800 seconds by default. So the proxies have their own DNS cache. These are hardware Cisco proxies and the admin has also noted that this parameter is "hidden quite deeply" and is not even mentioned in the user manual.

As soon as the minimal proxies' DNS TTL was changed from 1800 seconds to 1 second (yeah, admins have no mercy) the issue stopped reproducing on my machine.

But what about "forgetting" the just-understood correct IP address and falling back to the old one?

Well. As I also said there are several proxies. There is a single corporate proxy DNS name, but if you run nslookup for it - it shows multiple IPs behind it. Each time the proxy server's IP address is resolved (for example when local cache expires) - there's quite a bit of a chance that you'll jump onto another proxy server.

And that's exactly what ipconfig /flushdns has been doing to me. As soon as I started playing around with proxy servers using their direct IP addresses instead of their common DNS name I found that different proxies may easily route identical requests to different DCs. That's because some of them have those 30-minutes-cached DNS records while others have to perform resolving.

Unfortunately, after the proxies theory has been proven, another news came in: the production monitoring servers are placed outside of the corporate network and they do not use any proxy servers. So here we go...

2. The short TTL and public DNS servers problem

The monitoring servers are configured to use 8.8.8.8 and 8.8.4.4 Google's DNS servers. Resolve responses for our short-lived DNS records from these servers are somewhat weird:

The returned TTL of CNAME records swings at around 1 hour mark. It gradually decreases for several minutes and then jumps back to 3600 seconds - and so on.
The returned TTL of the root A-record is almost always exactly 60 seconds. I was occasionally receiving various numbers less than 60 but there was no any obvious humanly-percievable logic. So it seems like these IP addresses in fact point to balancers that distribute requests between multiple similar DNS servers which are not synced with each other (and each of them has its own cache).

Windows is not stupid and according to my experiments it doesn't care about CNAME's TTL and only cares about the root A-record TTL, so its client cache even for CNAME records is never assigned a TTL higher than 60 seconds.

But due to the inconsistency (or in some sense over-consistency?) of the A-record TTL which Google's servers return (unpredictable 0-60 seconds) the Windows local cache gets confused. There were two facts which demonstrated it:

Multiple calls to Resolve-DnsName for site1 and site2 over several minutes with random pauses between them have eventually led to Get-ClientDnsCache showing the local cache TTLs of the two site names diverged on up to 15 seconds. This is a big enough difference to sometimes mess the things up. And that's just my short experiment, so I'm quite sure that it might actually get bigger.
Executing Invoke-WebRequest to each of the sites one right after another once in every 3-5 seconds while switching the DNS records has let me twicely face a situation when the requests went to different DCs.

The latter experiment had one strange detail I can't explain. Calling Get-DnsClientCache after Invoke-WebRequest shows no records appear in the local cache for the just-requested site names. But anyway the problem clearly has been reproduced.

Conclusion?

It would take time to see whether my workaround with real-time DNS resolving would bring any improvement. Unfortunately, I don't believe it will - the DNS servers used at production (which would eventually be used by the monitoring utility for real-time IP resolving) are public Google DNS which are not reliable in my case.

And one thing which is worse than an intermittently failing monitoring utility is that real-world users are also relying on public DNS servers and they definitely do face problems during our maintenance works or significant failures.

So have we learned anything out of all this?

Maybe a short DNS TTL is generally a bad practice?
Maybe we should install additional routers, assign them static IPs, attach the DNS names to them and then route traffic internally between our DCs to finally stop relying on DNS records changing?
Or maybe public DNS servers are doing a bad job?
Or maybe the technological singularity is closer than we think?

I have no idea. But its quite possible that "yes" is the right answer to all of these questions.

However there is one thing we surely have learned: network hardware manufacturers shall write their documentation better.

Source https://stackoverflow.com/questions/69464848

Community Discussions, Code Snippets contain sources that include Stack Exchange Network

Vulnerabilities

No vulnerabilities reported

Install dns

⚠️ Raspberry Pi users running 32 bit systems, you need to do this on your host to run the container. Launch the container with.
⚠️ Raspberry Pi users running 32 bit systems, you need to do this on your host to run the container.
Launch the container with docker run -d -p 53:53/udp qmcgaw/dns You can also use docker-compose.yml with: docker-compose up -d More environment variables are described in the environment variables section.
See the Connect clients to it section, you can also refer to the Verify DNS connection section if you want.
Install Go, Docker and Git.
Install Docker On Windows, share a drive with Docker Desktop and have the project on that partition On OSX, share your project directory with Docker Desktop
With Visual Studio Code, install the remote containers extension
In Visual Studio Code, press on F1 and select Remote-Containers: Open Folder in Container...
Your dev environment is ready to go!... and it's running in a container :+1:
Install Go, Docker and Git
Install dependencies go mod download
Install golangci-lint

Support

For any new features, suggestions and bugs create an issue on GitHub. If you have any questions check and ask questions on community page Stack Overflow .

Find more information at: