python-whois | A python module for retrieving and parsing WHOIS data | Parser library
kandi X-RAY | python-whois Summary
A WHOIS retrieval and parsing library for Python.
Top functions reviewed by kandi - BETA
- Get the WHOIS information from a domain
- Normalize a domain name
- Remove prefixes from a list
- Removes duplicate entries from a list
- Parse raw_data
- Parse registrants
- Normalize data
- Parse a list of dates
- Get a raw WHOIS server
- Returns the root URL for the given domain
- Perform a WHOIS request
- Fetch the nic contact from the server
- Parse nic contact
- Read a dataset from a csv file
- Get pkgdata package data
- Preprocess regular expression
- Return the package data
- Compile regexes
Community Discussions
Trending Discussions on python-whois
QUESTION
TL;DR: I need a source for as many different output formats from a whois query as possible.
Background:
- I am looking for a single reference that can provide as many (if not all) unique whois query output formats as possible.
- I don't believe this exists but hope to be proven wrong.
- This appears to be an age old problem
- This stackoverflow post from 2015 references the challenge of handling the "~40 formats" that the author was aware of.
- The author never detailed any of these formats.
- The RFC for whois is... depressing
- The IETF ran an analysis in 2015 that examined the components of whois per each RIR at the time
- In my own research I see that registrars like JPNIC do not appear to comply with the APNIC standards
I am aware of existing tools that do a bang-up job parsing whois (python-whois for example) however I'd like to hedge my bets against outliers with odd formats. I'm also open to possible approaches to gather this information, however that would likely be too broad to fit this question.
Hoping there is a simple "go here and download this" answer. Hoping...
...ANSWER
Answered 2020-Jul-14 at 17:55
"TL;DR: I need a source for as many different output formats from a whois query as possible."
There isn't one, unless you use a provider that does this for you, with whatever caveats that brings. More precisely, nothing exists that is public, maintained, and exhaustive. You can find various libraries in various languages that try to do this, but none is complete, because it is basically an impossible task, especially if you want to cover every TLD, including ccTLDs. (You have not framed your constraints in much detail, nor said whether you are asking about domain name data in whois or about IP address/ASN data.)
Some providers do try, of course, and offer you an abstract, uniform API. But why would anyone share their internal secret sauce, that is, their list of parsers and so on? There is no business incentive to do that.
As for open-source library authors (I was one at some point), it is tedious and not rewarding at all to keep updating a library forever with every new format and per-registry tweak. Battle-scar example: one registrar used to change its output format on each query. One query gave you "somefield: somevalue", the next gave "somefield:somevalue", then "somefield somevalue", and so on. Of course, that is only a simple example.
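To make the battle-scar example concrete, here is a minimal sketch of a line parser that tolerates those three variants. The function name `parse_line` and the comment prefixes it skips are assumptions for illustration, not part of any library, and the whitespace fallback only works when the field name is a single word.

```python
def parse_line(line):
    """Split one whois output line into a (key, value) pair, or return None.

    Handles "key: value", "key:value", and "key value". The last form is
    ambiguous for multi-word keys, so treat it as a best-effort fallback.
    """
    line = line.strip()
    # Skip blank lines and lines that look like comments (assumed prefixes).
    if not line or line.startswith(("%", "#")):
        return None
    if ":" in line:
        # Split on the first colon only, so values may contain colons.
        key, value = line.split(":", 1)
    else:
        # No colon: fall back to splitting on the first run of whitespace.
        parts = line.split(None, 1)
        if len(parts) != 2:
            return None
        key, value = parts
    return key.strip().lower(), value.strip()


for sample in ("somefield: somevalue", "somefield:somevalue", "somefield somevalue"):
    print(parse_line(sample))
```

Even this toy version shows why the task does not scale: every heuristic that rescues one registry's output risks misreading another's.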
RFC 3912 specifies only the transport, not the content, which is why so many variants appeared. In the ccTLD world in particular, each registry is king in its own kingdom and is free to implement whatever it wants, the way it wants. The protocol also has some serious limitations (for example internationalization: what "charset" is used for the underlying data?) that were circumvented in different ways, such as passing "options" in your query. Of course, none of these are standardized in any way.
At the very least, the gTLD whois format is specified here: https://www.icann.org/resources/pages/approved-with-specs-2013-09-17-en#whois
Note however that due to GDPR there were changes (see https://www.icann.org/resources/pages/gtld-registration-data-specs-en/#temp-spec) and will be other changes in the future.
However, you are strongly encouraged to look at RDAP instead of whois.
RDAP is now a requirement for all gTLD registries and registrars. As it is JSON, it immediately solves the format problem.
Its core specifications are:
- RFC 7480 HTTP Usage in the Registration Data Access Protocol (RDAP)
- RFC 7481 Security Services for the Registration Data Access Protocol (RDAP)
- RFC 7482 Registration Data Access Protocol (RDAP) Query Format
- RFC 7483 JSON Responses for the Registration Data Access Protocol (RDAP)
- RFC 7484 Finding the Authoritative Registration Data (RDAP) Service
You can find various libraries doing RDAP for you (see below for links), but at its core it is JSON over HTTPS so you can emulate simple cases with any kind of HTTP client library.
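As a rough sketch of what "JSON over HTTPS" means in practice, the following uses only the Python standard library. The helper names and the `rdap.example` base URL are placeholders; a real client would first find the authoritative base URL through the bootstrap mechanism of RFC 7484. The `/domain/<name>` path segment comes from RFC 7482, and `ldhName`, `status`, and `events` are response members defined in RFC 7483.

```python
import json
import urllib.request


def rdap_domain_url(base_url, domain):
    """Build a domain lookup URL: <base>/domain/<name>."""
    return base_url.rstrip("/") + "/domain/" + domain.lower()


def fetch_rdap(url, timeout=10):
    """Fetch and decode one RDAP response (performs a network request)."""
    request = urllib.request.Request(url, headers={"Accept": "application/rdap+json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def summarize(rdap):
    """Pull a few commonly used members out of a decoded RDAP domain object."""
    events = {
        event["eventAction"]: event["eventDate"]
        for event in rdap.get("events", [])
        if "eventAction" in event and "eventDate" in event
    }
    return {
        "name": rdap.get("ldhName"),
        "status": rdap.get("status", []),
        "events": events,
    }


# Hand-written sample shaped like an RDAP domain response, not real data.
sample = {
    "objectClassName": "domain",
    "ldhName": "example.com",
    "status": ["active"],
    "events": [{"eventAction": "registration", "eventDate": "1995-08-14T04:00:00Z"}],
}
print(rdap_domain_url("https://rdap.example/", "Example.COM"))
print(summarize(sample))
```

The `.get()` calls with defaults are deliberate: as noted further down, real servers often deviate from the specification, so a client should not assume any member is present.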
Work is underway to fix some missing or insufficiently precise details in RFC 7482 and 7483.
You need also to take into account ICANN specifications (again, only for gTLDs of course):
- https://www.icann.org/en/system/files/files/rdap-technical-implementation-guide-15feb19-en.pdf
- https://www.icann.org/en/system/files/files/rdap-response-profile-15feb19-en.pdf
Note that, right now, even though it is an ICANN requirement, you will find a lot of missing or broken RDAP servers among gTLD registries and registrars. You will also find a lot of "deviations" in replies from what the specification would lead you to expect.
I gave full details in various other questions here, so maybe have a look:
- https://stackoverflow.com/a/61877920/6368697
- https://stackoverflow.com/a/48066735/6368697
- https://webmasters.stackexchange.com/a/115605/75842
- https://security.stackexchange.com/a/213854/137710
- https://serverfault.com/a/999095/396475
PS: a philosophical question on "Hoping there is a simple "go here and download this" answer. Hoping...", because a lot of people have hoped for that in the past (see my initial remark at the beginning). Let us imagine you go forward and build this magnificent resource with all the exhaustive details. Would you be inclined to share it with anyone, for free? The answer is probably no, for obvious reasons. The same happened to others who went down the same path before you, which is why there are now various providers offering more or less this service (you would need to find out which formats are parsed, the rate limits, the prices, etc.), but nothing freely available to share.
Now you can just dream or hope that every registry and registrar switches to RDAP AND implements it properly. Then the format problem is solved once and for all. However, those requirements ("every" + "properly") are not small, and may not be met "soon", especially in ccTLDs, where registries are in no way mandated by any external force (except market pressure?) to implement RDAP at all.
QUESTION
I get this error saying
...ANSWER
Answered 2019-Aug-28 at 10:18
Install the dependency with pip: "pip install python-whois"
QUESTION
I need a reliable way to check in Python if a domain of any TLD has been registered or is available. The bold phrases are the key points that I'm struggling with.
What I tried:
- WHOIS is the obvious way to do the check, and an existing Python library like the popular python-whois was my first try. The problem is that it doesn't seem to be able to retrieve information for some TLDs, e.g. .run, while it works mostly fine for older ones, e.g. .com.
- So if python-whois is not reliable, maybe a plain wrapper around Linux's whois would be better. I tried the whois library, and unfortunately it supports only a limited set of TLDs, apparently to make sure it can always parse the results.
As I don't really need to parse the results, I ripped the code out of the whois library and tried to do the query by calling Linux's whois myself:
...
ANSWER
Answered 2018-Jan-04 at 15:38
If you do not have specific access (like being a registrar), and if you do not target a specific TLD (some TLDs do have a specific public service called domain availability), the only tool that makes sense is to query whois servers.
You have then at least the following two problems:
- use the appropriate whois server based on the given domain name
- take into account that whois servers are rate-limited, so if you bulk-query them without care you will first hit delays and then even risk having your IP blacklisted for some time.
For the second point the usual methods apply (handling delays on your side, using multiple endpoints, etc.)
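"Handling delays on your side" can be as simple as enforcing a minimum interval between two queries to the same server. The sketch below is one possible approach; the class name `Throttle` and the interval value are made up for illustration, and real limits vary per registry and are usually not published.

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive queries to the same server."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        # clock and sleep are injectable so the class can be tested without waiting.
        self._clock = clock
        self._sleep = sleep
        self._last = {}  # server name -> time of the last query

    def wait(self, server):
        """Block until it is acceptable to query `server` again."""
        now = self._clock()
        last = self._last.get(server)
        if last is not None:
            remaining = self.min_interval - (now - last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last[server] = now


# Usage: call wait() right before each query to a given server.
throttle = Throttle(min_interval=2.0)  # 2 seconds is an arbitrary example value
throttle.wait("whois.example")  # first call for a server returns immediately
```

Tracking the last query time per server, rather than globally, lets queries to different registries proceed without slowing each other down.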
For the first point, another of my replies, https://unix.stackexchange.com/a/407030/211833, explains what you observe depending on which wrapper around whois you use, along with some countermeasures. See also my other reply here: https://webmasters.stackexchange.com/a/111639/75842 and specifically point 2.
Note that, depending on your specific requirements and whether you are able to change at least part of them, you may have other solutions. For example, for gTLDs, if you can tolerate a 24-hour delay, you may use the zone files published by registries to find registered domain names (only those that are published, so not all of them).
Also, while you are right in a generic sense that using a third party has its weaknesses, if you find a worthy registrar that both has access to many registries and provides you with an API, you could use it for your needs.
In short, I do not believe you can achieve this task with all cases (100% reliability, 100% TLDs, etc.). You will need some compromises but they depend on your initial needs.
Also very important: do not shell out to run a whois command, as this will create many security and performance problems. Use the appropriate libraries from your programming language to do whois queries, or just open a TCP socket on port 43, send your query on one line terminated by CR+LF, and read back a blob of text. That is basically all that RFC 3912 defines.
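A minimal sketch of that raw exchange with Python's standard `socket` module follows. The function name `whois_query` is an assumption, and choosing the right server for a given domain (the first problem above) is left to the caller. Because the protocol defines no charset, the UTF-8 encoding and lenient decoding here are also assumptions.

```python
import socket


def whois_query(query, server, port=43, timeout=10.0):
    """Send one whois query and return the raw text of the reply."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        # The request is a single line terminated by CR+LF.
        sock.sendall(query.encode("utf-8") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                # The server signals the end of the reply by closing the connection.
                break
            chunks.append(data)
    # No charset is defined by the protocol, so decode leniently.
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Usage would look like `whois_query("example.com", server)`, with `server` set to the whois host responsible for the TLD. The result is an unparsed blob of text, which is enough when the goal is only to check availability rather than extract fields.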
QUESTION
I have a function that looks like this, it looks up the domain on who.is when given a url:
...ANSWER
Answered 2017-May-13 at 13:58
Apparently, you're using python-whois.
Look at the example. You can get all the data in a structured form, rather than a text you'd need to parse:
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install python-whois
You can use python-whois like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.