richardpenman / whois

License: MIT License

Python 87.43% DIGITAL Command Language 9.12% Cool 0.08% Crystal 0.97% Stata 1.24% NewLisp 0.43% Ruby 0.20% Perl 0.27% Common Lisp 0.14% Raku 0.13%

whois's Introduction

Goal

  • Create a simple importable Python module which will produce parsed WHOIS data for a given domain.
  • Able to extract data for all the popular TLDs (com, org, net, ...)
  • Query a WHOIS server directly instead of going through an intermediate web service like many others do.
  • Works with Python 2 & 3

Example

>>> import whois
>>> w = whois.whois('example.com')
>>> w.expiration_date  # dates converted to datetime object
datetime.datetime(2022, 8, 13, 4, 0)
>>> w.text  # the content downloaded from whois server
u'\nDomain Name: EXAMPLE.COM\nRegistry Domain ID: 2336799_DOMAIN_COM-VRSN\n...'

>>> print(w)  # print values of all found attributes
{
    "creation_date": "1995-08-14 04:00:00",
    "expiration_date": "2022-08-13 04:00:00",
    "updated_date": "2021-08-14 07:01:44",
    "domain_name": "EXAMPLE.COM",
    "name_servers": [
        "A.IANA-SERVERS.NET",
        "B.IANA-SERVERS.NET"
    ],
...

Install

Install from pypi:

$ pip install python-whois

Or check out the latest version from the repository:

$ git clone git@github.com:richardpenman/whois.git

Note that you will then need to manually install the futures module, which allows supporting both Python 2 & 3:

$ pip install futures

Run test cases for python 2 & 3:

$ python -m unittest discover test
.............
----------------------------------------------------------------------
Ran 13 tests in 0.812s

OK

$ python3 -m unittest discover test
.............
----------------------------------------------------------------------
Ran 13 tests in 1.431s

OK

SOCKS Proxy support requirements:

$ pip install PySocks
$ export SOCKS=socksproxy.someplace.com:8080

Problems?

Pull requests are welcome!

Thanks to the many who have sent patches for additional TLDs. If you want to add or fix a TLD it's quite straightforward. See example domains in whois/parser.py

Basically each TLD has a similar format to the following:

class WhoisOrg(WhoisEntry):
    """Whois parser for .org domains
    """
    regex = {
        'domain_name':      'Domain Name: *(.+)',
        'registrar':        'Registrar: *(.+)',
        'whois_server':     'Whois Server: *(.+)',
        ...
    }

    def __init__(self, domain, text):
        if text.strip() == 'NOT FOUND':
            raise PywhoisError(text)
        else:
            WhoisEntry.__init__(self, domain, text)
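To make the role of the regex dictionary concrete, here is a minimal, self-contained sketch of how such patterns might be applied to a WHOIS response. It is not the code in whois/parser.py: the parse_fields helper and the sample text are assumptions made for illustration, and only the two patterns are taken from the example above.

```python
import re

# Patterns copied from the WhoisOrg example above.
REGEX = {
    'domain_name': r'Domain Name: *(.+)',
    'registrar': r'Registrar: *(.+)',
}


def parse_fields(text, regex=REGEX):
    """Hypothetical helper: apply each pattern and collect the captured values.

    A single match is returned as a string, several matches as a list,
    and a missing field as None.
    """
    result = {}
    for field, pattern in regex.items():
        values = [v.strip() for v in re.findall(pattern, text, re.IGNORECASE)]
        if not values:
            result[field] = None
        elif len(values) == 1:
            result[field] = values[0]
        else:
            result[field] = values
    return result


# Made-up response text, for illustration only.
SAMPLE = "Domain Name: EXAMPLE.ORG\nRegistrar: Example Registrar, Inc.\n"
parsed = parse_fields(SAMPLE)
print(parsed)
```

Returning a list when a pattern matches more than once is one possible design; it would also explain why some fields come back as a single value for one domain and as a list for another.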

whois's People

Contributors

abhint, apocalyptech, augustin-fl, bochecha, chesnovskii, clan, creffett, dfeinzeig, droe, ev01ing, gremur, gronke, hardenchant, kylejohnson, mhtr, mmurphy-studentbridge, mzpqnxow, onurgule, pcanterino, pipozzz, pnmartinez, rensoliemans, rez0n, rhooper, richardpenman, sercanbayrambey, sk-rama, smpatil, vivekhub, willymwai

whois's Issues

Some .io domains return "Error trying to connect to socket: closing socket" but still work

Hi,

I'm evaluating this component and it does work, but every query for some specific .io domains prints "Error trying to connect to socket: closing socket" even though the whois information is still returned.

The DNS lookup for io returns whois.afilias.net, and if I run whois.py -h whois.afilias.net .io I get the result without any errors.

Note that google.io does not behave like this.

Is the registrar involved somehow, so that different domains behave differently?

I can provide the domain name in a PM if needed.

Best regards Johan

.ua domains might erroneously end up with two creation dates

One of the tests in the test suite is broken: the expectation for the creation_date field is to have a list of two dates.

This is actually a bug in the behaviour of the .ua parser: we're getting the correct creation date, but we're also grabbing the creation_date of one of the contacts in the response.

Here's an example of this happening. In the whois response, we should only be expecting to see datetime.datetime(2002, 12, 4, 0, 0) as a response, but there is an additional one:

In [1]: import whois

In [2]: whois.whois("google.com.ua")
Out[2]: 
{'domain_name': 'google.com.ua',
 'status': ['clientDeleteProhibited',
  'clientTransferProhibited',
  'clientUpdateProhibited',
  'ok',
  'linked'],
 'registrar': 'MarkMonitor Inc.',
 'registrar_name': 'ua.markmonitor',
 'registrar_url': 'http://markmonitor.com',
 'registrar_country': 'US',
 'registrar_city': 'Meridian, Idaho',
 'registrar_address': 'US 83642 Meridian, Idaho 2150 S. Bonito Way, Suite 150',
 'registrar_email': '[email protected]',
 'registrant_name': 'Google LLC',
 'registrant_country': 'US',
 'registrant_city': 'Mountain View',
 'registrant_state': 'CA',
 'registrant_address': '1600 Amphitheatre Parkway',
 'registrant_email': '[email protected]',
 'registrant_postal_code': '94043',
 'registrant_phone': '+1.6502530000',
 'registrant_fax': '+1.6502530001',
 'admin': 'Google LLC',
 'admin_country': 'US',
 'admin_city': 'Mountain View',
 'admin_state': 'CA',
 'admin_address': '1600 Amphitheatre Parkway',
 'admin_email': '[email protected]',
 'admin_postal_code': '94043',
 'admin_phone': '+1.6502530000',
 'admin_fax': '+1.6502530001',
 'updated_date': datetime.datetime(2020, 12, 11, 1, 7, 19),
 'creation_date': [datetime.datetime(2002, 12, 4, 0, 0),
  datetime.datetime(2018, 2, 27, 21, 7, 26)],
 'expiration_date': datetime.datetime(2021, 12, 4, 0, 0),
 'name_servers': ['ns1.google.com',
  'ns2.google.com',
  'ns3.google.com',
  'ns4.google.com']}

The regexp for creation_date in WhoisUA needs to be made stricter so that it only catches the first occurrence of ^created:. However, I tried the following and it didn't work. From what I understand, WhoisEntry.parse() ends up using re.findall(), which makes the negative lookbehind useless.

'creation_date':                  r'(?<!Registrant:)created: +(.+)',

I'm stumped about how to fix this.
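The problem can be reproduced in isolation with the sketch below. It is only an illustration: the sample response is invented and shortened, and using re.search with a line-anchored pattern is a suggestion, not what WhoisEntry.parse() currently does. re.findall returns every matching line, whereas re.search stops at the first one. A lookbehind such as (?<!Registrant:) only inspects the characters immediately before the match, so it cannot exclude a created: line that appears several lines after a section header.

```python
import re

# Invented, shortened response: one "created:" line for the domain
# and a second one that belongs to a contact record.
SAMPLE = """\
domain:           example.com.ua
created:          2002-12-04 00:00:00+02
% Registrant:
person:           Example Person
created:          2018-02-27 21:07:26+02
"""

# Anchor the pattern to the start of a line.
PATTERN = re.compile(r'^created: +(.+)$', re.MULTILINE)

# findall collects every matching line, including the contact's.
all_matches = PATTERN.findall(SAMPLE)

# search stops at the first match, which is the domain's creation date here.
first_match = PATTERN.search(SAMPLE)
first_created = first_match.group(1) if first_match else None

print(all_matches)
print(first_created)
```

This relies on the domain's created: line appearing before any contact record, which is an assumption about the response layout.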

Error for many TLDs

I was looping over all the TLDs listed in the IANA TLD list. It fails for many TLDs with the following error.

>>> w = whois.whois("domainname.aarp")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/whois/__init__.py", line 43, in whois
    text = nic_client.whois_lookup(None, domain.encode('idna'), flags)
  File "/usr/local/lib/python3.8/dist-packages/whois/whois.py", line 264, in whois_lookup
    result = self.whois(query_arg, nichost, flags)
  File "/usr/local/lib/python3.8/dist-packages/whois/whois.py", line 142, in whois
    s.connect((hostname, 43))
socket.gaierror: [Errno -5] No address associated with hostname
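Until the library handles this itself, a caller looping over many TLDs could check whether the WHOIS host resolves before querying. The sketch below is a hypothetical pre-check, not part of python-whois. It assumes the <tld>.whois-servers.net naming that appears in other reports on this page. The resolver parameter is an invented injection point so the function can be tried without network access.

```python
import socket


def resolve_whois_host(hostname, resolver=socket.getaddrinfo):
    """Return address info for hostname on port 43, or None if DNS resolution fails.

    `resolver` is a hypothetical injection point so the function can be
    exercised without network access.
    """
    try:
        return resolver(hostname, 43)
    except socket.gaierror:
        return None


def tlds_without_whois_host(tlds, resolver=socket.getaddrinfo):
    """Return the TLDs whose assumed <tld>.whois-servers.net host does not resolve."""
    missing = []
    for tld in tlds:
        if resolve_whois_host(f'{tld}.whois-servers.net', resolver) is None:
            missing.append(tld)
    return missing
```

Calling tlds_without_whois_host(['com', 'aarp']) with the default resolver performs real DNS lookups, so results depend on the network the code runs on.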

Extract "Raw Whois Data" only

Is there any way to extract only the "Raw Whois Data" part?
There are a lot of fields in that part that are not included in the parsed result, such as "Registrar Abuse Contact Email", which I would love to be able to access.
(I'm trying to convert the entire Raw Whois Data to an Excel sheet, and this would be a nice way to do it.)
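The README example near the top of this page shows that the raw response is available as w.text. As a rough sketch of how that text could be turned into spreadsheet rows, the hypothetical helper below splits each "Key: Value" line. The sample text is invented, and real responses contain comments and multi-line values that this simple split does not handle.

```python
import csv
import io


def raw_whois_to_rows(text):
    """Split "Key: Value" lines of a raw WHOIS response into (key, value) pairs."""
    rows = []
    for line in text.splitlines():
        key, sep, value = line.partition(':')
        # Skip blank lines, lines without a colon, and lines with an empty value.
        if sep and key.strip() and value.strip():
            rows.append((key.strip(), value.strip()))
    return rows


# Invented sample; in practice this would be w.text from whois.whois(...).
SAMPLE = (
    "Domain Name: EXAMPLE.COM\n"
    "Registrar Abuse Contact Email: abuse@example.com\n"
    "\n"
)

rows = raw_whois_to_rows(SAMPLE)

# Write the pairs as CSV, which Excel can open directly.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
print(buffer.getvalue())
```

Splitting on the first colon only keeps values such as URLs intact.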

DeprecationWarning from implicitly required module imp

The from future import standard_library statement, which is widely used in pywhois, triggers an import imp in the future module, which in my test setup results in a DeprecationWarning (my setup: pytest, Python 3.7.0, pytest-5.3.5, py-1.8.1, pluggy-0.13.1). I would like to use pywhois in my project, but the warning is undesirable...

I don't see a simple fix for this (if there is one, I'd like pointers on how to contribute it!), so I'm raising the question instead: is it about time to drop support for py2 and thereby make it possible to remove future as a dependency of pywhois?

From my point of view, py2 is dead and gone, so those who cling to it should not expect to get updates in libraries like pywhois.

FR: type annotations

Please add mypy type annotations. It would require dropping Python 2 support.

I can open a PR adding them if you accept this feature.

No parser for sg TLD

I've made a parser for the .sg TLD; it is waiting for review and merge, if possible.
I've also updated the whois server domain name for the .de TLD to a more informative one.
Please check #59

Add ability to specify path to whois executable

In a number of circumstances (in particular on Windows), it is advantageous to be able to specify an explicit executable path for the primary entrypoint, instead of just calling "whois".

This change would only need to take place in the __init__.whois function: it can accept a path to the whois executable, with a default argument of "whois":

def whois(url, command=False, flags=0, executable="whois"):
    # clean domain to expose netloc
    ip_match = IPV4_OR_V6.match(url)
    if ip_match:
        domain = url
        try:
            result = socket.gethostbyaddr(url)
        except socket.herror as e:
            pass
        else:
            domain = extract_domain(result[0])
    else:
        domain = extract_domain(url)
    if command:
        # try native whois command
        r = subprocess.Popen([executable, domain], stdout=subprocess.PIPE)
        text = r.stdout.read().decode()
    else:
        # try builtin client
        nic_client = NICClient()
        text = nic_client.whois_lookup(None, domain.encode('idna'), flags)
    return WhoisEntry.load(domain, text)

The alternative on windows is to download, extract and install the binary then modify the path variable to explicitly include it before you can use it.

More granular exceptions

It would be nice if the exceptions were more granular than the current PywhoisError - the code I'm working with would benefit from knowing more specifically what's gone wrong with a whois lookup. From looking at the code, I see three distinct classes of exception: TLD doesn't have a whois server (the exception in class WhoisEntry), invalid input (the exception in def load if we get "no whois server is known for this kind of object"), and domain couldn't be found in whois (the exception in the WhoisEntry subclasses). I'm thinking the best approach would be to have three subclasses of PywhoisError; that way, existing usage won't change behavior and it gives users an easy way to handle "I want to catch whois errors, I don't care what the meaning of the error is" cases but also lets people handle specific errors if they care. If this sounds like something you would like to add, let me know and I can prepare a patch.
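The proposed hierarchy could look roughly like the sketch below. The three subclass names are invented for illustration, and PywhoisError is redefined here only as a stand-in so the example runs on its own.

```python
class PywhoisError(Exception):
    """Stand-in for the library's existing base exception."""


class UnknownTldError(PywhoisError):
    """Hypothetical: the TLD has no known WHOIS server."""


class InvalidInputError(PywhoisError):
    """Hypothetical: the query is not something WHOIS can look up."""


class DomainNotFoundError(PywhoisError):
    """Hypothetical: the WHOIS server reports no match for the domain."""


def describe(exc):
    """Show how a caller could branch on the specific subclass."""
    if isinstance(exc, DomainNotFoundError):
        return 'domain not registered'
    if isinstance(exc, UnknownTldError):
        return 'no whois server for this TLD'
    if isinstance(exc, InvalidInputError):
        return 'invalid input'
    return 'other whois error'


# Existing code that catches the base class keeps working unchanged.
try:
    raise DomainNotFoundError('NOT FOUND')
except PywhoisError as exc:
    outcome = describe(exc)

print(outcome)
```

Because every new class derives from PywhoisError, callers that already use except PywhoisError would not need to change.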

Make postal code and state/province consistent

There are two fields which are used inconsistently but where it isn't trivial to choose.

Postal Code

The two fields used are:
registrant_postal_code, for example in WhoisClub, and
zipcode, for example in WhoisInfo.

I think that the .info zipcode field should be registrant_zipcode but that is a change for later.
For this issue, should we make postal_code the standard? It seems to be the field most used for this purpose.

State/Province

The fields used are:
registrant_state_province, e.g. in WhoisBiz, and
state, e.g. in WhoisInfo.

Again, the .info state field should probably be registrant_state, but that isn't the point of this issue.
What field should be the standard?

socket.gaierror: [Errno -2] Name or service not known on multiple TLDs

I'm receiving 'socket.gaierror: [Errno -2] Name or service not known' on multiple TLDs, but not every domain within the TLDs.

For example, webkit.org:
>>> import whois
>>> whois.whois('webkit.org')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/dist-packages/whois/__init__.py", line 40, in whois
    text = nic_client.whois_lookup(None, domain.encode('idna'), 0)
  File "/usr/local/lib/python3.7/dist-packages/whois/whois.py", line 204, in whois_lookup
    result = self.whois(query_arg, nichost, flags)
  File "/usr/local/lib/python3.7/dist-packages/whois/whois.py", line 145, in whois
    response += self.whois(query, nhost, 0)
  File "/usr/local/lib/python3.7/dist-packages/whois/whois.py", line 114, in whois
    s.connect((hostname, 43))
socket.gaierror: [Errno -2] Name or service not known

Not every .org though, because google.org works fine, as does a non-existent .org domain.

I've also noticed this on some .com.mx domains, for example test.com.mx:
>>> whois.whois('test.com.mx')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/dist-packages/whois/__init__.py", line 40, in whois
    text = nic_client.whois_lookup(None, domain.encode('idna'), 0)
  File "/usr/local/lib/python3.7/dist-packages/whois/whois.py", line 204, in whois_lookup
    result = self.whois(query_arg, nichost, flags)
  File "/usr/local/lib/python3.7/dist-packages/whois/whois.py", line 114, in whois
    s.connect((hostname, 43))
socket.gaierror: [Errno -2] Name or service not known

I've tried this on multiple different machines and VMs and receive consistent results. I'm using Python3 and Linux kernel 4.15.

Lack of consistency in WHOIS results

Issue description

Performing the same WHOIS lookup repeatedly can randomly return either a successful result or a PywhoisError ("No match") for the same domain.

Results

I ran a consistency study over a list of problematic domains: 25 WHOIS runs for each of the same 23 domains. The results can be seen in the accompanying spreadsheet, or posted in the comments.

Code used

import csv

import whois

domains_to_check = ['campingfriends', 'Getawaytrips', 'bedtrip', 'Roadtrip', 'campingvacation', 'parkgroup', 'campingtrip', 'Destinationsvacation', 'everyoneland', 'groupfriends', 'camperspot', 'landvacation', 'landGetaway', 'tripsgroup', 'friendsspot', 'roadcamper', 'campingspot', 'rentingvan', 'bedvans', 'tentspot', 'tripDestinations', 'tripvan', 'caravanDestinations']

# One row per domain: [domain, run1, run2, ...]
lookup = [[word] for word in domains_to_check]

for n in range(25):  # 25 runs of the same WHOIS lookups
    print(n)
    for row in lookup:
        try:
            whois.whois(row[0] + '.com')  # Successful WHOIS: the domain is not available
            w = 'not available'
        except Exception:
            # Unsuccessful WHOIS ("No match" error): treated as domain available.
            # Note that this also swallows network errors.
            w = 'available'
        row.append(w)

# Write all rows to CSV, opening the file once
with open('looping_error_output.csv', 'a', newline='') as file:
    writer = csv.writer(file)
    writer.writerows(lookup)
print(lookup)

whois.extract_domain doesn't handle second-level-only TLDs correctly

If a domain doesn't have any top-level TLDs in the public suffix list (see, for example, the .za TLD - there is no .za, only .co.za, .ac.za, etc.), then whois.extract_domain returns an incorrect result. For example, whois.extract_domain('google.co.za') returns 'za' instead of 'google.co.za'. Patch coming shortly.
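The expected behaviour can be sketched with a longest-suffix match. This is not the library's extract_domain or the pending patch: the helper name and the four-entry suffix set are assumptions for illustration, whereas the real public suffix list is far larger.

```python
# Tiny, hand-picked subset of suffixes for illustration; the real list is much larger.
SUFFIXES = {'com', 'org', 'co.za', 'ac.za'}


def extract_registered_domain(hostname, suffixes=SUFFIXES):
    """Return the registrable domain: the longest known suffix plus one label.

    Hypothetical helper, not the library's extract_domain.
    """
    labels = hostname.lower().rstrip('.').split('.')
    # Try the longest candidate suffix first so 'co.za' wins over a bare 'za'.
    for i in range(1, len(labels)):
        candidate = '.'.join(labels[i:])
        if candidate in suffixes:
            return '.'.join(labels[i - 1:])
    # No known suffix: fall back to the last two labels.
    return '.'.join(labels[-2:])


print(extract_registered_domain('google.co.za'))
print(extract_registered_domain('www.example.com'))
```

Because the loop starts with the longest candidate, a TLD that only has second-level suffixes, such as .za, never needs a bare 'za' entry in the set.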

Issues with South African domains

Hello- I recently started using your project to collect domain registrant information for a large number of domains spanning many countries. Among them are domains in South Africa. I am having problems with queries for them- the behavior I am getting is a failure to connect to what it thinks is the correct server for ZA domains. The connection times out. It's not an issue with a network block or rate limit as I've tried on a few different servers- it's simply receiving bad info back when it tries to find the correct whois server for .co.za it seems.

I haven't looked too much at the code (though I'm happy to) but I did take a look at what was going on with tcpdump and strace because the Debian whois client does OK with these domains. Here's what python-whois behavior looks like:

  • When it starts up, the DNS query goes out for za.whois-servers.net
  • This returns a CNAME to whois.nic.za, which resolves to 196.29.59.37
  • The connection to the whois service @ 196.29.59.37 on TCP/43 then hangs until timing out- it never establishes a connection, it's just SYN-SENT

When I use the native whois client that comes with Debian 10, I see that it uses a DNS query to whois.registry.net.za to find coza-whois.dns.net.za, which it then successfully queries on TCP/43

I'm more familiar with RDAP than WHOIS so I don't know if you should be using a different "bootstrap" DNS name (if that's the term here?) to find the South Africa whois service or what. I just know it is not working and that this is the behavior.

Any thoughts on this? I haven't found issues with any other countries (yet) but if I do I will add them to this issue if you're interested in trying to address them. If you don't have the time but will take PRs, I will try to make time to do it myself, but I would like to hear what you have to say about it before spending any time on it.

Thanks, great project, appreciate your work

EDIT: Example domain: bounceking.co.za

Error trying to connect to socket: closing socket

Hi,
I found the following error during use. Below is my debug information.
My VPS is in Los Angeles, USA.
"""
Error trying to connect to live.whois-servers.net socket: closing socket
Error trying to connect to space.whois-servers.net socket: closing socket
Error trying to connect to store.whois-servers.net socket: closing socket
Error trying to connect to vip.whois-servers.net socket: closing socket
Error trying to connect to website.whois-servers.net socket: closing socket
Error trying to connect to cn.whois-servers.net socket: closing socket
Error trying to connect to whois.aliyun.com socket: closing socket
"""

Issue with .si domains

I have issues with two .si domains (leanpay.si and kupinaobroke.si).

When trying to use python-whois via gcloud, I get "Error trying to connect to socket: closing socket", and all fields return as null ("expiration_date": null, etc).

Is there a workaround for this, or can this be added perhaps?

Use custom whois server option

Hi,

I don't seem to be able to extract information from a referred query sent to a secondary whois server identified in whois_server response parameter. Is it possible to either expose an option to query a specific whois server directly or have a recursion option for the code to follow such redirect?

Something like the native tool does:
whois -h whois.corporatedomains.com XX.com

Thank you.

PyPi Release Behind But??

It seems that what is available on PyPI (0.7.1) is months behind this repository, which is at 0.7.0 but has recent commits? (I got here from the PyPI -> Bitbucket -> Migrated Repo link.)

There is also a subtle name difference:
pywhois (here)
vs python-whois (PyPI)

WhoisJp's creation_date field has multiple possible starting tags

It looks like the creation_date field from Japanese WHOIS servers can sometimes start with [Created on], but python-whois expects [Registered Date]. Example query and results from running whois on the command line:

whois news24.jp
[ JPRS database provides information on network administration. Its use is    ]
[ restricted to network administration purposes. For further information,     ]
[ use 'whois -h whois.jprs.jp help'. To suppress Japanese output, add'/e'     ]
[ at the end of command, e.g. 'whois -h whois.jprs.jp xxx/e'.                 ]

Domain Information:
[Domain Name]                   NEWS24.JP

[Registrant]                    Forecast Communications Inc.

[Name Server]                   ns-1420.awsdns-49.org
[Name Server]                   ns-1859.awsdns-40.co.uk
[Name Server]                   ns-77.awsdns-09.com
[Name Server]                   ns-915.awsdns-50.net
[Signing Key]

[Created on]                    2005/11/16
[Expires on]                    2020/11/30
[Status]                        Active
[Last Updated]                  2020/03/25 12:00:17 (JST)

Contact Information:
[Name]                          Forecast Communications Inc.
[Email]                         [email protected]
[Web Page]                      https://www.4cast.co.jp/
[Postal code]                   105-0021
[Postal Address]                Tokyo
                                Minato-ku
                                imaasa Building 3F, 1-1-21 Higashi-shimbashi
[Phone]                         03-6215-6322
[Fax]                           03-6215-6350

I'll try to prepare a patch for this in the coming week.
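One possible shape for such a fix is a single pattern that accepts either tag. The sketch below is an assumption about how that could look, not the WhoisJp parser's actual regex. The first sample line is taken from the output above, and the second is invented to show the other tag.

```python
import re

# Accept either tag; "Created on" and "Registered Date" are the two named in this report.
CREATION_DATE = re.compile(r'^\[(?:Created on|Registered Date)\]\s*(.+)$', re.MULTILINE)

SAMPLE_A = "[Domain Name]                   NEWS24.JP\n[Created on]                    2005/11/16\n"
# Second sample is invented to show the other tag.
SAMPLE_B = "[Domain Name]                   EXAMPLE.CO.JP\n[Registered Date]               2005/11/16\n"


def creation_date(text):
    """Return the creation date string from either tag, or None if neither is present."""
    match = CREATION_DATE.search(text)
    return match.group(1).strip() if match else None


print(creation_date(SAMPLE_A))
print(creation_date(SAMPLE_B))
```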

com.hk issue

Hey!
Thanks for the library.
It seems like it throws an exception even though it got a good response

<class 'tuple'>: (<class 'whois.parser.PywhoisError'>, PywhoisError(" \n -------------------------------------------------------------------------------\n Whois server by HKIRC\n -------------------------------------------------------------------------------\n .hk top level Domain names can be registered via HKIRC-Accredited Registrars. \n Go to https://www.hkirc.hk/content.jsp?id=280 for details. \n -------------------------------------------------------------------------------\n\n\n\nDomain Name: GOOGLE.COM.HK \n\nDomain Status: Active \n\nDNSSEC: unsigned \n\nContract Version: Refer to registrar \n\nActive variants\n\nInactive variants\n\nRegistrar Name: MARKMONITOR INC.\n\nRegistrar Contact Information: Email: [email protected]\n\nReseller: \n\n\nRegistrant Contact Information:\n\nCompany English Name (It should be the same as the registered/corporation name on your Business Register Certificate or relevant documents): HONG KONG INTERNET HOLDING LIMITED\nCompany Chinese name: \nAddress: \nCountry: Hong Kong (HK)\nEmail: Redacted for Privacy Purposes \nDomain Name Commencement Date: 14-07-2001\nExpiry Date: 20-11-2020 \nRe-registration Status: Complete \n\n\n\nAdministrative Contact Information:\n\nGiven name: \nFamily name: \nCompany name: \nAddress: \nCountry: \nPhone: \nFax: \nEmail: Redacted for Privacy Purposes \nAccount Name: \n\n\n\nTechnical Contact Information:\n\nGiven name: \nFamily name: \nCompany name: \nAddress: \nCountry: \nPhone: \nFax: \nEmail: Redacted for Privacy Purposes \n\n\n\nName Servers Information:\n\nNS1.GOOGLE.COM\nNS2.GOOGLE.COM\nNS3.GOOGLE.COM\nNS4.GOOGLE.COM\n\n\n\nStatus Information:\n\nDomain Prohibit Status: \n\nIf you have any doubt on the Registrant Contact Information, please feel free to contact the relevant registrar for details. 
Please accept and agree that the registrar may pass your contact information to the relevant domain contact therefore the domain contact can decide how they will follow.\n\n\n -------------------------------------------------------------------------------\n The Registry contains ONLY .com.hk, .net.hk, .edu.hk, .org.hk,\n .gov.hk, idv.hk. and .hk $domains.\n -------------------------------------------------------------------------------\n\nWHOIS Terms of Use \nBy using this WHOIS search enquiry service you agree to these terms of use.\nThe data in HKDNR's WHOIS search engine is for information purposes only and HKDNR does not guarantee the accuracy of the data. The data is provided to assist people to obtain information about the registration record of domain names registered by HKDNR. You agree to use the data for lawful purposes only.\n\nYou are not authorised to use high-volume, electronic or automated processes to access, query or harvest data from this WHOIS search enquiry service.\n\nYou agree that you will not and will not allow anyone else to:\n\na. use the data for mass unsolicited commercial advertising of any sort via any medium including telephone, email or fax; or\n\nb. enable high volume, automated or electronic processes that apply to HKDNR or its computer systems including the WHOIS search enquiry service; or\n\nc. without the prior written consent of HKDNR compile, repackage, disseminate, disclose to any third party or use the data for a purpose other than obtaining information about a domain name registration record; or\n\nd. 
use such data to derive an economic benefit for yourself.\n\nHKDNR in its sole discretion may terminate your access to the WHOIS search enquiry service (including, without limitation, blocking your IP address) at any time including, without limitation, for excessive use of the WHOIS search enquiry service.\n\nHKDNR may modify these terms of use at any time by publishing the modified terms of use on its website.\n\n\n\n\n\n\n"), <traceback object at 0x000002918DDD3808>)

Timeout not captured from code

Hi,

While requesting whois information I get timeout from whois server as shown in response written to screen:
"Error trying to connect to socket: closing socket"

There is no error returned (no exception) and I get an empty result. That makes it tricky to tell whether it was a network error (as here) or whether the server returned empty values for certain fields. Ideally I would expect an exception, just as when a name is not found (which raises an exception).
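The behaviour asked for here could look something like the following sketch, which raises on any network failure instead of printing a message. It is a standalone illustration, not the library's NICClient. WhoisConnectionError, query_whois, and the connect parameter are invented names. Port 43 and the IDNA encoding follow the tracebacks shown elsewhere on this page.

```python
import socket


class WhoisConnectionError(Exception):
    """Hypothetical exception for network-level failures (not part of python-whois)."""


def query_whois(host, query, timeout=10.0, connect=socket.create_connection):
    """Send query to host on TCP port 43 and return the decoded response.

    Raises WhoisConnectionError on DNS failure, refusal, or timeout.
    `connect` is a hypothetical injection point for testing.
    """
    try:
        with connect((host, 43), timeout=timeout) as sock:
            sock.sendall(query.encode('idna') + b'\r\n')
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:  # server closed the connection: response complete
                    break
                chunks.append(data)
    except OSError as exc:
        # socket.timeout and socket.gaierror are both subclasses of OSError.
        raise WhoisConnectionError(f'{host}: {exc}') from exc
    return b''.join(chunks).decode('utf-8', errors='replace')
```

A caller can then distinguish "the lookup failed" from "the server answered but some fields were empty" by catching WhoisConnectionError. The timeout argument also makes the wait configurable per call.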

Two dates in .expiration_date

Can anyone tell me why sometimes the expiration_date brings me a list with two dates (examples: google.com, amazon.com, paypal.com, microsoft.com) and sometimes brings me only one datetime?
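Whatever the cause, calling code can defend against both shapes. The helpers below are a hypothetical sketch, not part of python-whois, and the example dates are illustrative. They take the first element rather than min(), because a list may mix naive and timezone-aware datetimes, as in another report on this page, and comparing those raises TypeError.

```python
from datetime import datetime


def as_list(value):
    """Return value as a list: None -> [], a single item -> [item], a list unchanged."""
    if value is None:
        return []
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


def first_date(value):
    """Return the first date of a field that may hold one datetime, a list, or None."""
    dates = as_list(value)
    return dates[0] if dates else None


single = datetime(2021, 12, 4, 0, 0)
several = [datetime(2020, 9, 14, 4, 0), datetime(2020, 9, 13, 21, 0)]

print(first_date(single))
print(first_date(several))
```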

Thanks! :-)

Make date fields more consistent

There are some inconsistencies with date fields, sometimes the date of creation is creation_date, and sometimes created.

For example:

>>> whois.whois('registro.br')
{'domain': 'registro.br', ..., 'created': ['19990221', ...], ... }

I propose that all dates are moved to the format creation_date, expiration_date, etc.
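Until the parsers agree on names, the renaming could be done on the result dictionary. The sketch below is a hypothetical post-processing step: only the created to creation_date mapping comes from the example above, and the other aliases are assumed for illustration.

```python
# Only 'created' -> 'creation_date' comes from the report above;
# the other two aliases are assumed for illustration.
DATE_ALIASES = {
    'created': 'creation_date',
    'expires': 'expiration_date',
    'changed': 'updated_date',
}


def normalize_date_fields(record, aliases=DATE_ALIASES):
    """Return a copy of record with aliased date keys renamed to the standard names."""
    normalized = {}
    for key, value in record.items():
        normalized[aliases.get(key, key)] = value
    return normalized


record = {'domain': 'registro.br', 'created': ['19990221']}
normalized = normalize_date_fields(record)
print(normalized)
```

The same alias-table approach would work for the other naming inconsistencies raised on this page, once a standard name is chosen for each field.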

Proxy usage?

Is there a way to use proxies like in the requests module? If yes, please advise. If not, it would be nice to have.

whois closing socket error

This basic example code doesn't work. Any thoughts?

import whois

def seeAllWHOISdata():
    w = whois.whois('reddit.com')
    print(w)

seeAllWHOISdata()

Error trying to connect to socket: closing socket
{
  "domain_name": null,
  "registrar": null,
  "whois_server": null,
  "referral_url": null,
  "updated_date": null,
  "creation_date": null,
  "expiration_date": null,
  "name_servers": null,
  "status": null,
  "emails": null,
  "dnssec": null,
  "name": null,
  "org": null,
  "address": null,
  "city": null,
  "state": null,
  "zipcode": null,
  "country": null
}

Make field names more uniform

In order to get more consistent results in pywhois, it would be good to have more uniform field names.

The main thing is registrant_name vs registrant.
Some TLDs parse the Registrant Name as registrant and some parse it as registrant_name. Instead of having two field names, which do we want to use? I'd like to edit all regex fields so they're consistent.

Allow passing of additional TLDs to suffix list

Hi there, great parser, I've found it very reliable.

That said, I've run into issues when whois'ing *.nsw.gov.au domains as they have been removed from the public suffix list.

Expected:

>>> whois.whois("transport.nsw.gov.au")
{'domain_name': 'TRANSPORT.NSW.GOV.AU', 'updated_date': datetime.datetime(2021, 2, 17, 0, 1, 27), 'registrar': 'Digital Transformation Agency – NSW', 'status': 'ok https://afilias.com.au/get-au/whois-status-codes#ok', 'registrant_name': 'Transport for NSW', 'registrant_contact_name': 'Steve Stamatellis', 'name_servers': ['NS2.TRANSPORT.NSW.GOV.AU', 'DNS1.OPTUS.NET.AU', 'NS1.TRANSPORT.NSW.GOV.AU', 'DNS0.OPTUS.NET.AU']}

Actual

>>> whois.whois("transport.nsw.gov.au")
{'domain_name': 'NSW.GOV.AU', 'updated_date': datetime.datetime(2021, 1, 4, 21, 51, 52), 'registrar': 'Afilias Australia Pty Ltd', 'status': ['serverDeleteProhibited https://afilias.com.au/get-au/whois-status-codes#serverDeleteProhibited', 'serverTransferProhibited https://afilias.com.au/get-au/whois-status-codes#serverTransferProhibited', 'serverUpdateProhibited https://afilias.com.au/get-au/whois-status-codes#serverUpdateProhibited'], 'registrant_name': 'Government of New South Wales - NSW Department of Customer Service', 'registrant_contact_name': 'NSW Government Domain Administrator', 'name_servers': ['T.AU', 'R.AU', 'Q.AU', 'S.AU']}

You'll find in the public suffix list .dat file it has been removed:

// nsw.gov.au  Bug 547985 - Removed at request of <[email protected]>

The [tldextract](https://pypi.org/project/tldextract/) pip module allows for this by providing an extra_suffixes parameter:

>>> tldextract.extract("transport.nsw.gov.au")
ExtractResult(subdomain='transport', domain='nsw', suffix='gov.au')
>>> patchedExtract = tldextract.TLDExtract(extra_suffixes=["nsw.gov.au"])
>>> patchedExtract("transport.nsw.gov.au")
ExtractResult(subdomain='', domain='transport', suffix='nsw.gov.au')

Is it possible to implement such an option?

Many thanks!

Cannot be used with pyinstaller

When a project that uses whois is converted to an executable, the resulting exe fails because the data/public_suffix_list.dat file cannot be found.

I have two suggestions: either add something to the "readme" telling folks how to add that file to the directives file, or just convert that file to a Python module and include it in the source instead of the data file.

Inconsistent results for google.com

Hello,

I've noticed issues when doing a whois for a Google IP. Here's an example of two subsequent queries from a python3 shell. First one has two domain names, one is lower one is caps, and shows the org of Google. The second one has just caps and the org is None. Not sure if this is an issue with pywhois or with the registrar. Thoughts?

>>> whois.whois('209.85.222.199')
{'domain_name': ['GOOGLE.COM', 'google.com'], 'registrar': 'MarkMonitor, Inc.', 'whois_server': 'whois.markmonitor.com', 'referral_url': None, 'updated_date': [datetime.datetime(2018, 2, 21, 18, 36, 40), datetime.datetime(2018, 2, 21, 10, 45, 7, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=57600)))], 'creation_date': [datetime.datetime(1997, 9, 15, 4, 0), datetime.datetime(1997, 9, 15, 0, 0, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=61200)))], 'expiration_date': [datetime.datetime(2020, 9, 14, 4, 0), datetime.datetime(2020, 9, 13, 21, 0, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=61200)))], 'name_servers': ['NS1.GOOGLE.COM', 'NS2.GOOGLE.COM', 'NS3.GOOGLE.COM', 'NS4.GOOGLE.COM', 'ns4.google.com', 'ns1.google.com', 'ns3.google.com', 'ns2.google.com'], 'status': ['clientDeleteProhibited https://icann.org/epp#clientDeleteProhibited', 'clientTransferProhibited https://icann.org/epp#clientTransferProhibited', 'clientUpdateProhibited https://icann.org/epp#clientUpdateProhibited', 'serverDeleteProhibited https://icann.org/epp#serverDeleteProhibited', 'serverTransferProhibited https://icann.org/epp#serverTransferProhibited', 'serverUpdateProhibited https://icann.org/epp#serverUpdateProhibited', 'clientUpdateProhibited (https://www.icann.org/epp#clientUpdateProhibited)', 'clientTransferProhibited (https://www.icann.org/epp#clientTransferProhibited)', 'clientDeleteProhibited (https://www.icann.org/epp#clientDeleteProhibited)', 'serverUpdateProhibited (https://www.icann.org/epp#serverUpdateProhibited)', 'serverTransferProhibited (https://www.icann.org/epp#serverTransferProhibited)', 'serverDeleteProhibited (https://www.icann.org/epp#serverDeleteProhibited)'], 'emails': ['[email protected]', '[email protected]'], 'dnssec': 'unsigned', 'name': None, 'org': 'Google LLC', 'address': None, 'city': None, 'state': 'CA', 'zipcode': None, 'country': 'US'}
>>> whois.whois('209.85.222.199')
{'domain_name': 'GOOGLE.COM', 'registrar': 'MarkMonitor Inc.', 'whois_server': 'whois.markmonitor.com', 'referral_url': None, 'updated_date': datetime.datetime(2018, 2, 21, 18, 36, 40), 'creation_date': datetime.datetime(1997, 9, 15, 4, 0), 'expiration_date': datetime.datetime(2020, 9, 14, 4, 0), 'name_servers': ['NS1.GOOGLE.COM', 'NS2.GOOGLE.COM', 'NS3.GOOGLE.COM', 'NS4.GOOGLE.COM'], 'status': ['clientDeleteProhibited https://icann.org/epp#clientDeleteProhibited', 'clientTransferProhibited https://icann.org/epp#clientTransferProhibited', 'clientUpdateProhibited https://icann.org/epp#clientUpdateProhibited', 'serverDeleteProhibited https://icann.org/epp#serverDeleteProhibited', 'serverTransferProhibited https://icann.org/epp#serverTransferProhibited', 'serverUpdateProhibited https://icann.org/epp#serverUpdateProhibited'], 'emails': '[email protected]', 'dnssec': 'unsigned', 'name': None, 'org': None, 'address': None, 'city': None, 'state': None, 'zipcode': None, 'country': None}
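Until the cause is known, a caller could at least compare the two results on equal terms. The following is only a sketch: `normalize_names` is a hypothetical helper, not part of the library, and it assumes a field is either None, a single string, or a list of strings, as in the two outputs above.

```python
def normalize_names(value):
    """Return a sorted list of unique lower-cased names from None, a str, or a list."""
    if value is None:
        return []
    if isinstance(value, str):
        value = [value]
    # A set comprehension drops names that differ only by case.
    return sorted({name.lower() for name in value})


# The 'domain_name' values from the two results shown above.
first = ['GOOGLE.COM', 'google.com']
second = 'GOOGLE.COM'

print(normalize_names(first))
print(normalize_names(second))
```

This only hides the difference in shape and case. It does not explain why 'org' is populated in one response and None in the other.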

inconsistency across TLDs with domain_status vs status

Hi there,

Thanks for this lib, parsing whois is quite the horrible task to achieve :(

I'm wondering why only a handful of TLDs get a domain_status field in their results while all others have a status field. This makes checking the value inconsistent, depending on the TLD being queried.

Wouldn't it be better if TLDs that currently have a domain_status field got a one-item array as the value for status instead?
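As a caller-side workaround, both field names can be read through one function. This is a minimal sketch, assuming the result behaves like a dict and that each field is None, a string, or a list of strings. `get_statuses` is a hypothetical helper, not part of the library.

```python
def get_statuses(result):
    """Collect status values from either 'status' or 'domain_status' as a list."""
    statuses = []
    for key in ('status', 'domain_status'):
        value = result.get(key)
        if value is None:
            continue
        if isinstance(value, str):
            # Wrap a single value so callers always receive a list.
            value = [value]
        statuses.extend(value)
    return statuses


print(get_statuses({'domain_status': 'ok'}))
print(get_statuses({'status': ['clientDeleteProhibited', 'clientTransferProhibited']}))
```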

Timeout not configurable

Hi,

When I run a whois query I can't specify how long to wait for the server before timing out. Turkish whois servers seem to be slow to respond. Can a timeout be added as a parameter to the function?
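Until such a parameter exists, the wait can be bounded from the outside using only the standard library. This is a sketch, not a library feature: `call_with_timeout` is a hypothetical wrapper, and it only stops waiting for the call, it does not cancel the lookup running in the worker thread.

```python
import concurrent.futures
import time


def call_with_timeout(func, timeout, *args, **kwargs):
    """Run func in a worker thread and stop waiting after `timeout` seconds."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, *args, **kwargs)
    try:
        # Raises concurrent.futures.TimeoutError if func is still running.
        return future.result(timeout=timeout)
    finally:
        # Do not block on the worker; it finishes in the background.
        executor.shutdown(wait=False)


def slow_lookup(domain):
    """Stand-in for a slow whois query (hypothetical, for demonstration only)."""
    time.sleep(0.5)
    return domain


try:
    call_with_timeout(slow_lookup, 0.05, 'example.com.tr')
except concurrent.futures.TimeoutError:
    print('lookup timed out')
```

In real use, `slow_lookup` would be replaced by `whois.whois`.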

Submit patch for .bz

Support for .bz (Belize) can be added by configuring the host:
BZ_HOST = "whois.afilias-grs.info"

And the parser:
elif domain.endswith('.bz'):
    return WhoisJobs(domain, text)

I tried to push this to a new branch but don't have the access rights.

add support for many TLDs

The following TLDs return no information about the domain registration date:
md
to
at
jp
com.cy
lv
im
vn
goog
gov
de
co.il
рф

Distinguish between not found and other errors

As an enhancement, it would be nice if certain error cases were distinguished. For example, a domain that is not found should raise a different error than an exceeded rate limit.
Currently it just raises a generic PywhoisError.
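One possible shape for this is a small exception hierarchy, so existing `except PywhoisError` handlers keep working. Everything here is an assumption for illustration: the subclass names, the `classify_error` helper, and the marker phrases are hypothetical, and real servers word these messages in many different ways. `PywhoisError` is redefined locally only so the sketch runs on its own.

```python
class PywhoisError(Exception):
    """Stand-in for the library's generic error, redefined so this sketch runs alone."""


class DomainNotFoundError(PywhoisError):
    """Hypothetical: the server reported that the domain does not exist."""


class RateLimitExceededError(PywhoisError):
    """Hypothetical: the server refused the query because of rate limiting."""


# Illustrative phrases only; each registry uses its own wording.
NOT_FOUND_MARKERS = ('no match for', 'not found')
RATE_LIMIT_MARKERS = ('rate limit', 'quota exceeded')


def classify_error(text):
    """Return the most specific exception instance for a raw error response."""
    lowered = text.lower()
    if any(marker in lowered for marker in RATE_LIMIT_MARKERS):
        return RateLimitExceededError(text)
    if any(marker in lowered for marker in NOT_FOUND_MARKERS):
        return DomainNotFoundError(text)
    return PywhoisError(text)


print(type(classify_error('No match for "EXAMPLE-NOPE.COM".')).__name__)
print(type(classify_error('Rate limit exceeded, try again later')).__name__)
```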

EU parser

I think the regex for extracting the registrar in .eu domains is not correct.

~$ whois -h whois.eu emanuelesacco.eu

Domain: emanuelesacco.eu
Script: LATIN

Registrant:
        NOT DISCLOSED!
        Visit www.eurid.eu for webbased WHOIS.

Technical:
        Organisation: Register S.p.A.
        Language: it
        Email: [email protected]

Registrar:
        Name: Register S.p.A.
        Website: https://www.register.it/

Name servers:
        ns1.register.it
        ns2.register.it

Please visit www.eurid.eu for more info.

The current regex is
'registrar': r'Registrar: *Name: *([^\n\r]+)',
but I think it should be
'registrar': r'Registrar: *[\n\r]+\s*Name: *([^\n\r]+)',
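The difference between the two patterns can be checked against the Registrar block of the response shown above. In this sketch the variable names are hypothetical and the sample uses eight spaces of indentation, which may differ from the exact whitespace the server sends.

```python
import re

# Registrar block copied from the whois.eu response above.
sample = (
    "Registrar:\n"
    "        Name: Register S.p.A.\n"
    "        Website: https://www.register.it/\n"
)

current_pattern = r'Registrar: *Name: *([^\n\r]+)'
proposed_pattern = r'Registrar: *[\n\r]+\s*Name: *([^\n\r]+)'

# ' *' matches spaces only, so it cannot cross the line break after 'Registrar:'.
print(re.search(current_pattern, sample))             # None

# The proposed pattern consumes the line break and the indentation first.
print(re.search(proposed_pattern, sample).group(1))   # Register S.p.A.
```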

.nl domains show limited info

As the title says, .nl domains return very little or no information about the domain.
For example, the creation date comes back empty or None, while the same lookup on a .com domain works completely fine.

.tr domain UTF-8 character error

I'm trying to get whois data for a .tr domain. The API returns it, but Turkish characters do not appear. Wherever there is a Turkish character, a question mark in a diamond (�) appears instead.

For example, I am trying to get whois data for "mimotto.com.tr". Result:

{'domain_name': 'mimotto.com.tr', 'creation_date': datetime.datetime(2017, 12, 27, 0, 0), 'expiration_date': datetime.datetime(2020, 12, 26, 0, 0), 'name_servers': 'ns1.interkeyservertr.com\nns2.interkeyservertr.com', 'registrant': 'TAL�H TASARIM DEKORASYON VE T�C.LTD.�T�.\n Ye�iloba Mahallesi 46209 Sokak No:1E/1\n Fuat Ya�murca Sanayi Sitesi / Seyhan\n Adana,\n T�rkiye\n [email protected]\n + 90-322-5030642-\n +', 'admin': None, 'admin_organization': None, 'admin_address': None, 'admin_phone': None, 'admin_fax': None, 'tech': None, 'tech_organization': None, 'tech_address': None, 'tech_phone': None, 'tech_fax': None, 'billing': None, 'billing_organization': None, 'billing_address': None, 'billing_phone': None, 'billing_fax': None}
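This pattern of replacement characters is what appears when bytes in a single-byte Turkish encoding are decoded as UTF-8 with replacement. The sketch below reproduces the effect on one word from the output above. It assumes the server sends ISO-8859-9, which the issue does not confirm.

```python
# Assumption: the server sends ISO-8859-9 (Latin-5, Turkish) bytes.
raw = 'Türkiye'.encode('iso-8859-9')

# Decoding those bytes as UTF-8 with replacement yields U+FFFD for 'ü'.
garbled = raw.decode('utf-8', errors='replace')
print(garbled)    # T�rkiye

# Decoding with the matching codec recovers the original text.
fixed = raw.decode('iso-8859-9')
print(fixed)      # Türkiye
```

If that assumption holds, the fix belongs where the raw socket bytes are decoded, since the original characters cannot be recovered once they have been replaced.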

Can't parse dates on .rs domains

sample code:

import whois
domain = whois.whois("domain.rs")
print(domain.creation_date)

The above code will print 'None', even though the dates are visible in the whois.

Sample of whois response:

Registration date: 01.08.2020 13:11:08
Modification date: 13.09.2021 10:05:55
Expiration date: 01.08.2022 17:19:38

I fixed it locally by adding a WhoisRS class that uses the proper regex, but it would be nice to have this implemented in a future update.
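The local fix itself is not shown, so the following is only a sketch of how the three date lines in the sample response could be parsed. The `parse_rs_dates` function, the field names, and the patterns are assumptions for illustration, not the reporter's WhoisRS class.

```python
import re
from datetime import datetime

# Day.month.year followed by a 24-hour time, as in the sample response.
RS_DATE_FORMAT = '%d.%m.%Y %H:%M:%S'

# Hypothetical mapping from result field to the label used by the .rs registry.
RS_FIELDS = {
    'creation_date': r'Registration date: *([^\n\r]+)',
    'updated_date': r'Modification date: *([^\n\r]+)',
    'expiration_date': r'Expiration date: *([^\n\r]+)',
}


def parse_rs_dates(text):
    """Extract the three date fields from a .rs whois response as datetimes."""
    dates = {}
    for field, pattern in RS_FIELDS.items():
        match = re.search(pattern, text)
        dates[field] = (
            datetime.strptime(match.group(1).strip(), RS_DATE_FORMAT)
            if match else None
        )
    return dates


sample = (
    "Registration date: 01.08.2020 13:11:08\n"
    "Modification date: 13.09.2021 10:05:55\n"
    "Expiration date: 01.08.2022 17:19:38\n"
)
print(parse_rs_dates(sample)['creation_date'])
```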
