
Resolve host name of wss protocol with socket?

How can I resolve the host name of a wss:// URL with socket?
I tried this, but it failed:
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
ip = socket.gethostbyname('wss://domain.tld')

wss://domain.tld is not a hostname, it's a URL. You can't resolve a URL with a socket; you have to parse it as a URL to get the hostname out of it, and then you can resolve that. It doesn't matter whether the scheme is wss, http, or rsync; any scheme that has a netloc field will work the same way.

For example, using urllib.parse:

>>> from urllib.parse import urlparse # in Python 2.x: from urlparse import urlparse
>>> url = 'wss://domain.tld'
>>> bits = urlparse(url)
>>> netloc = bits.netloc
>>> netloc
'domain.tld'

So we're done, right?

Nope. A netloc can be either a host or a host:port. And you can't just split(':'), because IPv6 addresses can have colons in them, but only if they're enclosed in brackets. So, to get the host part of a netloc, you need to do something like this:

>>> host, sep, port = netloc.rpartition(':')
>>> # no colon at all, or the last colon is inside [brackets]: there is no port
>>> if not sep or ']' in port: host = netloc
>>> host
'domain.tld'
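
As an alternative to splitting the netloc by hand, the result object from urllib.parse also has hostname and port attributes. The sketch below relies on those; the helper name split_host_port and the example URLs are made up for illustration.

```python
from urllib.parse import urlsplit

def split_host_port(url):
    """Return (host, port) from a URL; port is None when the URL has none."""
    parts = urlsplit(url)
    # .hostname drops any user:password@ prefix, the port, and IPv6 brackets
    return parts.hostname, parts.port

print(split_host_port('wss://domain.tld'))          # ('domain.tld', None)
print(split_host_port('wss://domain.tld:8443/ws'))  # ('domain.tld', 8443)
print(split_host_port('wss://[::1]:8443'))          # ('::1', 8443)
```

Note that hostname is returned in lowercase and without brackets, which is the form the resolver functions in socket expect.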

And now we are done: we've got a hostname or IP address, which we can pass to socket.gethostbyname.

But a couple of notes on that.

First, you don't need to create a socket.socket object to call gethostbyname; it's a top-level function on the module and doesn't need any socket objects.

Second, gethostbyname doesn't work with IPv6, and has some limitations even with IPv4, so you may want to use getaddrinfo instead.

So, finishing up:

>>> import socket
>>> addresses = socket.getaddrinfo(host, None) # or host, port if you prefer
gaierror: [Errno 8] nodename nor servname provided, or not known

Well, that's to be expected, since our hostname is domain.tld, and there's no such domain. But if we used, say, www.google.com, we'd get back a nice list of a couple dozen IPv4 addresses and, if your system has IPv6 connectivity, a couple of IPv6 ones as well. You can just use the first one, or prefer IPv4 to IPv6 or vice versa, or discriminate on some other field. (You can also filter on various fields in the first place by passing more arguments to getaddrinfo.)
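
Putting the pieces together, here is a minimal sketch of going from a URL to a list of addresses, filtering by address family up front. The function name resolve_url is hypothetical, and the example uses a numeric loopback address so it runs without any DNS lookup; substitute a real wss:// URL to see actual results.

```python
import socket
from urllib.parse import urlsplit

def resolve_url(url, family=socket.AF_UNSPEC):
    """Return the sorted, de-duplicated IP addresses for the host in `url`."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f'no host in URL: {url!r}')
    # Restricting type to SOCK_STREAM avoids one duplicate entry per socket type.
    infos = socket.getaddrinfo(parts.hostname, parts.port,
                               family=family, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})

# AF_INET keeps only IPv4 results; AF_INET6 would keep only IPv6 ones.
print(resolve_url('wss://127.0.0.1:8443/ws', family=socket.AF_INET))
```

For a name that does not exist, such as domain.tld, getaddrinfo raises socket.gaierror, so callers may want to catch that.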
