Why is the byte stream returned by Python's socket.recvfrom different from the one captured by Wireshark?

I used a Python socket to send a DNS query packet and listen for the response. As expected, I received a DNS response packet from socket.recvfrom(2048). Strangely, though, when I compared that response with the packet captured by Wireshark, I found many differences.

The differences show up as 3f bytes in the second picture.

The DNS response packet (the highlighted part) captured by Wireshark

The DNS response packet returned by socket.recvfrom(2048)

The code that creates the socket:

    ipv = check_ip(dst)
    udp = socket.getprotobyname(Proto.UDP)
    if ipv == IPV.ERROR:
        return None
    elif ipv == IPV.IPV4:
        return socket.socket(socket.AF_INET, socket.SOCK_DGRAM, udp)
    elif ipv == IPV.IPV6:
        return socket.socket(socket.AF_INET6, socket.SOCK_DGRAM, udp)
    else:
        return None

The code that receives the DNS response packet:

    remained_time = 0
    while True:
        remained_time = self.timeout - timeit.default_timer() + sent_time
        readable = select.select([sock], [], [], remained_time)[0]
        if len(readable) == 0:
            return (-1, None)

        packet, addr = sock.recvfrom(4096)

Byte 0x3F is the ASCII '?' character. That commonly means the data is being treated as text and is passing through a charset conversion that doesn't support the bytes being converted.

Notice that 0x3F replaces only the bytes that are > 0x7F (the highest byte value supported by ASCII). Non-ASCII bytes in the range 0x80-0xFF are subject to charset interpretation.
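To make the effect concrete, here is a minimal sketch of one way such a substitution can arise. It assumes the data is decoded to text as ASCII and later encoded back with replacement error handling; the sample bytes are made up for illustration and are not taken from the question's capture.

```python
# Hypothetical sample: a few bytes such as might appear in a DNS response.
raw = bytes([0x12, 0x34, 0x81, 0x80, 0xC0, 0x0C])

# Treating the bytes as ASCII text: every byte above 0x7F is invalid and
# becomes the Unicode replacement character U+FFFD.
as_text = raw.decode("ascii", errors="replace")

# Converting the text back to ASCII bytes: U+FFFD cannot be represented,
# so each occurrence is written out as b"?" (0x3F).
round_tripped = as_text.encode("ascii", errors="replace")

print(raw.hex())            # 12348180c00c
print(round_tripped.hex())  # 12343f3f3f0c
```

Only the three bytes above 0x7F change, which matches the pattern described above.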

That would make sense if the received data is being handled as a string at some point, since the bytes then have to be converted using a string encoding. Note that in Python 3 recvfrom() itself returns a bytes object, so any such conversion would happen after the call, for example when the packet is decoded or printed.

Since you need raw bytes, you can use recvfrom_into() to fill a pre-allocated bytearray, e.g.:

    packet = bytearray(4096)
    remained_time = 0
    while True:
        remained_time = self.timeout - timeit.default_timer() + sent_time
        readable = select.select([sock], [], [], remained_time)[0]
        if len(readable) == 0:
            return (-1, None)
        nbytes, addr = sock.recvfrom_into(packet)

Then you can use the first nbytes bytes of packet as needed.
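As a rough, self-contained illustration of that last step, the sketch below sends a datagram over the loopback interface and reads it back with recvfrom_into(). The payload, the two local sockets, and the 2-second timeout are assumptions made for the demo; they stand in for the real DNS exchange in the question.

```python
import socket

# Hypothetical payload containing bytes above 0x7F, standing in for a DNS response.
payload = bytes([0x81, 0x80, 0x00, 0x01, 0xC0, 0x0C, 0xFF])

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
receiver.settimeout(2.0)          # avoid blocking forever if nothing arrives
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

try:
    sender.sendto(payload, receiver.getsockname())

    packet = bytearray(4096)                        # pre-allocated receive buffer
    nbytes, addr = receiver.recvfrom_into(packet)   # returns (byte count, sender address)
    data = bytes(packet[:nbytes])                   # keep only the bytes actually received
finally:
    sender.close()
    receiver.close()

# A hex dump like this can be compared byte for byte with the Wireshark view.
print(nbytes, data.hex())
```

Slicing to nbytes matters because the rest of the buffer still holds zero padding (or stale data from an earlier receive).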
