Why is the byte stream obtained by Python socket.recvfrom different from the byte stream captured by Wireshark?

Question

I used a Python socket to send a DNS query packet and listen for the response. I got a DNS response packet from socket.recvfrom(2048) as expected. Strangely, when I compared that response packet with the one captured by Wireshark, I found many differences.

The differences show up as 3f bytes in the second picture.

The DNS response packet captured by Wireshark (the highlighted part)

The DNS response packet returned by socket.recvfrom(2048)

The code that creates the socket:

    ipv = check_ip(dst)
    udp = socket.getprotobyname(Proto.UDP)
    if ipv == IPV.ERROR:
        return None
    elif ipv == IPV.IPV4:
        return socket.socket(socket.AF_INET, socket.SOCK_DGRAM, udp)
    elif ipv == IPV.IPV6:
        return socket.socket(socket.AF_INET6, socket.SOCK_DGRAM, udp)
    else:
        return None

The code that receives the DNS response packet:

    remained_time = 0
    while True:
        remained_time = self.timeout - timeit.default_timer() + sent_time
        readable = select.select([sock], [], [], remained_time)[0]
        if len(readable) == 0:
            return (-1, None)

        packet, addr = sock.recvfrom(4096)

Answer 1

Byte 0x3F is the ASCII '?' character. That commonly means the data is being treated as text and is passing through a charset conversion that doesn't support the bytes being converted.

Notice that 0x3F is replacing only the bytes that are > 0x7F (the last byte value supported by ASCII). Non-ASCII bytes in the range 0x80-0xFF are subject to charset interpretation.
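The effect can be reproduced in isolation. The following is a minimal sketch, not the asker's code: `lossy_text_roundtrip` is a hypothetical helper and the sample bytes are made up. It assumes Python 3.8 or later (for `bytes.hex` with a separator) and shows that round-tripping bytes through text and an ASCII-only encoding turns every byte above 0x7F into 0x3F while leaving the rest untouched.

```python
def lossy_text_roundtrip(raw: bytes) -> bytes:
    """Hypothetical helper: mimic bytes being treated as text and then
    re-encoded with a charset (ASCII) that cannot represent all of them."""
    # latin-1 maps every byte value 0x00-0xFF to exactly one character.
    text = raw.decode("latin-1")
    # Characters above 0x7F cannot be encoded as ASCII; with
    # errors="replace" each of them is emitted as b"?" (0x3F).
    return text.encode("ascii", errors="replace")


# Made-up start of a DNS response header: ID 0x1234, flags 0x8180.
raw = bytes([0x12, 0x34, 0x81, 0x80, 0x00, 0x01])
mangled = lossy_text_roundtrip(raw)

print(raw.hex(" "))      # 12 34 81 80 00 01
print(mangled.hex(" "))  # 12 34 3f 3f 00 01
```

If the captured and received packets differ only at positions where Wireshark shows a byte of 0x80 or higher, a conversion like this somewhere between the socket and the display is a likely cause.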

That makes sense if the result of recvfrom() is being handled as a string, because the received bytes then have to be converted through Python's default string encoding.

Since you need raw bytes instead, use recvfrom_into() to fill a pre-allocated bytearray, e.g.:

    packet = bytearray(4096)
    remained_time = 0
    while True:
        remained_time = self.timeout - timeit.default_timer() + sent_time
        readable = select.select([sock], [], [], remained_time)[0]
        if len(readable) == 0:
            return (-1, None)
        nbytes, addr = sock.recvfrom_into(packet)

Then you can use the first nbytes bytes of packet as needed.
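As an illustration of working with only the first nbytes bytes, here is a rough sketch that reads the fixed 12-byte DNS header (six big-endian 16-bit fields) from the buffer without converting it to text. `parse_dns_header` and the sample bytes are hypothetical and stand in for whatever sock.recvfrom_into(packet) actually wrote.

```python
import struct


def parse_dns_header(buf, nbytes):
    """Hypothetical helper: decode the 12-byte DNS header from the
    first nbytes bytes of buf, staying in raw bytes throughout."""
    view = memoryview(buf)[:nbytes]  # no copy, ignores the unused tail
    if len(view) < 12:
        raise ValueError("a DNS header needs at least 12 bytes")
    # "!6H" = network byte order, six unsigned 16-bit integers.
    ident, flags, qd, an, ns, ar = struct.unpack_from("!6H", view, 0)
    return {"id": ident, "flags": flags, "qdcount": qd,
            "ancount": an, "nscount": ns, "arcount": ar}


# Stand-in for the buffer filled by recvfrom_into(); the bytes are made up.
packet = bytearray(4096)
sample = bytes.fromhex("123481800001000200000000")
packet[:len(sample)] = sample
nbytes = len(sample)

header = parse_dns_header(packet, nbytes)
print(hex(header["flags"]))         # 0x8180
print(bytes(packet[:nbytes]).hex()) # hex dump to compare with Wireshark
```

Printing the hex form of packet[:nbytes] gives output that can be compared byte for byte with the Wireshark capture.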

Answer 1: accepted, 2018-09-12 07:05:50
