
Data Corrupted When Reading from Byte Stream

For a networking project, I'm using UDP multicast to build an overlay network with my own implementation of IP.

I use the following to read and parse the header first, then read the payload separately:

def __init__(self, buffer_size_bytes):
    self.__buffer = bytearray(buffer_size_bytes)

def read_sock(self, listening_socket):
    n_bytes, addr = listening_socket.recvfrom_into(self.__buffer, Packet.HEADER_SIZE)
    packet = Packet.parse_header(self.__buffer)

    if packet.payload_length != 0:
        packet.payload = parse_payload(packet.payload_length, listening_socket)

    self.__router.add_to_route_queue(packet, listening_socket.locator)

def parse_payload(to_read, socket):
    payload = bytearray(to_read)
    view = memoryview(payload)
    while to_read:
        n_bytes, addr = socket.recvfrom_into(view, to_read)
        view = view[n_bytes:]
        to_read -= n_bytes

    return payload

The header seems to be parsed correctly, but the payload is corrupted every time. I can't figure out what I'm doing wrong when parsing the payload, and I can confirm that I'm sending a bytearray from the other side.

For example, when I send a packet with the payload "Hello World" encoded in UTF-8, I receive the following as the payload:

b'`\x00\x00\x00\x00\x0b\x00\x1f\x00\x00\x00'

The Packet.parse_header method:

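@classmethod
def parse_header(cls, packet_bytes):
    values = struct.unpack(cls.ILNPv6_HEADER_FORMAT, packet_bytes[:cls.HEADER_SIZE])

    # The first word packs version (top 4 bits), traffic class (8 bits) and flow label (low 20 bits).
    flow_label = values[0] & 0xFFFFF
    traffic_class = (values[0] >> 20) & 0xFF
    version = values[0] >> 28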
    payload_length = values[1]
    next_header = values[2]
    hop_limit = values[3]
    src = (values[4], values[5])
    dest = (values[6], values[7])

    return Packet(src, dest, next_header, hop_limit, version, traffic_class, flow_label, payload_length)

For reference, the entire sent packet looks like this:

b'`\x00\x00\x00\x00\x0b\x00\x1f\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01Hello World'

On receiving the first packet, socket.recvfrom_into blocks when reading the payload and doesn't return until I send another message. It then seems to discard the payload of the previous message and use the second packet received as the payload...

Found my explanation here.

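So the key thing was that I'm using UDP. UDP is datagram-oriented: each receive call consumes one whole datagram, and the socket discards whatever doesn't fit in the buffer you give it. The header read therefore threw away the payload, and the payload read blocked until the next datagram arrived.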
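To make the truncation concrete, here is a small loopback sketch. The header format string "!IHBB4Q" is my guess at the 40-byte layout (the post doesn't show ILNPv6_HEADER_FORMAT), and build_packet and demo_truncation are hypothetical helper names used only for this illustration. It sends the same packet twice and reads it the way the question's code does: 40 bytes for the header, then 11 bytes for the payload.

```python
import socket
import struct

# Assumed layout (not shown in the post): one 32-bit word for
# version/traffic class/flow label, 16-bit payload length, 8-bit next
# header, 8-bit hop limit, then four 64-bit words for src and dest.
HEADER_FORMAT = "!IHBB4Q"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 40 bytes


def build_packet(payload):
    """Build a packet like the one in the question (hypothetical helper)."""
    header = struct.pack(HEADER_FORMAT, 6 << 28, len(payload), 0, 31, 0, 2, 0, 1)
    return header + payload


def demo_truncation():
    """Send two datagrams, then read 'header' and 'payload' separately."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))   # any free local port
        receiver.settimeout(2.0)          # avoid hanging if something goes wrong
        packet = build_packet(b"Hello World")
        sender.sendto(packet, receiver.getsockname())
        sender.sendto(packet, receiver.getsockname())

        # First read: takes 40 bytes of datagram 1; the rest of it is dropped.
        header = receiver.recv(HEADER_SIZE)
        # Second read: does NOT continue datagram 1, it starts datagram 2.
        leftover = receiver.recv(11)
        return packet, header, leftover
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    packet, header, leftover = demo_truncation()
    print(header == packet[:HEADER_SIZE])
    print(leftover)
```

On Linux, the second read returns the first 11 bytes of the second datagram's header, which are the same bytes shown as the corrupted payload in the question. As far as I know, Windows raises an error on the short read instead of truncating silently, so treat this as a POSIX-flavoured demonstration.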

TCP sockets, however, behave like the byte stream I was expecting: unread bytes stay queued for the next read.
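Staying with UDP, one way around this is to receive the whole datagram in a single call and split header from payload afterwards. The following is only a sketch under assumptions: the "!IHBB4Q" format is my guess at the 40-byte header layout, and split_datagram, read_packet and MAX_DATAGRAM are names I made up for the example, not part of the original code.

```python
import struct

# Assumed header layout (the post does not show ILNPv6_HEADER_FORMAT).
HEADER_FORMAT = "!IHBB4Q"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 40 bytes
MAX_DATAGRAM = 65535  # buffer large enough for any UDP datagram


def split_datagram(datagram):
    """Split one complete datagram into (header fields, payload bytes)."""
    if len(datagram) < HEADER_SIZE:
        raise ValueError("datagram is shorter than the header")
    fields = struct.unpack(HEADER_FORMAT, datagram[:HEADER_SIZE])
    payload_length = fields[1]
    payload = bytes(datagram[HEADER_SIZE:HEADER_SIZE + payload_length])
    if len(payload) != payload_length:
        raise ValueError("payload is shorter than the header claims")
    return fields, payload


def read_packet(sock, buffer):
    """Receive exactly one datagram into buffer, then split it."""
    n_bytes, addr = sock.recvfrom_into(buffer)  # one call per datagram
    fields, payload = split_datagram(memoryview(buffer)[:n_bytes])
    return fields, payload, addr
```

A caller would allocate the buffer once, for example buffer = bytearray(MAX_DATAGRAM), and reuse it for every read_packet call. The fields tuple could then go through the same bit masking as parse_header.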

Fun!
