Is a checksum needed for application data of a custom protocol?

Asked 2016-11-24 13:56:01. Tags: networking, network-programming, protocols, network-protocols, data-integrity

Packets traveling over the wire carry checksums at several layers: ~~Ethernet and~~ IPv4 has a checksum for its header, and TCP's checksum even covers the entire segment.

I know a corrupted packet can still slip through to the application layer without being discarded by Ethernet/IP/TCP, because a corrupted packet can happen to carry a valid checksum; the probability is just low.

I am designing a custom binary protocol for an IM application. Do I need to add a checksum to ensure the integrity of my application data? Is a checksum really needed in practice?

2 Answers

There's actual research on this subject. It's old, but very relevant to the question at hand.

The paper, from 2000, is "When the CRC and TCP checksum disagree" by Jonathan Stone and Craig Partridge. It investigates packet and frame errors and looks at how often the TCP checksum is wrong while the Ethernet CRC is fine. You can find the PDF here. Here are the important bits.

From the abstract:

Traces of Internet packets from the past two years show that between 1 packet in 1,100 and 1 packet in 32,000 fails the TCP checksum, even on links where link-level CRCs should catch all but 1 in 4 billion errors.

From the conclusion (with some of my highlighting):

In practice, the checksum is being asked to detect an error every few thousand packets. After eliminating those errors that the checksum always catches, the data suggests that, on average, between one packet in 10 billion and one packet in a few millions will have an error that goes undetected. The exact range depends on the type of data transferred and the path being traversed. While these odds seem large, they do not encourage complacency. In every trace, one or two 'bad apple' hosts or paths are responsible for a huge proportion of the errors. For applications which stumble across one of the 'bad-apple' hosts, the expected time until a corrupted data is accepted could be as low as a few minutes. When compared to undetected error rates for local I/O (e.g., disk drives), these rates are disturbing. Our conclusion is that vital applications should strongly consider augmenting the TCP checksum with an application sum.

I don't know of any newer research into that question (enlighten me if you know otherwise!), so the Internet could have become more reliable since then, and the numbers in the paper might no longer apply.

However, and this is important, 17 years have passed, and the amount of Internet traffic has simply exploded since that paper was written. At 1 Gbps, which is not an uncommon connection speed nowadays, you're sending about 81K full TCP segments (1460 bytes of data each) per second, or a lot more if the packets are smaller. That's a million big packets every 12.5 seconds and a billion in about 3.5 hours (again, a lot more if the packets are small).
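The packet-rate figures can be reproduced with a few lines of arithmetic. This is a rough sketch that assumes standard Ethernet framing overhead and TCP/IPv4 headers without options; those constants are my assumptions, not numbers stated in the answer. The last line combines the packet rate with the paper's best-case "one in 10 billion" figure purely as an illustration.

```python
# Bytes on the wire for one full-sized TCP segment over Ethernet.
# Assumed values: no IP or TCP options, standard Ethernet framing.
PAYLOAD = 1460
TCP_HEADER = 20
IPV4_HEADER = 20
ETH_HEADER = 14
ETH_FCS = 4
PREAMBLE_SFD = 8
INTERFRAME_GAP = 12

LINK_BPS = 1_000_000_000  # 1 Gbps


def packets_per_second(link_bps=LINK_BPS):
    """Full-sized segments per second on a saturated link."""
    wire_bytes = (PAYLOAD + TCP_HEADER + IPV4_HEADER + ETH_HEADER
                  + ETH_FCS + PREAMBLE_SFD + INTERFRAME_GAP)
    return link_bps / (wire_bytes * 8)


def seconds_to_send(n_packets, link_bps=LINK_BPS):
    """Time needed to push n_packets full-sized segments through the link."""
    return n_packets / packets_per_second(link_bps)


print(f"{packets_per_second():,.0f} segments per second")
print(f"1 million segments: {seconds_to_send(1e6):.1f} s")
print(f"1 billion segments: {seconds_to_send(1e9) / 3600:.2f} h")
# Illustration only: time to send 10 billion segments, the paper's
# best-case interval between undetected errors.
print(f"10 billion segments: {seconds_to_send(1e10) / 3600:.1f} h")
```

With these assumptions the rate comes out at roughly 81,000 segments per second, which is consistent with the figures above. A saturated link is the worst case; a chat client sends far fewer packets.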

So to answer your question: that depends. For transferring large files or other data, I'd definitely add additional checks if the data itself isn't protected in any way. For messaging, which pushes very little data into the network, you'll probably be fine with TCP's checksum, with maybe some sanity checks on the input you're getting to make sure that it's in the correct format and that the various parameters and fields make sense.
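For the large-transfer case, one possible "additional check" is an end-to-end digest that the sender computes over the whole file and the receiver recomputes on arrival. The sketch below uses SHA-256 from Python's standard library; the choice of hash, the chunk size, and the function names `stream_digest` and `verify_transfer` are my assumptions for illustration, not something the answer prescribes. A CRC-32 would also catch accidental corruption.

```python
import hashlib
import io

CHUNK_SIZE = 64 * 1024  # arbitrary read size for this sketch


def stream_digest(stream, chunk_size=CHUNK_SIZE):
    """Return the SHA-256 hex digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(received_stream, expected_hex):
    """True if the received data hashes to the digest the sender announced."""
    return stream_digest(received_stream) == expected_hex


# Simulate a transfer: the sender announces the digest, and the receiver
# checks both an intact copy and a copy with a single flipped bit.
original = b"x" * 200_000
expected = stream_digest(io.BytesIO(original))

corrupted = bytearray(original)
corrupted[12345] ^= 0x01

print(verify_transfer(io.BytesIO(original), expected))          # True
print(verify_transfer(io.BytesIO(bytes(corrupted)), expected))  # False
```

Hashing in chunks keeps memory use flat regardless of file size, and the digest can be sent once at the end of the transfer instead of per message.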

I would not bother with a checksum just to guard against packets getting corrupted in the network.

However, since you are working on a protocol that will presumably be used on the open Internet, you will need to prepare for rare cases of an unintended application sending UDP packets or making TCP connections to your receiving/listening ports. There will also be, though perhaps less often, port scans and hackers or script kiddies knocking at your gates.

So you should design your protocol so that this kind of traffic is easy to discard. Using a checksum in every transmission would, IMHO, be one sensible way of doing that.
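As a sketch of what "easy to discard" could look like, the frame layout below puts a fixed tag, a bounded length, and a CRC-32 of the payload in front of every message. The tag `IMP1`, the 64 KiB limit, the header layout, and the names `encode_frame`, `decode_frame` and `BadFrame` are all hypothetical choices made for this example.

```python
import struct
import zlib

MAGIC = b"IMP1"                  # hypothetical 4-byte protocol tag
MAX_PAYLOAD = 64 * 1024          # hypothetical upper bound on message size
HEADER = struct.Struct("!4sII")  # magic, payload length, CRC-32 of payload


class BadFrame(ValueError):
    """Raised when incoming bytes are not a valid frame of this protocol."""


def encode_frame(payload: bytes) -> bytes:
    """Prefix the payload with magic, length and checksum."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too large")
    return HEADER.pack(MAGIC, len(payload), zlib.crc32(payload)) + payload


def decode_frame(data: bytes) -> bytes:
    """Return the payload, or raise BadFrame so the caller can drop the peer."""
    if len(data) < HEADER.size:
        raise BadFrame("short header")
    magic, length, crc = HEADER.unpack_from(data)
    if magic != MAGIC:
        raise BadFrame("wrong magic")
    if length > MAX_PAYLOAD:
        raise BadFrame("length out of range")
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise BadFrame("truncated payload")
    if zlib.crc32(payload) != crc:
        raise BadFrame("checksum mismatch")
    return payload


frame = encode_frame(b"hello")
print(decode_frame(frame))

# A stray HTTP request hitting the port is rejected on the first check.
try:
    decode_frame(b"GET / HTTP/1.1\r\n\r\n")
except BadFrame as exc:
    print("discarded:", exc)
```

The cheap checks (magic, length bound) run before the checksum, so most stray traffic is rejected without touching the payload. A CRC only guards against accidents and unrelated software; it does not authenticate the sender.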