Why are these two strings not equal?

Question

I am sending a packet over UDP, and for some reason the string I extract from the packet never compares equal to the string I create, even though the two look identical when I print them (no trailing whitespace).

byte[] incoming = new byte[1000];
DatagramPacket request = new DatagramPacket(incoming, incoming.length);
serverSocket.receive(request);
String str = new String(request.getData());
String str2 = new String("message received");

if(str.equals(str2))
{
   System.out.println("equal");
}

Is there any reason for this?

Answer 1

This occurs because new String(request.getData()) does not return "message received".

The problem is [likely] that new String(byte[]) decodes all 1000 bytes supplied, using the default encoding. The resulting string ends with a run of NUL ('\0') characters appended to the actual content, so it is not equal to the literal. This is easy to see in a debugger, although NUL characters are often "lost" when the string is displayed as normal text, as with println.

Trivially: "hello".equals("hello\0") is false.
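To make the cause concrete, here is a minimal sketch that imitates the receive buffer without any networking. The class name NulPaddingDemo and its two helpers are made up for illustration; only the 1000-byte buffer and the "message received" text come from the question. It decodes with UTF-8 explicitly, whereas the question relies on the platform default.

```java
import java.nio.charset.StandardCharsets;

class NulPaddingDemo {
    // Builds a 1000-byte buffer whose first bytes hold the payload; the rest stays 0.
    static byte[] paddedBuffer(String payload) {
        byte[] buffer = new byte[1000];
        byte[] bytes = payload.getBytes(StandardCharsets.UTF_8);
        System.arraycopy(bytes, 0, buffer, 0, bytes.length);
        return buffer;
    }

    // Decodes the entire buffer, padding included, like new String(request.getData()).
    static String decodeWholeBuffer(byte[] buffer) {
        return new String(buffer, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String str = decodeWholeBuffer(paddedBuffer("message received"));
        System.out.println(str.length());                       // prints 1000
        System.out.println(str.equals("message received"));     // prints false
        System.out.println(str.startsWith("message received")); // prints true
    }
}
```

The decoded string starts with the expected text but is 1000 characters long, which is why equals fails while the printed output looks fine.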

Several solutions are possible:

1. Frame the string, for example by prefixing the sent data with the number of bytes that make up the string, and then use a String constructor that takes an offset and length; or
2. Prevent any trailing 0 bytes from being processed, again by specifying how many bytes to decode; or
3. Remove any NUL characters after decoding the data.
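The first two options can be sketched roughly as follows. This is an illustration under assumptions, not the only way to do it: the class PacketDecoding, its method names, and the 4-byte big-endian length prefix are all invented here. For option 2 it relies on DatagramPacket.getLength(), which after receive() holds the length of the received message; main imitates that with setLength() so the example runs without a socket.

```java
import java.net.DatagramPacket;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class PacketDecoding {
    // Option 2: decode only the bytes the packet actually carries.
    static String decodeReceived(DatagramPacket packet) {
        return new String(packet.getData(), packet.getOffset(),
                          packet.getLength(), StandardCharsets.UTF_8);
    }

    // Option 1, sender side: a 4-byte big-endian length followed by the UTF-8 bytes.
    static byte[] frame(String text) {
        byte[] body = text.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + body.length).putInt(body.length).put(body).array();
    }

    // Option 1, receiver side: read the length prefix, then decode exactly that many bytes.
    // No validation of the prefix is done here; real code should check it against the buffer size.
    static String unframe(byte[] buffer) {
        int length = ByteBuffer.wrap(buffer).getInt();
        return new String(buffer, 4, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for serverSocket.receive(request): setLength() imitates the
        // length that receive() would record for a 16-byte datagram.
        byte[] incoming = new byte[1000];
        byte[] payload = "message received".getBytes(StandardCharsets.UTF_8);
        System.arraycopy(payload, 0, incoming, 0, payload.length);
        DatagramPacket request = new DatagramPacket(incoming, incoming.length);
        request.setLength(payload.length);
        System.out.println(decodeReceived(request).equals("message received")); // prints true

        // A framed message copied into an oversized, zero-padded buffer.
        byte[] framedBytes = frame("message received");
        byte[] padded = new byte[1000];
        System.arraycopy(framedBytes, 0, padded, 0, framedBytes.length);
        System.out.println(unframe(padded).equals("message received"));         // prints true
    }
}
```

With UDP the packet already reports the datagram length, so decodeReceived is usually the smaller change; an explicit length prefix matters more when several strings share one packet or the transport does not preserve message boundaries.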

Since option #3 is the easiest¹ (until the code can be changed to use #1 or #2), consider:

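import java.nio.charset.StandardCharsets;

// Specify an encoding! A Charset constant avoids the checked
// UnsupportedEncodingException that the "UTF-8" string overload declares.
String str = new String(request.getData(), StandardCharsets.UTF_8);
// Cut the string at the first NUL, if there is one.
int nul = str.indexOf('\0');
if (nul > -1) {
   str = str.substring(0, nul);
}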

¹ While trimming is the easiest, it is not generally appropriate. The biggest problem with #3 compared to #2 is that it first decodes all the bytes and only then filters the characters. Under some encodings (ASCII and UTF-8 should be "safe"), this may leave non-NUL garbage after the actual string content, depending on what is in the buffer.

Also, pass an encoding explicitly to new String(byte[] ..) and String.getBytes(..). Otherwise the "default encoding" is used, which can cause problems if the systems involved have different defaults.
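As a small illustration of why the default matters, the sketch below encodes a string with one charset and decodes it with another, which is effectively what happens when sender and receiver have different defaults. The class CharsetMismatchDemo, the method roundTrip, and the sample text are hypothetical; UTF-8 and ISO-8859-1 are just two charsets picked to show the effect.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetMismatchDemo {
    // Encodes with the sender's charset and decodes with the receiver's.
    static String roundTrip(String text, Charset senderCharset, Charset receiverCharset) {
        byte[] wire = text.getBytes(senderCharset);
        return new String(wire, receiverCharset);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "cafe" with an accented final letter

        // Mismatched charsets: the two-byte UTF-8 sequence is read as two characters.
        String garbled = roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.equals(text)); // prints false
        System.out.println(garbled.length());     // prints 5

        // Same charset on both sides: the text survives unchanged.
        String intact = roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(intact.equals(text));  // prints true
    }
}
```

Pure ASCII text such as "message received" happens to survive this particular mismatch, which is one reason the problem can go unnoticed until non-ASCII data appears.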

Answer 1: score 3, accepted, 2014-09-27 00:39:22
