
Why aren't these two strings equal?

I am sending a packet through UDP, and for some reason the string I extract from the packet never compares equal to the string I create, even though the two values look the same when I print them (no trailing white space).

byte[] incoming = new byte[1000];
DatagramPacket request = new DatagramPacket(incoming, incoming.length);
serverSocket.receive(request);
String str = new String(request.getData());
String str2 = new String("message received");

if(str.equals(str2))
{
   System.out.println("equal");
}

Is there any reason for this?

This occurs because new String(request.getData()) does not return "message received".

The problem is [likely] that new String(byte[]) decodes all (1000 of) the bytes supplied, using the default encoding. The bytes after the received data are still zero, so the decoded string ends with a run of NUL ('\0') characters appended to the actual content, which makes it unequal to the literal. This is easy to see in a debugger, although such NUL characters are often "lost" when the string is displayed as normal text, as with println.

Trivially: "hello".equals("hello\0") is false.
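The effect can be reproduced without a socket. The following is only a sketch: the class name NulPaddingDemo and the locally filled buffer are assumptions made for illustration, standing in for the buffer that receive() fills in the question.

```java
import java.nio.charset.StandardCharsets;

public class NulPaddingDemo {
    // Decodes the whole buffer, as new String(request.getData()) does.
    static String decodeWholeBuffer(byte[] buffer) {
        return new String(buffer, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] payload = "message received".getBytes(StandardCharsets.UTF_8);
        byte[] buffer = new byte[1000]; // Java zero-fills new arrays
        System.arraycopy(payload, 0, buffer, 0, payload.length);

        String str = decodeWholeBuffer(buffer);
        System.out.println(str.length());                       // 1000, not 16
        System.out.println(str.equals("message received"));     // false
        System.out.println(str.startsWith("message received")); // true
    }
}
```

The decoded string starts with the expected text but carries 984 trailing NUL characters, which is why printing it looks right while equals fails.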

Possible solutions include:

  1. Frame the string, for example by prefixing the sent data with the number of bytes that make up the string, and then use a String constructor that takes an offset and a length; or

  2. Prevent any trailing 0 bytes from being processed, again by specifying the length to decode; or

  3. Remove any NUL characters after decoding the data.

Since option #3 is the easiest [1] (until the code can be fixed to use #1 or #2), consider:

String str = new String(request.getData(), "UTF-8"); // Specify an encoding!
int nul = str.indexOf('\0');
if (nul > -1) {
   str = str.substring(0, nul);
}

[1] While trimming is the easiest fix, it is not generally appropriate. The biggest problem with #3 compared to #2 is that it first decodes all the bytes and only then filters the characters. Under other encodings (ASCII and UTF-8 should be "safe"), this may leave non-NUL garbage after the actual string content, depending on what the buffer contains.
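For comparison, option #2 might look like the sketch below. It relies on DatagramPacket.getLength(), which after receive() holds the length of the received message, together with getOffset(). The class name PacketText and the simulated packet in main are assumptions for illustration; in the question's code the packet would come from serverSocket.receive(request).

```java
import java.net.DatagramPacket;
import java.nio.charset.StandardCharsets;

public class PacketText {
    // Decodes only the bytes that belong to the packet, not the whole buffer.
    static String decode(DatagramPacket packet) {
        return new String(packet.getData(), packet.getOffset(),
                          packet.getLength(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Simulate a received packet: a 1000-byte buffer holding a short payload.
        byte[] incoming = new byte[1000];
        byte[] payload = "message received".getBytes(StandardCharsets.UTF_8);
        System.arraycopy(payload, 0, incoming, 0, payload.length);
        DatagramPacket request = new DatagramPacket(incoming, payload.length);

        System.out.println(decode(request).equals("message received")); // true
    }
}
```

Because only the received bytes are decoded, whatever else is in the buffer never reaches the string, regardless of the encoding.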

Also, pass an encoding explicitly to new String(byte[], ..) and String.getBytes(..). Otherwise the "default encoding" is used, which can cause problems if the two systems have different defaults.
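As a rough illustration of why the encoding matters, the sketch below encodes and decodes with an explicit charset and shows what happens when the two sides disagree. The class name ExplicitCharset, its helper methods, and the sample text are assumptions, not part of the original code.

```java
import java.nio.charset.StandardCharsets;

public class ExplicitCharset {
    // Sender side: always encode with a named charset.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Receiver side: decode the same number of bytes with the same charset.
    static String decode(byte[] data, int length) {
        return new String(data, 0, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café", contains one non-ASCII character
        byte[] utf8 = encode(text);

        System.out.println(utf8.length);                                     // 5
        System.out.println(text.getBytes(StandardCharsets.ISO_8859_1).length); // 4
        System.out.println(decode(utf8, utf8.length).equals(text));          // true
        // Decoding with a different charset silently yields a different string.
        System.out.println(new String(utf8, StandardCharsets.ISO_8859_1).equals(text)); // false
    }
}
```

Using the StandardCharsets constants also avoids the checked UnsupportedEncodingException that the String constructor taking a charset name can throw.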
