String contents are the same but equals method returns false

I am using StringEscapeUtils to escape and unescape HTML. I have the following code:

import org.apache.commons.lang.StringEscapeUtils;

public class EscapeUtils {

    public static void main(String args[]) {

        String string = "    4-Spaces    ,\"Double Quote\", 'Single Quote', \\Back-Slash\\, /Forward Slash/ ";

        String escaped = StringEscapeUtils.escapeHtml(string);
        String myEscaped = escapeHtml(string);

        String unescaped = StringEscapeUtils.unescapeHtml(escaped);
        String myUnescaped = StringEscapeUtils.unescapeHtml(myEscaped);

        System.out.println("Real String: " + string);
        System.out.println();
        System.out.println("Escaped String: " + escaped);
        System.out.println("My Escaped String: " + myEscaped);
        System.out.println();
        System.out.println("Unescaped String: " + unescaped);
        System.out.println("My Unescaped String: " + myUnescaped);
        System.out.println();
        System.out.println("Comparison:");
        System.out.println("Real String == Unescaped String: " + string.equals(unescaped));
        System.out.println("Real String == My Unescaped String: " + string.equals(myUnescaped));
        System.out.println("Unescaped String == My Unescaped String: " + unescaped.equals(myUnescaped));

    }

    public static String escapeHtml(String s) {
        String escaped = "";
        if(null != s) {
            escaped = StringEscapeUtils.escapeHtml(s);
            escaped = escaped.replaceAll(" ","&nbsp;");
            escaped = escaped.replaceAll("'","&#39;");
            escaped = escaped.replaceAll("\\\\","&#92;");
            escaped = escaped.replaceAll("/","&#47;");
        }
        return escaped;
    }

}

Output:

Real String:     4-Spaces    ,"Double Quote", 'Single Quote', \Back-Slash\, /Forward Slash/ 

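Escaped String:     4-Spaces    ,&quot;Double Quote&quot;, 'Single Quote', \Back-Slash\, /Forward Slash/ 
My Escaped String: &nbsp;&nbsp;&nbsp;&nbsp;4-Spaces&nbsp;&nbsp;&nbsp;&nbsp;,&quot;Double&nbsp;Quote&quot;,&nbsp;&#39;Single&nbsp;Quote&#39;,&nbsp;&#92;Back-Slash&#92;,&nbsp;&#47;Forward&nbsp;Slash&#47;&nbsp;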

Unescaped String:     4-Spaces    ,"Double Quote", 'Single Quote', \Back-Slash\, /Forward Slash/ 
My Unescaped String:     4-Spaces    ,"Double Quote", 'Single Quote', \Back-Slash\, /Forward Slash/ 

Comparison:
Real String == Unescaped String: true
Real String == My Unescaped String: false
Unescaped String == My Unescaped String: false

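escaped is the real string after escaping, and unescaped is escaped after unescaping. myEscaped is first escaped by the same process, and then a few more characters are replaced with their HTML codes. myUnescaped is myEscaped after unescaping, and its contents look the same as the real string.

The output shows that string, unescaped and myUnescaped have the same contents. However, as the comparison section shows, myUnescaped is equal to neither string nor unescaped.

I still don't understand what is actually happening here. Can someone explain?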

This happens because, when escaping the HTML, you replace ' ' with &nbsp;:

public static String escapeHtml(String s) {
        String escaped = "";
        if(null != s) {
            escaped = StringEscapeUtils.escapeHtml(s);
            escaped = escaped.replaceAll(" ","&nbsp;"); // HERE
            escaped = escaped.replaceAll("'","&#39;");
            escaped = escaped.replaceAll("\\\\","&#92;");
            escaped = escaped.replaceAll("/","&#47;");
        }
        return escaped;
    }

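StringEscapeUtils.escapeHtml, on the other hand, does not escape ' '. Here is the example from their site:

"bread" & "butter"

becomes:

&quot;bread&quot; &amp; &quot;butter&quot;

This means StringEscapeUtils.escapeHtml leaves spaces untouched, so on the way back &nbsp; is unescaped to a non-breaking space rather than to an ordinary space.

If you remove escaped = escaped.replaceAll(" ","&nbsp;"); from escapeHtml, unescaped and myUnescaped match!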
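Two strings that print identically can still differ in an invisible character. The following is a minimal sketch, using only the JDK, of one way to locate the first differing character and print its code point. The class name FirstDifference and the helper firstDifference are made up for illustration, and the sample strings stand in for string and myUnescaped.

```java
public class FirstDifference {

    /** Returns the index of the first differing char, or -1 if the strings are equal. */
    static int firstDifference(String a, String b) {
        int n = Math.min(a.length(), b.length());
        for (int i = 0; i < n; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return i;
            }
        }
        // Same prefix: the strings differ only if one is longer than the other.
        return a.length() == b.length() ? -1 : n;
    }

    public static void main(String[] args) {
        String original = "Single Quote";          // ordinary space, U+0020
        String roundTripped = "Single\u00A0Quote"; // non-breaking space, U+00A0

        int i = firstDifference(original, roundTripped);
        System.out.println("equals: " + original.equals(roundTripped));
        System.out.printf("index %d: U+%04X vs U+%04X%n",
                i, (int) original.charAt(i), (int) roundTripped.charAt(i));
    }
}
```

Printing code points like this makes the mismatch visible even when the console renders both characters as a blank.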

After Apurv's answer, I analyzed the byte arrays of the strings.

String:        32,  32,  32,  32,  52,  45,  83, 112,  97,  99, 101, 115,  32,  32,  32,  32,  44,  34,  68, 111, 117,  98, 108, 101,  32,  81, 117, 111, 116, 101,  34,  44,  32,  39,  83, 105, 110, 103, 108, 101,  32,  81, 117, 111, 116, 101,  39,  44,  32,  92,  66,  97,  99, 107,  45,  83, 108,  97, 115, 104,  92,  44,  32,  47,  70, 111, 114, 119,  97, 114, 100,  32,  83, 108,  97, 115, 104,  47,  32
unescaped :    32,  32,  32,  32,  52,  45,  83, 112,  97,  99, 101, 115,  32,  32,  32,  32,  44,  34,  68, 111, 117,  98, 108, 101,  32,  81, 117, 111, 116, 101,  34,  44,  32,  39,  83, 105, 110, 103, 108, 101,  32,  81, 117, 111, 116, 101,  39,  44,  32,  92,  66,  97,  99, 107,  45,  83, 108,  97, 115, 104,  92,  44,  32,  47,  70, 111, 114, 119,  97, 114, 100,  32,  83, 108,  97, 115, 104,  47,  32
myUnescaped:  -96, -96, -96, -96,  52,  45,  83, 112,  97,  99, 101, 115, -96, -96, -96, -96,  44,  34,  68, 111, 117,  98, 108, 101, -96,  81, 117, 111, 116, 101,  34,  44, -96,  39,  83, 105, 110, 103, 108, 101, -96,  81, 117, 111, 116, 101,  39,  44, -96,  92,  66,  97,  99, 107,  45,  83, 108,  97, 115, 104,  92,  44, -96,  47,  70, 111, 114, 119,  97, 114, 100, -96,  83, 108,  97, 115, 104,  47, -96

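It seems that in myUnescaped the spaces have been converted to byte -96 instead of 32. The value -96 is 0xA0 read as a signed byte, which is the non-breaking space (U+00A0) that &nbsp; unescapes to. It is not an ASCII character.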
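The byte values can be reproduced without any library. The sketch below assumes a single-byte charset such as ISO-8859-1, which matches the one-byte-per-character dump above; the class name NbspBytes and the helper latin1 are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class NbspBytes {

    /** Encodes the string as ISO-8859-1, one byte per character. */
    static byte[] latin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Ordinary space, U+0020
        System.out.println(Arrays.toString(latin1(" ")));      // [32]
        // Non-breaking space, U+00A0: 0xA0 = 160, which is -96 as a signed byte
        System.out.println(Arrays.toString(latin1("\u00A0"))); // [-96]
        // In UTF-8 the same character takes two bytes, 0xC2 0xA0
        System.out.println(Arrays.toString(
                "\u00A0".getBytes(StandardCharsets.UTF_8)));   // [-62, -96]
    }
}
```

Passing an explicit charset to getBytes keeps the result independent of the platform default, which is what the no-argument getBytes() relies on.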

So I wrote an unescapeHtml method as follows. It first replaces &nbsp; with an ordinary space and then uses StringEscapeUtils to unescape the HTML.

public static String unescapeHtml(String s) {
    String unescaped = "";
    if(null != s) {
        unescaped = s.replaceAll("&nbsp;", " ");
        unescaped = StringEscapeUtils.unescapeHtml(unescaped);
    }
    return unescaped;
}

Then I obtained myUnescaped with the following code:

String myUnescaped = unescapeHtml(myEscaped);

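This gave me a myUnescaped string that is equal to both string and unescaped.

Alternatively, I replaced the space with &#32; instead of &nbsp;, which does not require me to write an unescapeHtml method. The updated escapeHtml method is as follows.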

public static String escapeHtml(String s) {
    String escaped = "";
    if(null != s) {
        escaped = StringEscapeUtils.escapeHtml(s);
        escaped = escaped.replaceAll(" ","&#32;");    //updated line
        escaped = escaped.replaceAll("'","&#39;");
        escaped = escaped.replaceAll("\\\\","&#92;");
        escaped = escaped.replaceAll("/","&#47;");
    }
    return escaped;
}
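Since none of these replacements needs a regular expression, the same step could also be written with String.replace, which treats both arguments literally and avoids the quadruple backslash. This is only a sketch under assumptions: the class name LiteralEscapes and the helper escapeExtras are made up, the numeric references &#32;, &#39;, &#92; and &#47; are assumed to be the intended codes, and the call to StringEscapeUtils.escapeHtml is left out so the example runs with the JDK alone.

```java
public class LiteralEscapes {

    /**
     * Replaces space, single quote, backslash and forward slash with
     * numeric character references. Intended to run after the normal
     * HTML escaping step (omitted here).
     */
    static String escapeExtras(String escaped) {
        if (escaped == null) {
            return "";
        }
        return escaped
                .replace(" ", "&#32;")   // ordinary space, not &nbsp;
                .replace("'", "&#39;")
                .replace("\\", "&#92;")  // a single literal backslash
                .replace("/", "&#47;");
    }

    public static void main(String[] args) {
        System.out.println(escapeExtras("a 'b' \\c\\ /d/"));
        // prints: a&#32;&#39;b&#39;&#32;&#92;c&#92;&#32;&#47;d&#47;
    }
}
```

Because &#32; refers to U+0020, unescaping it yields an ordinary space, so the round-tripped string compares equal to the original.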

