
How to exclude an escape character from being treated as an escape character

I have a Java string:

String t = "Region S\u00FCdost SER";

where \u00FC is the escape sequence for the Unicode character "ü".

If I add a new escape character to the string above, I still want the function below to escape the other characters while leaving the existing escape sequence alone.

For example, re-running the function below on its own output returns Region S\\u00FCdost SER, and then Region S\\\\u00FCdost SER on the next iteration: the backslashes double on every pass.

How do we prevent this?

public static String escapeString(String str)
    {
        StringBuilder result = new StringBuilder();

        // A char is 16 bits long and holds one UTF-16 code unit.
        // This loop iterates over chars, not code points, so a surrogate pair
        // is written out as two separate escapes.
        for (int i = 0; i < str.length(); i++)
        {
            char c = str.charAt(i);
            switch (c) {

            case '"':
                result.append("\\\""); //$NON-NLS-1$
                break;
            case '\b':
                result.append("\\b"); //$NON-NLS-1$
                break;
            case '\t':
                result.append("\\t"); //$NON-NLS-1$
                break;
            case '\n':
                result.append("\\n"); //$NON-NLS-1$
                break;
            case '\f':
                result.append("\\f"); //$NON-NLS-1$
                break;
            case '\r':
                result.append("\\r"); //$NON-NLS-1$
                break;
            case '\'':
                result.append("\\'"); //$NON-NLS-1$
                break;
            case '\\':
                result.append("\\\\"); //$NON-NLS-1$
                break;

            default:
                if (c < 128)
                {
                    //is ascii
                    result.append(c);
                }
                else
                {
                    result.append(
                            String.format("\\u%04X", (int) c)); //$NON-NLS-1$
                }
            }
        }

        return result.toString();
    }
}

You can do:

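case '\\':
    // Keep the backslash that starts an existing Unicode escape; double any other backslash.
    // The bounds check avoids an exception when the backslash is the last character.
    if (i + 1 < str.length() && str.charAt(i + 1) == 'u')
        result.append("\\");
    else
        result.append("\\\\");
    break;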

This assumes that \u will always denote a Unicode escape sequence in your string.
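A slightly stricter variant of that idea can be sketched as follows. It treats a backslash as the start of an existing escape only when it is followed by u and exactly four hex digits, and copies that sequence through unchanged. The class name UnicodeAwareEscaper and the methods isUnicodeEscapeAt and escapeOnce are hypothetical names chosen for illustration, and only the backslash, double-quote and non-ASCII cases from the question are reproduced to keep the sketch short.

```java
// Hypothetical helper; all names here are illustrative, not from the original post.
final class UnicodeAwareEscaper {

    // Strict ASCII hex-digit check.
    static boolean isHexDigit(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    // True if str contains a backslash, the letter u and four hex digits starting at index i.
    static boolean isUnicodeEscapeAt(String str, int i) {
        if (i + 6 > str.length() || str.charAt(i) != '\\' || str.charAt(i + 1) != 'u') {
            return false;
        }
        for (int j = i + 2; j < i + 6; j++) {
            if (!isHexDigit(str.charAt(j))) {
                return false;
            }
        }
        return true;
    }

    // Escapes str, but copies already-present Unicode escapes through unchanged.
    static String escapeOnce(String str) {
        StringBuilder result = new StringBuilder();
        int i = 0;
        while (i < str.length()) {
            if (isUnicodeEscapeAt(str, i)) {
                result.append(str, i, i + 6); // existing escape: keep as-is
                i += 6;
                continue;
            }
            char c = str.charAt(i);
            switch (c) {
                case '\\':
                    result.append("\\\\");
                    break;
                case '"':
                    result.append("\\\"");
                    break;
                default:
                    if (c < 128) {
                        result.append(c); // plain ASCII
                    } else {
                        result.append(String.format("\\u%04X", (int) c));
                    }
            }
            i++;
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String once = escapeOnce("Region S\u00FCdost SER");
        System.out.println(once);
        System.out.println(escapeOnce(once)); // same text as the line above
    }
}
```

Under these assumptions, both lines printed by main read Region S\u00FCdost SER. The trade-off is the one already noted: text that genuinely contains a backslash followed by u and four hex digits cannot be told apart from an escape, and quotes or other backslashes are still escaped again on every pass.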

When you write a Java string literal as "Region S\u00FCdost SER", the Java compiler interprets it as the string value Region Südost SER, and that value is what the escapeString() method sees when it is called on t.

If you wanted the string value Region S\u00FCdost SER, you should have escaped the backslash in the literal, i.e. "Region S\\u00FCdost SER".
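The difference between the two literals can be checked directly. The following is a minimal sketch; the class name LiteralVersusValue and the helper containsBackslash are hypothetical and exist only for this illustration.

```java
// Hypothetical demo class; names are illustrative only.
final class LiteralVersusValue {

    // True if the runtime string value contains a backslash character.
    static boolean containsBackslash(String s) {
        return s.indexOf('\\') >= 0;
    }

    public static void main(String[] args) {
        String compiled = "S\u00FCd";  // escape resolved by the compiler: 3 chars
        String escaped = "S\\u00FCd";  // escaped backslash kept in the value: 8 chars

        System.out.println(compiled.length() + " " + containsBackslash(compiled));
        System.out.println(escaped.length() + " " + containsBackslash(escaped));
    }
}
```

The first literal yields a three-character value with no backslash in it, so there is nothing for the escaping method to double; the second yields eight characters including one real backslash.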

If you keep running the escapeString() method on its own output, I believe you'll see what you want.

String s = "Region S\u00FCdost SER";
System.out.println(s); // print original text
for (int i = 0; i < 4; i++) {
    s = escapeString(s);
    System.out.println(s);
}

Output:

Region Südost SER                           <-- original text
Region S\u00FCdost SER
Region S\\u00FCdost SER
Region S\\\\u00FCdost SER
Region S\\\\\\\\u00FCdost SER

If you change the input to "He'd say: \"Bitte schön\"", you get:

He'd say: "Bitte schön"                     <-- original text
He\'d say: \"Bitte sch\u00F6n\"
He\\\'d say: \\\"Bitte sch\\u00F6n\\\"
He\\\\\\\'d say: \\\\\\\"Bitte sch\\\\u00F6n\\\\\\\"
He\\\\\\\\\\\\\\\'d say: \\\\\\\\\\\\\\\"Bitte sch\\\\\\\\u00F6n\\\\\\\\\\\\\\\"

I mean, this is what you wanted, right? If not, please clarify the question by showing example output of what you want.
