String.replaceAll single backslashes with double backslashes

I'm trying to convert the String \something\ into the String \\something\\ using replaceAll, but I keep getting all kinds of errors. I thought this was the solution:

theString.replaceAll("\\", "\\\\");

But this gives the exception below:

java.util.regex.PatternSyntaxException: Unexpected internal error near index 1
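
For anyone who wants to reproduce the failure, here is a minimal sketch. The class and method names are made up for illustration, and it only checks that a PatternSyntaxException is thrown; the exact message text may differ between JDK versions, so it is not compared.

```java
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class, not part of the original question.
public class SingleBackslashRegexDemo {

    // Returns true if the one-character string \ is rejected as a regex.
    static boolean singleBackslashIsRejected() {
        try {
            // The Java literal "\\" is the single character \,
            // which is an incomplete escape sequence in regex syntax.
            "a\\b".replaceAll("\\", "\\\\");
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("regex rejected: " + singleBackslashIsRejected());
    }
}
```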

String#replaceAll() interprets its first argument as a regular expression. The \ is an escape character in both String literals and regex, so you need to double-escape it for regex:

string.replaceAll("\\\\", "\\\\\\\\");

But you don't necessarily need regex for this, simply because you want an exact character-by-character replacement and you don't need patterns here. So String#replace() should suffice:

string.replace("\\", "\\\\");
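
As a quick check that the regex call and the plain call give the same result, here is a small sketch. The class and method names are hypothetical; the only library calls are String#replaceAll and String#replace.

```java
// Hypothetical demo class comparing the two approaches.
public class BackslashDoublingDemo {

    // Regex-based version: both arguments need regex-level escaping
    // on top of the Java string-literal escaping.
    static String viaReplaceAll(String s) {
        return s.replaceAll("\\\\", "\\\\\\\\");
    }

    // Literal version: only Java string-literal escaping is needed.
    static String viaReplace(String s) {
        return s.replace("\\", "\\\\");
    }

    public static void main(String[] args) {
        String input = "\\something\\"; // the 11-character text \something\
        System.out.println(viaReplaceAll(input)); // prints \\something\\
        System.out.println(viaReplace(input));    // prints \\something\\
    }
}
```

Note that both methods return a new String, so the result has to be assigned or used; the original string is not modified.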

Update: as per the comments, you appear to want to use the string in a JavaScript context. You'd perhaps better use StringEscapeUtils#escapeEcmaScript() from Apache Commons Lang instead, to cover more characters.

To avoid this sort of trouble, you can use replace (which takes a plain string) instead of replaceAll (which takes a regular expression). You will still need to escape backslashes, but not in the wild ways required with regular expressions.

TLDR: use theString = theString.replace("\\", "\\\\"); instead.


Problem

replaceAll(target, replacement) uses regular expression (regex) syntax for target and partially for replacement.

The problem is that \ is a special character in regex (it can be used like \d to represent a digit) and in String literals (it can be used like "\n" to represent a line separator, or \" to escape the double quote symbol, which would normally mark the end of the string literal).

In both cases, to create a \ symbol we can escape it (make it a literal instead of a special character) by placing an additional \ before it (like we escape " in string literals via \").

So for target, the regex representing the \ symbol needs to hold \\, and the string literal representing that text needs to look like "\\\\".

So we escaped \ twice:

  • once in regex: \\
  • once in the String literal: "\\\\" (each \ is represented as "\\").

In the case of replacement, \ is also special. It allows us to escape the other special character, $, which via $x notation lets us use the portion of data matched by the regex and held by the capturing group indexed as x. For example, "012".replaceAll("(\\d)", "$1$1") will match each digit, place it in capturing group 1, and $1$1 will replace it with two copies of it (duplicating it), resulting in "001122".
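
The two roles of the replacement syntax can be seen side by side in a short sketch. The class and method names below are invented for illustration; the second method shows what happens when the $ is escaped with a backslash so that it is no longer a group reference.

```java
// Hypothetical demo class for the replacement-string syntax.
public class ReplacementSyntaxDemo {

    // $1 refers to the text matched by capturing group 1,
    // so every digit is written out twice.
    static String duplicateDigits(String s) {
        return s.replaceAll("(\\d)", "$1$1");
    }

    // The replacement \$1 (written "\\$1" in Java source) escapes the $,
    // so every digit is replaced by the two literal characters $1.
    static String literalDollarOne(String s) {
        return s.replaceAll("(\\d)", "\\$1");
    }

    public static void main(String[] args) {
        System.out.println(duplicateDigits("012"));  // prints 001122
        System.out.println(literalDollarOne("012")); // prints $1$1$1
    }
}
```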

So again, to let replacement represent a \ literal we need to escape it with an additional \, which means that:

  • replacement must hold two backslash characters \\
  • and the String literal which represents \\ looks like "\\\\"

BUT since we want the replacement to produce two backslashes, we will need "\\\\\\\\" (each \ represented by one "\\\\").

So the version with replaceAll can look like

replaceAll("\\\\", "\\\\\\\\");
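
If the backslash counting is hard to follow, it may help to look at how many characters each literal really contains at runtime. This is only an illustrative sketch with made-up class and method names.

```java
// Hypothetical demo class that counts the characters behind each literal.
public class EscapeCountingDemo {

    // "\\\\" in source is the 2-character regex \\ (one literal backslash).
    static int targetLength() {
        return "\\\\".length();
    }

    // "\\\\\\\\" in source is the 4-character replacement \\\\
    // (which produces two literal backslashes).
    static int replacementLength() {
        return "\\\\\\\\".length();
    }

    static String doubleBackslashes(String s) {
        return s.replaceAll("\\\\", "\\\\\\\\");
    }

    public static void main(String[] args) {
        System.out.println(targetLength());       // prints 2
        System.out.println(replacementLength());  // prints 4
        // a\b (3 characters) becomes a\\b (4 characters)
        System.out.println(doubleBackslashes("a\\b").length()); // prints 4
    }
}
```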

Easier way

To make our life easier, Java provides tools to automatically escape text for the target and replacement parts. So now we can focus only on strings and forget about regex syntax:

replaceAll(Pattern.quote(target), Matcher.quoteReplacement(replacement))

which in our case can look like

replaceAll(Pattern.quote("\\"), Matcher.quoteReplacement("\\\\"))
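
Wrapped into a helper, the quoting approach might look like the sketch below. The helper name replaceLiterally and the sample inputs are assumptions for illustration; Pattern.quote and Matcher.quoteReplacement are the standard java.util.regex methods.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class around the two quoting helpers.
public class QuoteHelpersDemo {

    // Treats both target and replacement as plain text,
    // even though replaceAll is regex-based.
    static String replaceLiterally(String text, String target, String replacement) {
        return text.replaceAll(Pattern.quote(target), Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // Doubles every backslash: prints \\something\\
        System.out.println(replaceLiterally("\\something\\", "\\", "\\\\"));
        // The $ in the replacement is taken literally: prints price: $5
        System.out.println(replaceLiterally("price: 5", "5", "$5"));
    }
}
```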

Even better

If we don't really need regex syntax support, let's not involve replaceAll at all. Instead, let's use replace. Both methods will replace all targets, but replace doesn't involve regex syntax. So you could simply write

theString = theString.replace("\\", "\\\\");

You'll need to escape the (escaped) backslash in the first argument, as it is a regular expression. The replacement (2nd argument, see Matcher#replaceAll(String)) also gives backslashes a special meaning, so you'll have to escape those too:

theString.replaceAll("\\\\", "\\\\\\\\");

Yes... by the time the regex compiler sees the pattern you've given it, it sees only a single backslash (since Java's lexer has turned the double backwhack into a single one). You need to replace "\\\\" with "\\\\\\\\", believe it or not! Java really needs a good raw string syntax.


 