Replace a set of substrings in a String in a more efficient way?

Question

I have to replace a set of substrings in a String with other substrings, for example:

"^t" with "\\t"
"^=" with "\u2014"
"^+" with "\u2013"
"^s" with "\u00A0"
"^?" with "."
"^#" with "\\\\d"
"^$" with "[a-zA-Z]"

So I tried this:

String oppip = "pippo^t^# p^+alt^shefhjkhfjkdgfkagfafdjgbcnbch^";

Map<String,String> tokens = new HashMap<String,String>();
tokens.put("^t", "\t");
tokens.put("^=", "\u2014");
tokens.put("^+", "\u2013");
tokens.put("^s", "\u00A0");
tokens.put("^?", ".");
tokens.put("^#", "\\d");
tokens.put("^$", "[a-zA-Z]");

String regexp = "^t|^=|^+|^s|^?|^#|^$";

StringBuffer sb = new StringBuffer();
Pattern p = Pattern.compile(regexp);
Matcher m = p.matcher(oppip);
while (m.find())
    m.appendReplacement(sb, tokens.get(m.group()));
m.appendTail(sb);
System.out.println(sb.toString());

But it doesn't work: tokens.get(m.group()) throws an exception.

Any idea why?

Answer 1

You don't have to use a HashMap. Consider using simple arrays and a loop:

String oppip = "pippo^t^# p^+alt^shefhjkhfjkdgfkagfafdjgbcnbch^";

String[] searchFor   = {"^t", "^=", "^+", "^s", "^?", "^#", "^$"};
String[] replacement = {"\\t", "\\u2014", "\\u2013", "\\u00A0", ".", "\\d", "[a-zA-Z]"};

for (int i = 0; i < searchFor.length; i++)
    oppip = oppip.replace(searchFor[i], replacement[i]);

// Print the result.
System.out.println(oppip);

Here is an online code demo.

For completeness, you can use a two-dimensional array for a similar approach:

String oppip = "pippo^t^# p^+alt^shefhjkhfjkdgfkagfafdjgbcnbch^";

String[][] tasks =
{
    {"^t", "\\t"},
    {"^=", "\\u2014"}, 
    {"^+", "\\u2013"}, 
    {"^s", "\\u00A0"}, 
    {"^?", "."}, 
    {"^#", "\\d"}, 
    {"^$", "[a-zA-Z]"}
};

for (String[] replacement : tasks)
    oppip = oppip.replace(replacement[0], replacement[1]);

// Print the result.
System.out.println(oppip);
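
The loops above need no escaping because String.replace treats its first argument as literal text, whereas String.replaceAll interprets it as a regular expression. A minimal sketch of the difference follows; the class name LiteralReplaceDemo, its method names, and the "X" replacement are hypothetical and chosen only for illustration.

```java
import java.util.regex.Pattern;

class LiteralReplaceDemo {
    // String.replace: "^t" is searched for as the two literal characters ^ and t.
    static String literal(String s) {
        return s.replace("^t", "X");
    }

    // String.replaceAll: "^t" is a regex meaning "a t at the start of the input".
    static String asRegex(String s) {
        return s.replaceAll("^t", "X");
    }

    // Pattern.quote turns the search text back into a literal for regex-based methods.
    static String quoted(String s) {
        return s.replaceAll(Pattern.quote("^t"), "X");
    }

    public static void main(String[] args) {
        System.out.println(literal("a^tb"));  // literal match on ^t
        System.out.println(asRegex("a^tb"));  // no t at the start, nothing replaced
        System.out.println(asRegex("tea"));   // leading t replaced
        System.out.println(quoted("a^tb"));   // same result as literal()
    }
}
```

Note that each call to replace scans the whole string again, so the loop makes one pass per search token.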

Answer 2

In regex, ^ means "begin-of-text" (or, inside a character class, negation). You have to place a backslash before it, which becomes two backslashes in a Java String.

String regexp = "\\^[t=+s?#$]";

I have also shortened it by collecting the alternatives into a character class.
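
Putting the escaped pattern together with the question's token map, a single-pass version might look like the sketch below. The class name TokenReplacer and the method replaceTokens are hypothetical. The call to Matcher.quoteReplacement is an addition not mentioned in the answer: appendReplacement gives backslashes and dollar signs a special meaning in the replacement string, so a value such as "\\d" would otherwise lose its backslash.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenReplacer {
    // Same mappings as in the question.
    private static final Map<String, String> TOKENS = new HashMap<>();
    static {
        TOKENS.put("^t", "\t");
        TOKENS.put("^=", "\u2014");
        TOKENS.put("^+", "\u2013");
        TOKENS.put("^s", "\u00A0");
        TOKENS.put("^?", ".");
        TOKENS.put("^#", "\\d");
        TOKENS.put("^$", "[a-zA-Z]");
    }

    // A literal caret followed by exactly one of the token characters.
    private static final Pattern TOKEN_PATTERN = Pattern.compile("\\^[t=+s?#$]");

    // Hypothetical helper: replaces every token in one scan of the input.
    static String replaceTokens(String input) {
        Matcher m = TOKEN_PATTERN.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Every match is a key of TOKENS, so the lookup cannot return null.
            String replacement = TOKENS.get(m.group());
            // quoteReplacement keeps '\' and '$' in the replacement literal.
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String oppip = "pippo^t^# p^+alt^shefhjkhfjkdgfkagfafdjgbcnbch^";
        System.out.println(replaceTokens(oppip));
    }
}
```

The trailing ^ in the sample input is not followed by a token character, so it is left unchanged. Unlike the array loops, this version scans the input once and never re-processes text produced by an earlier replacement.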

Answer 1
6 2014-09-18 13:37:14

Answer 2
5 2014-09-18 13:36:33
