How to replace characters in a Java String?
I would like to replace a certain set of characters in a string with corresponding replacement characters in an efficient way.
For example:
String sourceCharacters = "šđćčŠĐĆČžŽ";
String targetCharacters = "sdccSDCCzZ";
String result = replaceChars("Gračišće", sourceCharacters, targetCharacters);
assert result.equals("Gracisce");
Is there a more efficient way than using the replaceAll method of the String class?
My first idea was:
final String s = "Gračišće";
String sourceCharacters = "šđćčŠĐĆČžŽ";
String targetCharacters = "sdccSDCCzZ";
// preparation
final char[] sourceString = s.toCharArray();
final char result[] = new char[sourceString.length];
final char[] targetCharactersArray = targetCharacters.toCharArray();
// main work
for (int i = 0, l = sourceString.length; i < l; ++i) {
    // look up the current character in the source set; -1 means "keep as-is"
    final int pos = sourceCharacters.indexOf(sourceString[i]);
    result[i] = pos != -1 ? targetCharactersArray[pos] : sourceString[i];
}
// result
String resultString = new String(result);
Any ideas?
Btw, the UTF-8 characters are causing the trouble; with US_ASCII it works fine.
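If the trouble with the non-ASCII characters comes from the compiler reading the source file in a different encoding than it was saved in (an assumption; the question does not say), one way to rule that out is to write the characters as Unicode escapes, or to pass `-encoding UTF-8` to javac. The sketch below wraps the first idea into the `replaceChars` helper from the example; the class name `EscapedReplace` and the constants are made up for illustration.

```java
class EscapedReplace {
    // The ten source characters from the question, written as Unicode escapes
    // so the result does not depend on the encoding javac assumes for this file.
    static final String SOURCE =
            "\u0161\u0111\u0107\u010D\u0160\u0110\u0106\u010C\u017E\u017D";
    static final String TARGET = "sdccSDCCzZ";

    // Replaces every character of s that occurs in source with the character
    // at the same index in target; all other characters are kept unchanged.
    static String replaceChars(String s, String source, String target) {
        char[] result = s.toCharArray();
        for (int i = 0; i < result.length; i++) {
            int pos = source.indexOf(result[i]);
            if (pos != -1) {
                result[i] = target.charAt(pos);
            }
        }
        return new String(result);
    }

    public static void main(String[] args) {
        // The argument is the word from the question, again in escaped form.
        System.out.println(replaceChars("Gra\u010Di\u0161\u0107e", SOURCE, TARGET));
    }
}
```

The console encoding still matters when printing non-ASCII output, but the replacement result here is plain ASCII.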
You can make use of java.text.Normalizer and a shot of regex to get rid of the diacritics, of which there exist many more than you have collected so far.
Here's an SSCCE; copy'n'paste'n'run it on Java 6:
package com.stackoverflow.q2653739;

import java.text.Normalizer;
import java.text.Normalizer.Form;

public class Test {

    public static void main(String... args) {
        System.out.println(removeDiacriticalMarks("Gračišće"));
    }

    public static String removeDiacriticalMarks(String string) {
        return Normalizer.normalize(string, Form.NFD)
            .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }
}
This should yield:
Gracisce
At least, it does here in Eclipse with the console character encoding set to UTF-8 (Window > Preferences > General > Workspace > Text File Encoding). Ensure that the same is set in your environment as well.
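One caveat with the Normalizer approach: the letters đ (U+0111) and Đ (U+0110) from the question's list have no canonical decomposition, so NFD leaves them untouched and the regex has nothing to strip. A minimal sketch that handles them with two explicit replacements on top of the normalization follows; the class name `DiacriticsWithStroke` is made up for illustration, and other letters without a decomposition would need the same treatment.

```java
import java.text.Normalizer;

class DiacriticsWithStroke {
    static String removeDiacriticalMarks(String s) {
        // Decompose accented letters into base letter + combining mark,
        // then drop the combining marks.
        String stripped = Normalizer.normalize(s, Normalizer.Form.NFD)
                .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
        // U+0111 / U+0110 (d / D with stroke) do not decompose under NFD,
        // so they are mapped explicitly.
        return stripped.replace('\u0111', 'd').replace('\u0110', 'D');
    }

    public static void main(String[] args) {
        // The ten source characters from the question, as Unicode escapes.
        String source = "\u0161\u0111\u0107\u010D\u0160\u0110\u0106\u010C\u017E\u017D";
        System.out.println(removeDiacriticalMarks(source));
    }
}
```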
As an alternative, maintain a Map<Character, Character>:
Map<Character, Character> charReplacementMap = new HashMap<Character, Character>();
charReplacementMap.put('š', 's');
charReplacementMap.put('đ', 'd');
// Put more here.
String originalString = "Gračišće";
StringBuilder builder = new StringBuilder();
for (char currentChar : originalString.toCharArray()) {
    // fall back to the original character when no replacement is registered
    Character replacementChar = charReplacementMap.get(currentChar);
    builder.append(replacementChar != null ? replacementChar : currentChar);
}
String newString = builder.toString();
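If the replacements already exist as two parallel strings, as in the question, the map does not have to be filled by hand. The following is a rough sketch of that idea; `MapReplace`, `buildMap` and `replaceChars` are illustrative names, not part of any library.

```java
import java.util.HashMap;
import java.util.Map;

class MapReplace {
    // Builds the lookup map from two strings of equal length:
    // source.charAt(i) is mapped to target.charAt(i).
    static Map<Character, Character> buildMap(String source, String target) {
        if (source.length() != target.length()) {
            throw new IllegalArgumentException("source and target must have equal length");
        }
        Map<Character, Character> map = new HashMap<>();
        for (int i = 0; i < source.length(); i++) {
            map.put(source.charAt(i), target.charAt(i));
        }
        return map;
    }

    // Replaces each character that has an entry in the map; others are kept.
    static String replaceChars(String s, Map<Character, Character> map) {
        StringBuilder builder = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            Character replacement = map.get(c);
            if (replacement != null) {
                builder.append(replacement.charValue());
            } else {
                builder.append(c);
            }
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // Source characters and sample word from the question, as Unicode escapes.
        Map<Character, Character> map = buildMap(
                "\u0161\u0111\u0107\u010D\u0160\u0110\u0106\u010C\u017E\u017D",
                "sdccSDCCzZ");
        System.out.println(replaceChars("Gra\u010Di\u0161\u0107e", map));
    }
}
```

Building the map once and reusing it keeps the per-character cost at a single hash lookup, at the price of boxing each char.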
I'd use the replace method in a simple loop.
String sourceCharacters = "šđćčŠĐĆČžŽ";
String targetCharacters = "sdccSDCCzZ";
String s = "Gračišće";
for (int i = 0; i < sourceCharacters.length(); i++)
    s = s.replace(sourceCharacters.charAt(i), targetCharacters.charAt(i));
System.out.println(s);