Removing all non-letter characters from a string in C#
I want to remove all non-letter characters from a string. By non-letter I mean anything that isn't a letter of the alphabet or an apostrophe. This is the code I have:
public static string RemoveBadChars(string word)
{
char[] chars = new char[word.Length];
for (int i = 0; i < word.Length; i++)
{
char c = word[i];
if ((int)c >= 65 && (int)c <= 90)
{
chars[i] = c;
}
else if ((int)c >= 97 && (int)c <= 122)
{
chars[i] = c;
}
else if ((int)c == 44)
{
chars[i] = c;
}
}
word = new string(chars);
return word;
}
It's close, but doesn't quite work. The problem is this:
[in]: "(the"
[out]: " the"
It gives me a space there instead of the "(". I want to remove the character entirely.
The Char class has a method that could help out. Use Char.IsLetter() to detect valid letters (with an additional check for the apostrophe), then pass the result to the string constructor:
var input = "(the;':";
var result = new string(input.Where(c => Char.IsLetter(c) || c == '\'').ToArray());
Output:
the'
You should use a regular expression (Regex) instead.
public static string RemoveBadChars(string word)
{
Regex reg = new Regex("[^a-zA-Z']");
return reg.Replace(word, string.Empty);
}
If you don't want to remove spaces:
Regex reg = new Regex("[^a-zA-Z' ]");
A regular expression would be better, as this is pretty inefficient, but to answer your question: the problem with your code is that you use the loop variable i to index the output array as well, so every skipped character leaves an empty slot. Use a separate index for the output. Something like this:
public static string RemoveBadChars(string word)
{
char[] chars = new char[word.Length];
int myindex=0;
for (int i = 0; i < word.Length; i++)
{
char c = word[i];
if ((int)c >= 65 && (int)c <= 90)
{
chars[myindex] = c;
myindex++;
}
else if ((int)c >= 97 && (int)c <= 122)
{
chars[myindex] = c;
myindex++;
}
else if ((int)c == 39) // 39 is the apostrophe; 44 would be a comma
{
chars[myindex] = c;
myindex++;
}
}
word = new string(chars, 0, myindex); // copy only the filled slots, not the trailing '\0' characters
return word;
}
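As a hedged alternative to tracking the output index by hand, the same loop can be sketched with a StringBuilder, which only grows when a character is kept. The names CharFilter and KeepLettersAndApostrophes are hypothetical, chosen for illustration; the sketch assumes only ASCII letters and the apostrophe should survive.

```csharp
using System.Text;

public static class CharFilter
{
    // Keeps ASCII letters and apostrophes; every other character is dropped.
    public static string KeepLettersAndApostrophes(string word)
    {
        var sb = new StringBuilder(word.Length);
        foreach (char c in word)
        {
            bool isUpper = c >= 'A' && c <= 'Z';
            bool isLower = c >= 'a' && c <= 'z';
            if (isUpper || isLower || c == '\'')
            {
                sb.Append(c); // only kept characters are appended, so no gaps appear
            }
        }
        return sb.ToString();
    }
}
```

With this sketch, CharFilter.KeepLettersAndApostrophes("(the") returns "the", with no placeholder character where the "(" used to be.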
This is a working answer if the goal is to remove non-word characters. Note that \W also keeps digits and underscores, and it removes apostrophes:
public static string RemoveNoneLetterChars(string word)
{
Regex reg = new Regex(@"\W");
return reg.Replace(word, " "); // or return reg.Replace(word, String.Empty);
}
private static Regex badChars = new Regex("[^A-Za-z']");
public static string RemoveBadChars(string word)
{
return badChars.Replace(word, "");
}
This creates a regular expression that consists of a character class (enclosed in square brackets) that matches anything that is not (the leading ^ inside the character class) A-Z, a-z, or '. It then defines a function that replaces anything that matches the expression with an empty string.
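To make the negated character class concrete, here is a small sketch that puts two patterns side by side. The names RegexFilter, RemoveNonAsciiLetters, and RemoveNonLetters are hypothetical. The second pattern is an assumption that goes beyond the answer above: it swaps the ASCII ranges for \p{L}, the Unicode letter category, in case the input may contain accented letters.

```csharp
using System.Text.RegularExpressions;

public static class RegexFilter
{
    // ^ negates the class: match anything that is NOT A-Z, a-z, or '.
    private static readonly Regex AsciiOnly = new Regex("[^A-Za-z']");

    // Assumption: \p{L} (any Unicode letter) replaces the ASCII ranges.
    private static readonly Regex AnyLetter = new Regex(@"[^\p{L}']");

    public static string RemoveNonAsciiLetters(string word)
    {
        return AsciiOnly.Replace(word, string.Empty);
    }

    public static string RemoveNonLetters(string word)
    {
        return AnyLetter.Replace(word, string.Empty);
    }
}
```

For example, RegexFilter.RemoveNonAsciiLetters("(it's)") returns "it's". The ASCII version drops a precomposed "é" from "café!", while the \p{L} version keeps it.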
word.Aggregate(new StringBuilder(word.Length), (acc, c) => acc.Append(Char.IsLetter(c) ? c.ToString() : "")).ToString();
Or you can substitute any other function for IsLetter.
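One way to make the test swappable, sketched under the assumption that a plain loop is acceptable in place of Aggregate, is to pass the check in as a Func<char, bool>. The names PredicateFilter and Filter are hypothetical.

```csharp
using System;
using System.Text;

public static class PredicateFilter
{
    // Keeps every character for which 'keep' returns true.
    public static string Filter(string word, Func<char, bool> keep)
    {
        var sb = new StringBuilder(word.Length);
        foreach (char c in word)
        {
            if (keep(c))
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }
}
```

Calling PredicateFilter.Filter("(the;':", c => char.IsLetter(c) || c == '\'') returns "the'", and passing char.IsDigit instead keeps only the digits.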