Removing all non-letter characters from a string in C#

I want to remove all non-letter characters from a string. By "non-letter" I mean anything that isn't a letter of the alphabet or an apostrophe. This is the code I have.

public static string RemoveBadChars(string word)
{
    char[] chars = new char[word.Length];
    for (int i = 0; i < word.Length; i++)
    {
        char c = word[i];

        if ((int)c >= 65 && (int)c <= 90)
        {
            chars[i] = c;
        }
        else if ((int)c >= 97 && (int)c <= 122)
        {
            chars[i] = c;
        }
        else if ((int)c == 44)
        {
            chars[i] = c;
        }
    }

    word = new string(chars);

    return word;
}

It's close, but doesn't quite work. The problem is this:

[in]: "(the"
[out]: " the"

It gives me a space there instead of the "(". I want to remove the character entirely.

The Char class has a method that could help out. Use Char.IsLetter() to detect valid letters (with an additional check for the apostrophe), then pass the result to the string constructor:

// Requires: using System.Linq;
var input = "(the;':";

var result = new string(input.Where(c => Char.IsLetter(c) || c == '\'').ToArray());

Output:

the'
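One thing worth keeping in mind with this approach: Char.IsLetter is Unicode-aware, so it also keeps letters outside A-Z and a-z, such as "é". If only the ASCII range from the question is wanted, a minimal sketch might look like the following. The class and method names are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Linq;

public static class LetterFilter // hypothetical helper name
{
    // Keeps every Unicode letter plus the apostrophe.
    public static string KeepLetters(string input) =>
        new string(input.Where(c => char.IsLetter(c) || c == '\'').ToArray());

    // Keeps only A-Z, a-z and the apostrophe.
    public static string KeepAsciiLetters(string input) =>
        new string(input.Where(c =>
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '\'').ToArray());
}
```

For the input "(café)", KeepLetters returns "café" while KeepAsciiLetters returns "caf". Which one is right depends on whether accented letters count as letters for your data.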

You should use a regular expression (Regex) instead.

public static string RemoveBadChars(string word)
{
    Regex reg = new Regex("[^a-zA-Z']");
    return reg.Replace(word, string.Empty);
}

If you don't want to remove spaces:

Regex reg = new Regex("[^a-zA-Z' ]");

A regular expression would be better, as this is pretty inefficient. But to answer your question: the problem with your code is that you use i both to read from word and to write into chars, so every skipped character leaves an empty slot in the array. You should use a separate index variable for writing inside your for loop. So, something like this:

public static string RemoveBadChars(string word)
{
    char[] chars = new char[word.Length];
    int myindex=0;
    for (int i = 0; i < word.Length; i++)
    {
        char c = word[i];

        if ((int)c >= 65 && (int)c <= 90)
        {
            chars[myindex] = c;
            myindex++;
        }
        else if ((int)c >= 97 && (int)c <= 122)
        {
            chars[myindex] = c;
            myindex++;
        }
        else if (c == '\'') // apostrophe is code 39; 44 is the comma
        {
            chars[myindex] = c;
            myindex++;
        }
    }

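    // Copy only the characters actually written; the rest of the array is unused '\0' slots.
    word = new string(chars, 0, myindex);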

    return word;
}
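The same single-pass idea can also be written with a StringBuilder, which removes the need to track a write index by hand. This is only a sketch: the class name is hypothetical, and it checks for the apostrophe ('\'') as described in the question.

```csharp
using System.Text;

public static class WordCleaner // hypothetical name
{
    public static string RemoveBadChars(string word)
    {
        // Capacity is an upper bound: the result is never longer than the input.
        var sb = new StringBuilder(word.Length);
        foreach (char c in word)
        {
            bool isAsciiLetter = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
            if (isAsciiLetter || c == '\'')
            {
                sb.Append(c); // only kept characters are appended
            }
        }
        return sb.ToString();
    }
}
```

With this version, WordCleaner.RemoveBadChars("(the") returns "the", with no placeholder left where the "(" was.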

This is the working answer; he says he wants to remove non-letter chars. Note that \W matches non-word characters, so digits and underscores are kept and the apostrophe is removed.

public static string RemoveNoneLetterChars(string word)
{
    Regex reg = new Regex(@"\W");
    return reg.Replace(word, " "); // or return reg.Replace(word, String.Empty); 
}

private static Regex badChars = new Regex("[^A-Za-z']");

public static string RemoveBadChars(string word)
{
    return badChars.Replace(word, "");
}

This creates a regular expression that consists of a character class (enclosed in square brackets) that looks for anything that is not (the leading ^ inside the character class) A-Z, a-z, or '. It then defines a function that replaces anything that matches the expression with an empty string.
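To see what the negated character class does and does not match, here is a small sketch comparing it with the \W pattern used in another answer. The class and member names are illustrative only.

```csharp
using System.Text.RegularExpressions;

public static class RegexComparison // hypothetical name
{
    // Matches anything that is not an ASCII letter or an apostrophe.
    private static readonly Regex NotLetterOrApostrophe = new Regex("[^A-Za-z']");

    // \W matches anything that is not a "word" character
    // (word characters include letters, digits and the underscore).
    private static readonly Regex NonWord = new Regex(@"\W");

    public static string WithCharacterClass(string s) =>
        NotLetterOrApostrophe.Replace(s, string.Empty);

    public static string WithNonWord(string s) =>
        NonWord.Replace(s, string.Empty);
}
```

For the input "(don't_stop2)", WithCharacterClass returns "don'tstop", while WithNonWord returns "dont_stop2": the \W version drops the apostrophe and keeps the underscore and the digit.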

word.Aggregate(new StringBuilder(word.Length), (acc, c) => acc.Append(Char.IsLetter(c) ? c.ToString() : "")).ToString();

Or you can substitute any other function for IsLetter.
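Swapping in a different predicate, as suggested above, could be sketched by passing the test in as a parameter. The names CharFilter and Filter are assumptions made for this example, not an existing API.

```csharp
using System;
using System.Text;

public static class CharFilter // hypothetical name
{
    // Keeps only the characters for which 'keep' returns true.
    public static string Filter(string word, Func<char, bool> keep)
    {
        var sb = new StringBuilder(word.Length);
        foreach (char c in word)
        {
            if (keep(c))
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }
}
```

For example, CharFilter.Filter("(the;':", c => char.IsLetter(c) || c == '\'') returns "the'", and CharFilter.Filter("a1b2", c => char.IsDigit(c)) returns "12".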
