
How to Split at Spaces and Other Non-Alphanumeric Characters

I'm setting up a new project in C# and want to split a string at the characters below, but the result still contains empty strings.

public static List<string> Tokinization(string stringy)
{
    List<string> terms = new List<string>();
    char[] seps = new char[] { ' ', ',', '.', '\n' };
    foreach (string ss in stringy.Split(seps))
    {
        terms.Add(ss);
    }
    return terms;
}

The input is :

stringy="Mountain bike"

The actual result is :

terms{"","","",.........,"","Mountain","bike"}

However, I expect the output to be terms{"Mountain","bike"}.

If you want to split at a whole newline sequence and not just '\n', you can use the overload that takes a string array and a StringSplitOptions value as arguments.

public static List<string> Tokinization(string stringy)
{
    List<string> terms = new List<string>();
    foreach (string ss in stringy.Split(new string[] { " ", ",", ".", Environment.NewLine }, StringSplitOptions.None))
    {
        terms.Add(ss);
    }
    return terms;
}
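
With StringSplitOptions.None the empty strings are still returned, because every separator that directly follows another separator (or starts the string) closes an empty token. A minimal sketch of that behaviour, assuming the real input had leading spaces (the two-space input below is a made-up example, not the asker's actual data):

```csharp
using System;

public static class EmptyTokenDemo
{
    public static void Main()
    {
        // Assumed input: two leading spaces before the first word.
        string stringy = "  Mountain bike";
        string[] parts = stringy.Split(new string[] { " " }, StringSplitOptions.None);

        // Each leading space closes an empty token before "Mountain".
        Console.WriteLine(parts.Length);            // 4
        Console.WriteLine(string.Join("|", parts)); // ||Mountain|bike
    }
}
```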

If you also want to omit empty tokens in general, pass StringSplitOptions.RemoveEmptyEntries instead.

public static List<string> Tokinization(string stringy)
{
    List<string> terms = new List<string>();
    foreach (string ss in stringy.Split(new string[] { " ", ",", ".", Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries))
    {
        terms.Add(ss);
    }
    return terms;
}
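
The versions above only split at the listed separators. If the goal is to split at every non-alphanumeric character, as the title suggests, one possible approach is a regular expression. This is only a sketch: the class and method names (TokenizerSketch, TokenizeAlphanumeric) are hypothetical, and it assumes "alphanumeric" means Unicode letters and decimal digits.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TokenizerSketch
{
    // Hypothetical helper: splits at every run of characters that are
    // neither letters (\p{L}) nor decimal digits (\p{Nd}).
    public static List<string> TokenizeAlphanumeric(string text)
    {
        List<string> terms = new List<string>();
        foreach (string part in Regex.Split(text, @"[^\p{L}\p{Nd}]+"))
        {
            // Regex.Split yields an empty string when the input starts
            // or ends with a separator, so skip those.
            if (part.Length > 0)
            {
                terms.Add(part);
            }
        }
        return terms;
    }

    public static void Main()
    {
        List<string> terms = TokenizeAlphanumeric("   Mountain bike, 26-inch.\r\n");
        Console.WriteLine(string.Join("|", terms)); // Mountain|bike|26|inch
    }
}
```

Because the pattern matches a whole run of separators at once, consecutive spaces or a "\r\n" pair do not produce empty tokens in the middle of the input; only the leading and trailing ones need to be filtered.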
