How to separate character and number part from string

For example, I would like to separate:

  • OS234 into OS and 234
  • AA4230 into AA and 4230

I have used the following trivial solution, but I am quite sure there should be a more efficient and robust one.

private void demo()
    {   string cell="ABCD4321";
        int a = getIndexofNumber(cell);
        string Numberpart = cell.Substring(a, cell.Length - a);
        int row = Convert.ToInt32(Numberpart);
        string Stringpart = cell.Substring(0, a);
    }

private int getIndexofNumber(string cell)
        {
            int a = -1, indexofNum = 10000;
            a = cell.IndexOf("0"); if (a > -1) { if (indexofNum > a) { indexofNum = a; } }
            a = cell.IndexOf("1"); if (a > -1) { if (indexofNum > a) { indexofNum = a; } }
            a = cell.IndexOf("2"); if (a > -1) { if (indexofNum > a) { indexofNum = a; } }
            a = cell.IndexOf("3"); if (a > -1) { if (indexofNum > a) { indexofNum = a; } }
            a = cell.IndexOf("4"); if (a > -1) { if (indexofNum > a) { indexofNum = a; } }
            a = cell.IndexOf("5"); if (a > -1) { if (indexofNum > a) { indexofNum = a; } }
            a = cell.IndexOf("6"); if (a > -1) { if (indexofNum > a) { indexofNum = a; } }
            a = cell.IndexOf("7"); if (a > -1) { if (indexofNum > a) { indexofNum = a; } }
            a = cell.IndexOf("8"); if (a > -1) { if (indexofNum > a) { indexofNum = a; } }
            a = cell.IndexOf("9"); if (a > -1) { if (indexofNum > a) { indexofNum = a; } }

            if (indexofNum != 10000)
            { return indexofNum; }
            else
            { return 0; }


        }

Regular expressions are best suited for this kind of work:

using System.Text.RegularExpressions;

string input = "OS234"; // example input
Regex re = new Regex(@"([a-zA-Z]+)(\d+)");
Match result = re.Match(input);

string alphaPart = result.Groups[1].Value;
string numberPart = result.Groups[2].Value;
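
The snippet above does not check whether the match succeeded, and the unanchored pattern will also match in the middle of a longer string. A minimal sketch of a guarded variant follows, assuming the input is expected to be exactly ASCII letters followed by ASCII digits. The names CellSplitter and TrySplit are made up for illustration and are not part of the answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not from the original answer.
public static class CellSplitter
{
    // Anchored: the whole input must be letters followed by digits.
    private static readonly Regex Pattern = new Regex(@"^([A-Za-z]+)([0-9]+)\z");

    // Returns true and fills the out parameters only when the input matches
    // and the digit run fits into an int; otherwise returns false.
    public static bool TrySplit(string input, out string letters, out int number)
    {
        letters = string.Empty;
        number = 0;
        if (input == null) return false;

        Match m = Pattern.Match(input);
        if (!m.Success) return false;

        // TryParse guards against digit runs that overflow an int.
        if (!int.TryParse(m.Groups[2].Value, out number)) return false;

        letters = m.Groups[1].Value;
        return true;
    }
}
```

Calling CellSplitter.TrySplit("OS234", out letters, out number) would give "OS" and 234, while an input such as "1234" or "OS" returns false instead of silently producing empty groups.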

Use LINQ to do this:

string str = "OS234";

var digits = from c in str
             where Char.IsDigit(c)
             select c;

var alphas = from c in str
             where !Char.IsDigit(c)
             select c;
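
Both queries produce sequences of characters rather than strings. One way to turn them into strings is sketched below in method syntax. LinqSplitter, Digits and NonDigits are hypothetical names used only for this example.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not from the original answer.
public static class LinqSplitter
{
    // Collects every digit of the input, in order, into one string.
    public static string Digits(string s)
    {
        return new string(s.Where(c => char.IsDigit(c)).ToArray());
    }

    // Collects every non-digit character of the input, in order.
    public static string NonDigits(string s)
    {
        return new string(s.Where(c => !char.IsDigit(c)).ToArray());
    }
}
```

Keep in mind that filtering ignores position: for "OS234" this gives "234" and "OS", but for "H1N1" it gives "11" and "HN", so it only fits inputs where the letters and digits are not interleaved.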

Everyone and their mother will give you a solution using regex, so here's one that does not:

 // s is string of form ([A-Za-z])*([0-9])* ; char added
 int index = s.IndexOfAny(new char[] { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' });
 string chars = s.Substring(0, index);
 int num = Int32.Parse(s.Substring(index));
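
If the input might not contain a digit, IndexOfAny returns -1 and the Substring call above throws. A rough sketch of the same idea with those cases guarded is shown here. IndexSplitter and TrySplit are hypothetical names, and the input is assumed to be letters followed by ASCII digits.

```csharp
using System;

// Hypothetical helper, not from the original answer.
public static class IndexSplitter
{
    private static readonly char[] AsciiDigits = "0123456789".ToCharArray();

    // Splits s at its first digit. Returns false when s is null or empty,
    // has no digit, or the tail after the first digit is not a valid int.
    public static bool TrySplit(string s, out string chars, out int num)
    {
        chars = string.Empty;
        num = 0;
        if (string.IsNullOrEmpty(s)) return false;

        int index = s.IndexOfAny(AsciiDigits);
        if (index < 0) return false; // no digit: Substring(0, -1) would throw

        if (!int.TryParse(s.Substring(index), out num)) return false;

        chars = s.Substring(0, index);
        return true;
    }
}
```

An input such as "H1N1" returns false here, because the tail "1N1" is not a number.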

If you want to handle multiple occurrences of letters followed by digits, or vice versa, you can use:

private string SplitCharsAndNums(string text)
{
    var sb = new StringBuilder();
    for (var i = 0; i < text.Length - 1; i++)
    {
        if ((char.IsLetter(text[i]) && char.IsDigit(text[i+1])) ||
            (char.IsDigit(text[i]) && char.IsLetter(text[i+1])))
        {
            sb.Append(text[i]);
            sb.Append(" ");
        }
        else
        {
            sb.Append(text[i]);
        }
    }

    sb.Append(text[text.Length-1]);

    return sb.ToString();
}

And then:

var text = SplitCharsAndNums("asd1 asas4gr5 6ssfd");
var tokens = text.Split(' ');

I really like jason's answer. Let's improve it a bit. We don't need regex here. My solution handles input like "H1N1":

public static IEnumerable<string> SplitAlpha(string input)
{
    var words = new List<string> { string.Empty };
    for (var i = 0; i < input.Length; i++)
    {
        words[words.Count-1] += input[i];
        if (i + 1 < input.Length && char.IsLetter(input[i]) != char.IsLetter(input[i + 1]))
        {
            words.Add(string.Empty);
        }
    }
    return words;
}

The loop is linear, O(n): it visits each character once.

Output:

"H1N1" -> ["H", "1", "N", "1"]
"H" -> ["H"]
"GH1N12" -> ["GH", "1", "N", "12"]
"OS234" -> ["OS", "234"]

Same solution with a StringBuilder:

public static IEnumerable<string> SplitAlpha(string input)
{
    var words = new List<StringBuilder>{new StringBuilder()};
    for (var i = 0; i < input.Length; i++)
    {
        words[words.Count - 1].Append(input[i]);
        if (i + 1 < input.Length && char.IsLetter(input[i]) != char.IsLetter(input[i + 1]))
        {
            words.Add(new StringBuilder());
        }
    }

    return words.Select(x => x.ToString());
}

Are you doing this for sorting purposes? If so, keep in mind that Regex can kill performance for large lists. I frequently use an AlphanumComparer, a general solution to this problem that can handle any sequence of letters and numbers in any order. I believe I adapted it from this page.

Even if you're not sorting on it, the character-by-character approach (if the lengths vary) or a simple substring/parse (if they're fixed) will be a lot more efficient and easier to test than a Regex.
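
To illustrate the character-by-character idea for sorting, here is a rough sketch of a natural-order comparer. It is not the AlphanumComparer mentioned above, whose source is not shown in this thread. The class name NaturalComparer is made up, and the sketch assumes ASCII digits and ordinal comparison for everything else.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer for illustration; not the AlphanumComparer from the answer.
public sealed class NaturalComparer : IComparer<string>
{
    private static bool IsAsciiDigit(char c)
    {
        return c >= '0' && c <= '9';
    }

    public int Compare(string x, string y)
    {
        if (ReferenceEquals(x, y)) return 0;
        if (x == null) return -1;
        if (y == null) return 1;

        int i = 0, j = 0;
        while (i < x.Length && j < y.Length)
        {
            if (IsAsciiDigit(x[i]) && IsAsciiDigit(y[j]))
            {
                // Find the end of each digit run.
                int endX = i; while (endX < x.Length && IsAsciiDigit(x[endX])) endX++;
                int endY = j; while (endY < y.Length && IsAsciiDigit(y[endY])) endY++;

                // Skip leading zeros, keeping at least one digit per run.
                while (i < endX - 1 && x[i] == '0') i++;
                while (j < endY - 1 && y[j] == '0') j++;

                // With zeros stripped, the longer run is the larger number.
                int lenX = endX - i, lenY = endY - j;
                if (lenX != lenY) return lenX < lenY ? -1 : 1;

                // Equal length: ordinal order of the digits is numeric order.
                int cmp = string.CompareOrdinal(x, i, y, j, lenX);
                if (cmp != 0) return cmp < 0 ? -1 : 1;

                i = endX;
                j = endY;
            }
            else
            {
                // Non-digit position: plain ordinal character comparison.
                if (x[i] != y[j]) return x[i] < y[j] ? -1 : 1;
                i++;
                j++;
            }
        }

        // Whichever string still has characters left sorts last.
        return (x.Length - i).CompareTo(y.Length - j);
    }
}
```

Sorting a list with list.Sort(new NaturalComparer()) should then place "OS2" before "OS10", where a plain ordinal sort puts "OS10" first. Digit runs are never parsed, so very long numbers cannot overflow. Note that this sketch treats "A01" and "A1" as equal.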

.NET 2.0 compatible, without regex:

public class Result
{
    private string _StringPart;
    public string StringPart
    {
        get { return _StringPart; }
    }

    private int _IntPart;
    public int IntPart
    {
        get { return _IntPart; }
    }

    public Result(string stringPart, int intPart)
    {
        _StringPart = stringPart;
        _IntPart = intPart;
    }
}

class Program
{
    public static Result GetResult(string source)
    {
        string stringPart = String.Empty;
        int intPart;
        var buffer = new StringBuilder();
        foreach (char c in source)
        {
            if (Char.IsDigit(c))
            {
               if (stringPart == String.Empty)
               {
                    stringPart = buffer.ToString();
                    buffer.Remove(0, buffer.Length);
                }
            }

            buffer.Append(c);
        }

        if (!int.TryParse(buffer.ToString(), out intPart))
        {
            return null;
        }

        return new Result(stringPart, intPart); 
    }

    static void Main(string[] args)
    {
        Result result = GetResult("OS234");
        Console.WriteLine("String part: {0} int part: {1}", result.StringPart, result.IntPart);
        result = GetResult("AA4230 ");
        Console.WriteLine("String part: {0} int part: {1}", result.StringPart, result.IntPart);
        result = GetResult("ABCD4321");
        Console.WriteLine("String part: {0} int part: {1}", result.StringPart, result.IntPart);
        Console.ReadKey();
    }
}

I have used bniwredyc's answer to get an improved version of my routine:

    private void demo()
        {
            string cell = "ABCD4321";
            int row, a = getIndexofNumber(cell);
            string Numberpart = cell.Substring(a, cell.Length - a);
            row = Convert.ToInt32(Numberpart);
            string Stringpart = cell.Substring(0, a);
        }

        private int getIndexofNumber(string cell)
        {
            int indexofNum=-1;
            foreach (char c in cell)
            {
                indexofNum++;
                if (Char.IsDigit(c))
                {
                    return indexofNum;
                }
             }
            return indexofNum;
        }

Just use the substring function and pass the start and end positions in the parentheses (the example below is Java):

 String id = "DON123";
 System.out.println("Id number is: " + id.substring(3, 6));

Output:

 Id number is: 123

Use Split to separate a string whose parts are delimited by tab (\t) or space characters:

string s = "sometext\tsometext\tsometext";
string[] split = s.Split('\t');

Now you have the array of strings you wanted. Easy.
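
The snippet above splits on tabs only. A small sketch that splits on both tabs and spaces and drops empty entries could look like this. WhitespaceSplitter is a hypothetical name. Note that this only helps when the parts are already separated by whitespace, which is not the case for inputs like "OS234".

```csharp
using System;

// Hypothetical helper, not from the original answer.
public static class WhitespaceSplitter
{
    // Splits on tab and space; consecutive separators do not yield empty strings.
    public static string[] SplitOnTabsAndSpaces(string s)
    {
        return s.Split(new[] { '\t', ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }
}
```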
