
How to split a string on numbers and substrings?

How can I split a string into its numbers and the substrings between them?

Input: 34AG34A
Expected output: {"34","AG","34","A"}

I have tried the Regex.Split() function, but I cannot figure out what pattern would work.

Any ideas?

The regular expression (\d+|[A-Za-z]+) will return the groups you need.

I think you have to look for two patterns:

  • a sequence of digits
  • a sequence of letters

Hence, I'd use ([a-z]+)|([0-9]+).

For instance, System.Text.RegularExpressions.Regex.Matches("asdf1234be56qq78", "([a-z]+)|([0-9]+)") returns 6 matches, containing "asdf", "1234", "be", "56", "qq", "78".
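Because the pattern above wraps each alternative in its own capture group, you can also tell which kind of run each match was. The following is only a sketch: the class and method names are made up for illustration, and A-Z has been added to the letter class (my assumption) so that upper-case input such as the question's also matches.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class TokenKinds
{
    // Group 1 captures a run of letters, group 2 captures a run of digits.
    // A-Z is an addition so that upper-case input is matched as well.
    private static readonly Regex Pattern = new Regex("([A-Za-z]+)|([0-9]+)");

    // Returns each token prefixed with the kind of run it came from.
    public static List<string> Describe(string input)
    {
        var result = new List<string>();
        foreach (Match m in Pattern.Matches(input))
        {
            // Only the group of the alternative that matched reports Success.
            string kind = m.Groups[1].Success ? "letters" : "digits";
            result.Add(kind + ":" + m.Value);
        }
        return result;
    }

    public static void Main()
    {
        // Prints: letters:asdf, digits:1234, letters:be, digits:56, letters:qq, digits:78
        Console.WriteLine(string.Join(", ", Describe("asdf1234be56qq78")));
    }
}
```

Characters that are neither ASCII letters nor ASCII digits are silently skipped by this pattern, which may or may not be what you want.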

First, you ask for "numbers" but don't specify what you mean by that.

If you mean "digits in 0-9" then you need the character class [0-9]. There is also the character class \d, which matches some other characters in addition to 0-9.

\d matches any decimal digit. It is equivalent to the \p{Nd} regular expression pattern, which includes the standard decimal digits 0-9 as well as the decimal digits of a number of other character sets.
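To make the difference concrete, here is a small sketch comparing \d with [0-9] on Arabic-Indic digits, which belong to the Nd category. The helper name AllDigits is hypothetical. The last call uses RegexOptions.ECMAScript, under which .NET treats \d as equivalent to [0-9].

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, used only to illustrate the point above.
public static class DigitClasses
{
    // True when the whole text consists of characters matched by digitPattern.
    public static bool AllDigits(string text, string digitPattern,
                                 RegexOptions options = RegexOptions.None)
    {
        return Regex.IsMatch(text, "^(?:" + digitPattern + ")+$", options);
    }

    public static void Main()
    {
        // ARABIC-INDIC DIGIT THREE and FOUR (Unicode category Nd).
        string arabicIndic = "\u0663\u0664";

        Console.WriteLine(AllDigits("34", @"\d"));          // True
        Console.WriteLine(AllDigits("34", "[0-9]"));        // True
        Console.WriteLine(AllDigits(arabicIndic, @"\d"));   // True
        Console.WriteLine(AllDigits(arabicIndic, "[0-9]")); // False
        Console.WriteLine(AllDigits(arabicIndic, @"\d", RegexOptions.ECMAScript)); // False
    }
}
```

If the input should only ever contain ASCII digits, [0-9] states that intent most directly.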

I assume that you are not interested in negative numbers, numbers containing a decimal point, foreign numerals such as 五, etc.

Split is not the right solution here. What you appear to want to do is tokenize the string, not split it. You can do this by using Matches instead of Split:

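// Requires: using System.Linq; using System.Text.RegularExpressions;
// s is the input string, e.g. "34AG34A".
string[] output = Regex.Matches(s, "[0-9]+|[^0-9]+")
    .Cast<Match>()
    .Select(match => match.Value)
    .ToArray();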

Don't use Regex.Split; use Regex.Match:

var m = Regex.Match("34AG34A", "([0-9]+|[A-Z]+)");
while (m.Success) {
    Console.WriteLine(m);
    m = m.NextMatch();
}

Converting this to an array is left as an exercise to the reader. :-)
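For readers who would rather skip the exercise, one possible way to do it is to collect the matches in a list while walking them with NextMatch. This is only a sketch; the Tokenizer class and Tokenize method are names invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the Match/NextMatch loop shown above.
public static class Tokenizer
{
    // Walks the matches one by one and returns their values as an array.
    public static string[] Tokenize(string input)
    {
        var tokens = new List<string>();
        Match m = Regex.Match(input, "[0-9]+|[A-Z]+");
        while (m.Success)
        {
            tokens.Add(m.Value);
            m = m.NextMatch();
        }
        return tokens.ToArray();
    }

    public static void Main()
    {
        // Prints: 34, AG, 34, A
        Console.WriteLine(string.Join(", ", Tokenize("34AG34A")));
    }
}
```

Note that [A-Z] only covers upper-case ASCII letters, so lower-case letters in the input would be dropped.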
