Regex match if a string has length 2 and contains 1 letter and 1 number
Guys, I hate Regex and I suck at writing them.
I have a space-separated string that contains several codes I need to pull out. Each code begins with a capital letter and ends with a number. Each code is only two characters long.
I'm trying to create an array of strings from the initial string and I can't get the regular expression right.
Here is what I have:
String[] test = Regex.Split(originalText, "([a-zA-Z0-9]{2})");
I also tried:
String[] test = Regex.Split(originalText, "([A-Z]{1}[0-9]{1})");
I don't have any experience with Regex as I try to avoid writing them whenever possible.
Anyone have any suggestions?
Example input:
AA2410 F7 A4 Y7 B7 A 0715 0836 E0.M80
I need to pull out F7, A4, B7. E0 should be ignored.
You want to collect the results, not split on them, right?
Regex regexObj = new Regex(@"\b[A-Z][0-9]\b");
MatchCollection allMatchResults = regexObj.Matches(subjectString);
should do this. The \b anchors are word boundaries, making sure that only entire strings (like A1) are extracted, not substrings (like the A1 in TWA101).
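To see what the word-boundary pattern collects, here is a minimal sketch run against the example input from the question. The class and method names (WordBoundaryDemo, FindCodes) are made up for illustration; only Regex.Matches and Match.Value are real .NET calls.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WordBoundaryDemo
{
    // Hypothetical helper: collects every "capital letter + digit" token
    // that is delimited by word boundaries on both sides.
    public static List<string> FindCodes(string text)
    {
        var codes = new List<string>();
        foreach (Match m in Regex.Matches(text, @"\b[A-Z][0-9]\b"))
        {
            codes.Add(m.Value);
        }
        return codes;
    }

    public static void Main()
    {
        string input = "AA2410 F7 A4 Y7 B7 A 0715 0836 E0.M80";
        // Prints: F7, A4, Y7, B7, E0
        Console.WriteLine(string.Join(", ", FindCodes(input)));
    }
}
```

Note that E0 is still picked up here, because the "." that follows it counts as a word boundary. Y7 is matched as well, since it fits the pattern described in the question.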
If you also need to exclude "words" with non-word characters in them (like E0.M80 in your comment), you need to define your own word boundary, for example:
Regex regexObj = new Regex(@"(?<=^|\s)[A-Z][0-9](?=\s|$)");
Now A1 only matches when surrounded by whitespace (or start/end-of-string positions).
Explanation:
(?<=     # Assert that we can match the following before the current position:
  ^      # start of string
  |      # or
  \s     # whitespace.
)
[A-Z]    # Match an uppercase ASCII letter
[0-9]    # Match an ASCII digit
(?=      # Assert that we can match the following after the current position:
  \s     # whitespace
  |      # or
  $      # end of string.
)
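The commented form above can be fed to .NET more or less as written, because RegexOptions.IgnorePatternWhitespace skips unescaped whitespace and treats # as the start of a comment. The following is only a sketch of that idea; the names WhitespaceBoundaryDemo and FindCodes are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WhitespaceBoundaryDemo
{
    // Same pattern as the one-line version, spread out with comments.
    private const string Pattern = @"
        (?<=     # before the current position there must be:
          ^      #   start of string
          |      #   or
          \s     #   whitespace
        )
        [A-Z]    # an uppercase ASCII letter
        [0-9]    # an ASCII digit
        (?=      # after the current position there must be:
          \s     #   whitespace
          |      #   or
          $      #   end of string
        )";

    private static readonly Regex CodeRegex =
        new Regex(Pattern, RegexOptions.IgnorePatternWhitespace);

    // Hypothetical helper: returns only the whitespace-delimited codes.
    public static List<string> FindCodes(string text)
    {
        var codes = new List<string>();
        foreach (Match m in CodeRegex.Matches(text))
        {
            codes.Add(m.Value);
        }
        return codes;
    }

    public static void Main()
    {
        string input = "AA2410 F7 A4 Y7 B7 A 0715 0836 E0.M80";
        // Prints: F7, A4, Y7, B7
        Console.WriteLine(string.Join(", ", FindCodes(input)));
    }
}
```

With this pattern E0 is no longer returned, because the character after it is "." rather than whitespace or the end of the string.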
If you also need to find non-ASCII letters/digits, you can use \p{Lu}\p{N} instead of [A-Z][0-9]. This finds all uppercase Unicode letters and Unicode digits (like Ä٣), but I guess that's not really what you're after, is it?
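For completeness, a small sketch of the Unicode variant, assuming the same whitespace lookarounds are kept around it. UnicodeCodesDemo and FindCodes are made-up names, and the sample input is invented; the non-ASCII code is written with \u escapes (U+00C4 is Ä, U+0663 is the Arabic-Indic digit three).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UnicodeCodesDemo
{
    // Hypothetical helper: any uppercase letter followed by any Unicode
    // number character, delimited by whitespace or the string ends.
    public static List<string> FindCodes(string text)
    {
        var codes = new List<string>();
        foreach (Match m in Regex.Matches(text, @"(?<=^|\s)\p{Lu}\p{N}(?=\s|$)"))
        {
            codes.Add(m.Value);
        }
        return codes;
    }

    public static void Main()
    {
        // Invented sample: one ASCII code, one non-ASCII code,
        // one lowercase token and the E0.M80 token from the question.
        string input = "F7 \u00C4\u0663 x9 E0.M80";
        // Prints: 2 (F7 and the non-ASCII code)
        Console.WriteLine(FindCodes(input).Count);
    }
}
```

\p{N} covers every Unicode number category, so it also accepts characters such as ½. If only decimal digits are wanted, \p{Nd} is the narrower class.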
Do you mean that each code looks like "A00"?
Then this is the regex:
"[A-Z][0-9][0-9]"
Very simple... By the way, there's no point writing {1} in a regex. [0-9]{1} means "match exactly one digit", which is exactly the same as writing [0-9].
Don't give up, simple regexes make perfect sense.
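The point about {1} can be checked with a quick sketch that runs both spellings over the same text and compares the matches. This only demonstrates the equivalence on one input rather than proving it, and QuantifierDemo and SameMatches are names invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Hypothetical helper: true if the pattern with {1} and the pattern
    // without it find the same matches at the same positions.
    public static bool SameMatches(string text)
    {
        MatchCollection withQuantifier = Regex.Matches(text, @"[A-Z]{1}[0-9]{1}");
        MatchCollection withoutQuantifier = Regex.Matches(text, @"[A-Z][0-9]");
        if (withQuantifier.Count != withoutQuantifier.Count)
        {
            return false;
        }
        for (int i = 0; i < withQuantifier.Count; i++)
        {
            if (withQuantifier[i].Index != withoutQuantifier[i].Index ||
                withQuantifier[i].Value != withoutQuantifier[i].Value)
            {
                return false;
            }
        }
        return true;
    }

    public static void Main()
    {
        // Prints: True
        Console.WriteLine(SameMatches("AA2410 F7 A4 Y7 B7 A 0715 0836 E0.M80"));
    }
}
```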
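This should be ok:
// Needs: using System.Linq; using System.Text.RegularExpressions;
String[] all_codes = Regex.Matches(originalText, @"\b[A-Z]\d\b").Cast<Match>().Select(m => m.Value).ToArray();
It gives you an array with every code that starts with a capital letter followed by a digit and is delimited by some kind of word boundary (whitespace etc.). Note that this uses Regex.Matches rather than Regex.Split: Split returns the text between the matches, not the codes themselves.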