Filtering on full string match but not on substrings
I have a long string of numbers and characters and I'd like to extract one substring from it. What I'm struggling with is that I need a full match on a certain value (starting with S), and that value must not be matched when it is only the beginning of another value.
Input:
S10 1+0000000297472+00EURS100 1+0000000297472+00EURS1023P 1+0000000816072+00EUR
The input is exactly like this.
Breakdown of input:
S10 1+0000000297472+00EUR
=>
I need to match on, for example, S10, and I only want the substring from there up to and including EUR. I don't want it to match on S100 or S1023P or any other combination, only on exactly S10.
Output:
S10 1+0000000297472+00EUR
I'm trying to use Regex to find my match on 'S' + code. I do a full match on my search query and then reject the match as soon as anything else follows it directly. But doing it like this also discards the actual match, because the value that follows S10 also matches ([^\d|^\D])+\w
foreach (var field in fieldList)
{
    var query = "S" + field.BallanceCode;
    var index = Regex.Match(values, Regex.Escape(query) + @"([^\d|^\D])+\w").Index;
}
For example, when looking for S10
needs to match:
S10 1+0000000297472+00EUR
must not match:
S10/15 1+0000001748447+00EUR
S1023P 1+0000000816072+00EUR
S10000001+0000000546546+00EUR
Update:
Using this code
var index = Regex.Match(values, Regex.Escape(query) + @"\p{Zs}.*?EUR").Index;
will find S10, S10/15, etc. when they are searched for. However, looking for S1000000 in the string doesn't work, because there is no whitespace between the code and 1+:
S10000001+0000000546546+00EUR
For example, when looking for S1000000
needs to match:
S10000001+0000000297472+00EUR
must not match:
S10 1+0000001748447+00EUR
S1023P 1+0000000816072+00EUR
S10/15 1+0000000546546+00EUR
You can use a regex that requires a space to appear right after the field.BallanceCode:
var index = Regex.Match(values, Regex.Escape(query) + (field.BallanceCode.Length < 7 ? @"\p{Zs}" : "") + ".*?EUR").Index;
The regex will match S10, then one space separator (\p{Zs}, the Unicode category that contains the ordinary space but not tabs or newlines), then any 0 or more characters other than a newline (as few as possible, due to *?) up to the first EUR.
The (field.BallanceCode.Length < 7 ? @"\p{Zs}" : "") check is necessary to support a 7-digit BallanceCode. If the code is 7 characters or longer, we do not check whether there is a space after it, because the amount follows immediately. If it is shorter than 7 characters, we require the space.
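To get the matched text rather than only its index, the same pattern can be wrapped in a small helper. This is a minimal sketch: the class BalanceLookup and the method TryFindRecord are hypothetical names, and it assumes, as the examples in the question suggest, that a code shorter than 7 characters is always followed by a space and that a 7-character code runs straight into the amount.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BalanceLookup
{
    // Hypothetical helper (not from the original post): finds the record
    // that starts with "S" + ballanceCode and ends at the first "EUR".
    public static bool TryFindRecord(string values, string ballanceCode, out string record)
    {
        var query = "S" + ballanceCode;

        // Codes shorter than 7 characters must be followed by a space separator,
        // so "S10" cannot match the start of "S100", "S1023P" or "S10/15".
        // Longer codes are assumed to be followed directly by the amount.
        var separator = ballanceCode.Length < 7 ? @"\p{Zs}" : "";

        var match = Regex.Match(values, Regex.Escape(query) + separator + ".*?EUR");

        // Check Success: a failed match reports Index 0, which is easy to
        // mistake for a hit at the start of the string.
        record = match.Success ? match.Value : string.Empty;
        return match.Success;
    }
}
```

Called with the code 10 on a string that contains S10/15, S1023P, S1000000 and S10 records, this should return only the S10 record. Note that for 7-character codes the pattern is still a prefix match, so it relies on no longer code existing.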
So you just want the start (S...) and end (...EUR) of each line and skip everything in between?
^([sS]\d+).*?([\d\+]+EUR)$
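A rough sketch of how this pattern could be applied is shown below. It assumes each record sits on its own line (separated by "\n"), which is not how the question's input is laid out, and the names LineParser and Parse are hypothetical. RegexOptions.Multiline is needed so that ^ and $ anchor at every line rather than only at the ends of the whole string.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LineParser
{
    // Hypothetical helper: returns "start|end" for every line the pattern matches.
    public static List<string> Parse(string text)
    {
        var results = new List<string>();
        var pattern = @"^([sS]\d+).*?([\d\+]+EUR)$";

        foreach (Match m in Regex.Matches(text, pattern, RegexOptions.Multiline))
        {
            // Group 1 is "S" plus the leading digits only,
            // group 2 is the trailing run of digits and '+' ending in EUR.
            results.Add(m.Groups[1].Value + "|" + m.Groups[2].Value);
        }
        return results;
    }
}
```

Because the first group captures only digits after the S, a line starting with S1023P yields S1023 and a line starting with S10/15 yields S10, so this pattern on its own does not distinguish S10 from S10/15.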