Regex to pick a part of a word
I have a text like this:
my text has $1 per Lap to someone.
Could anyone tell me how to pick the per part from it? I know how to pick the $ amount. It's like this:
new Regex(@"\$\d+(?:\.\d+)?").Match(s.Comment1).Groups[0].ToString()
Any help would be highly appreciated.
If you need several substrings from inside a larger string, you can use capturing groups.
To obtain the per part, use the following regex and read Groups[2].Value:
var str = "my text has $1 per Lap to someone. ";
var per_str = new Regex(@"(\$\d+(?:\.\d+)?)\s*(\p{L}+)").Match(str).Groups[2].Value;
Output: per
The part of the regex that captures per is \p{L}+, where \p{L} matches any Unicode letter (e.g. ф, ё), not just letters of the Latin script.
To get the number part, use the same regex, but read Groups[1].Value:
var num_str = new Regex(@"(\$\d+(?:\.\d+)?)\s*(\p{L}+)").Match(str).Groups[1].Value;
Output: $1
Another tip: if you plan to use the regex many times during your app's execution, construct it once with RegexOptions.Compiled and reuse that instance:
var rx = new Regex(@"(\$\d+(?:\.\d+)?)\s*(\p{L}+)", RegexOptions.Compiled);
var m = rx.Match(str); // match once, then read both groups
var per_str = m.Groups[2].Value;
var num_str = m.Groups[1].Value;
If you need just the number after $ (without the sign itself), move the opening parenthesis so that it comes after \$ in the regex: @"\$(\d+(?:\.\d+)?)\s*(\p{L}+)".
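As a minimal sketch of that variant, the program below reads the bare number from group 1 and the following word from group 2. The class and method names (AmountDemo, NumberAfterDollar, WordAfterAmount) are illustrative assumptions, not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AmountDemo
{
    // Group 1: the number without the leading "$"; group 2: the word that follows.
    private static readonly Regex Rx =
        new Regex(@"\$(\d+(?:\.\d+)?)\s*(\p{L}+)", RegexOptions.Compiled);

    // Hypothetical helper: returns the bare number, or "" when nothing matches.
    public static string NumberAfterDollar(string input)
    {
        return Rx.Match(input).Groups[1].Value;
    }

    // Hypothetical helper: returns the word after the amount, or "" when nothing matches.
    public static string WordAfterAmount(string input)
    {
        return Rx.Match(input).Groups[2].Value;
    }

    public static void Main()
    {
        var str = "my text has $1 per Lap to someone.";
        Console.WriteLine(NumberAfterDollar(str)); // 1
        Console.WriteLine(WordAfterAmount(str));   // per
    }
}
```

Because a failed match yields empty groups, both helpers return an empty string rather than throwing when the input contains no amount.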
And to get the groups of all matches in one go, you can use:
var groups = rx.Matches(str).Cast<Match>().Select(p => new { num = p.Groups[1].Value, per = p.Groups[2].Value }).ToList();
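To see what this returns when a string holds more than one amount, here is a small self-contained sketch. The sample sentence with two amounts and the names AllMatchesDemo and AmountWordPairs are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class AllMatchesDemo
{
    private static readonly Regex Rx =
        new Regex(@"(\$\d+(?:\.\d+)?)\s*(\p{L}+)", RegexOptions.Compiled);

    // Hypothetical helper: returns one "amount word" string per match, e.g. "$1 per".
    public static string[] AmountWordPairs(string input)
    {
        return Rx.Matches(input)
                 .Cast<Match>() // MatchCollection is non-generic on older frameworks
                 .Select(m => m.Groups[1].Value + " " + m.Groups[2].Value)
                 .ToArray();
    }

    public static void Main()
    {
        // Sample input invented for illustration; it contains two amounts.
        var str = "my text has $1 per Lap and $2.50 each to someone.";
        foreach (var pair in AmountWordPairs(str))
        {
            Console.WriteLine(pair);
        }
        // Prints:
        // $1 per
        // $2.50 each
    }
}
```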
EDIT:
If you only want to match the literal word per after the number, you can use @"(\$\d+(?:\.\d+)?)\s*(per)" or, case-insensitively, @"(\$\d+(?:\.\d+)?)\s*((?i:per\b))".
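A short sketch of the case-insensitive variant follows. It assumes you want the matched text back in its original casing, and the names LiteralPerDemo and MatchPer are hypothetical. The inline option (?i:...) applies only to per, and \b keeps the pattern from matching the start of a longer word such as person.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LiteralPerDemo
{
    // (?i:...) makes only "per" case-insensitive; \b requires a word boundary after it.
    private static readonly Regex Rx =
        new Regex(@"(\$\d+(?:\.\d+)?)\s*((?i:per\b))");

    // Hypothetical helper: returns the matched word as written in the input, or "".
    public static string MatchPer(string input)
    {
        return Rx.Match(input).Groups[2].Value;
    }

    public static void Main()
    {
        Console.WriteLine(MatchPer("my text has $1 PER Lap"));      // PER
        Console.WriteLine(MatchPer("my text has $1 person") == ""); // True
    }
}
```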
As you said that per is a string, the following simple regex can do the job for you:
\$\d+\s([a-zA-Z]+)
But if per may contain digits, you can use \w, which matches word characters (letters, digits and the underscore):
\$\d+\s(\w+)
Note that in this case per is in the first capture group, so you need to extract group 1.
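For illustration, this is roughly how the \w version could be called from C#. The wrapper names WordCharDemo and WordAfterAmount and the second sample string are assumptions added here.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordCharDemo
{
    // Hypothetical helper: returns capture group 1, the word right after "$<digits> ".
    public static string WordAfterAmount(string input)
    {
        return Regex.Match(input, @"\$\d+\s(\w+)").Groups[1].Value;
    }

    public static void Main()
    {
        Console.WriteLine(WordAfterAmount("my text has $1 per Lap to someone.")); // per
        Console.WriteLine(WordAfterAmount("my text has $1 per2 Lap"));            // per2
    }
}
```

Note that this pattern expects exactly one whitespace character between the amount and the word, and that it does not handle decimal amounts such as $1.50.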
You can also use a positive lookbehind if you don't want to use grouping:
(?<=\$\d+\s)[a-zA-Z]+
If per is one specific word, you can check for it with the following regex:
(?<=\$\d+\s)per
Something like:
var per_str = new Regex(@"(?<=\$\d+\s)per").Match(str).Groups[0].Value;
if (per_str != "")
{
    // do stuff
}
(?<=\$\d+(?:\.\d+)?\s+)\S+
This should do it for you.
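A possible way to use this lookbehind pattern from C# is sketched below. It relies on .NET accepting variable-length lookbehinds, and the names LookbehindDemo and TokenAfterAmount are invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookbehindDemo
{
    // The lookbehind asserts "$<amount><whitespace>" before the match without consuming it,
    // so the whole match (Value) is already the token we want and no group is needed.
    private static readonly Regex Rx = new Regex(@"(?<=\$\d+(?:\.\d+)?\s+)\S+");

    // Hypothetical helper: returns the first non-whitespace token after an amount, or "".
    public static string TokenAfterAmount(string input)
    {
        return Rx.Match(input).Value;
    }

    public static void Main()
    {
        Console.WriteLine(TokenAfterAmount("my text has $1 per Lap to someone.")); // per
        Console.WriteLine(TokenAfterAmount("my text has $1.50 per Lap"));          // per
    }
}
```

Since \S+ matches any run of non-whitespace characters, punctuation attached to the word (as in "per,") would be included in the result. Use \p{L}+ or \w+ instead if that is not wanted.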
As @Sayse said, you don't need a Regex here. I made two solutions without one.
Check the Demo or read the code:
using System;
using System.Linq;

public class Program
{
    public static void Main()
    {
        var s = "my text has $1 per Lap to someone.";
        Console.WriteLine(Test(s));
        Console.WriteLine(Test2(s));
    }

    // Cuts the string down to "1 per" using known markers, then splits on the space.
    static object Test(string s)
    {
        var tab = s.Remove(s.IndexOf(" Lap"))          // remove " Lap" and everything after it
                   .Substring(s.IndexOf(" $") + 2)     // remove everything up to and including " $"
                   .Split(' ');
        return new { Amount = tab[0], Per = tab[1] };
    }

    // Splits into words, finds the one starting with "$", and takes the word after it.
    static object Test2(string s)
    {
        var tab = s.Split(' ');
        var amount = tab.Single(t => t.StartsWith("$")).Substring(1);
        var per = tab[Array.FindIndex(tab, t => t.StartsWith("$")) + 1];
        return new { Amount = amount, Per = per };
    }
}
Output:
{ Amount = 1, Per = per }
{ Amount = 1, Per = per }