
Regex to pick a part of a word

I have a text like this:

my text has $1 per Lap to someone. 

Could anyone tell me how to pick the per part from it? I know how to pick the $ amount. It's like this:

new Regex(@"\$\d+(?:\.\d+)?").Match(s.Comment1).Groups[0].ToString()

Any help would be highly appreciated.

If you need several substrings from inside a larger string, you can use capturing groups.

To obtain the per part, use the following regex and read Groups[2].Value:

var str = "my text has $1 per Lap to someone. ";
var per_str = new Regex(@"(\$\d+(?:\.\d+)?)\s*(\p{L}+)").Match(str).Groups[2].Value;

Output: per

The part of the regex that captures per is \p{L}+, where \p{L} matches any Unicode letter (e.g. ф, ё), not just Latin script.
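To make the difference between \p{L} and an ASCII-only class concrete, here is a minimal sketch. The class name UnicodeLetters, the helper WordAfterAmount, and the Cyrillic sample string are assumptions made up for this illustration; they are not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnicodeLetters
{
    // Hypothetical helper: returns the word after the "$" amount,
    // or "" when nothing matches. letterClass is e.g. @"\p{L}" or "[a-zA-Z]".
    public static string WordAfterAmount(string input, string letterClass)
    {
        var m = Regex.Match(input, @"\$\d+(?:\.\d+)?\s*(" + letterClass + "+)");
        return m.Groups[1].Value;
    }

    public static void Main()
    {
        const string sample = "цена $1 за круг"; // made-up non-Latin input

        Console.WriteLine(WordAfterAmount(sample, @"\p{L}"));   // за
        Console.WriteLine(WordAfterAmount(sample, "[a-zA-Z]")); // empty line: no Latin letters follow
    }
}
```

If the text is known to be plain English, [a-zA-Z] behaves the same as \p{L} for this purpose.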

To get the number part, use the same regex, but read Groups[1].Value:

var num_str = new Regex(@"(\$\d+(?:\.\d+)?)\s*(\p{L}+)").Match(str).Groups[1].Value;

Output: $1
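The snippets above read Groups[n].Value directly; when the input contains no match, that value is simply an empty string. A minimal sketch of checking Match.Success first is shown below. The AmountParser class and TryParse method are hypothetical names chosen for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AmountParser
{
    // Same pattern as above: group 1 = "$" amount, group 2 = the word that follows.
    private static readonly Regex Rx = new Regex(@"(\$\d+(?:\.\d+)?)\s*(\p{L}+)");

    // Hypothetical helper: returns false (and null outputs) when the pattern is not found.
    public static bool TryParse(string input, out string amount, out string word)
    {
        Match m = Rx.Match(input);
        amount = m.Success ? m.Groups[1].Value : null;
        word = m.Success ? m.Groups[2].Value : null;
        return m.Success;
    }

    public static void Main()
    {
        if (TryParse("my text has $1 per Lap to someone. ", out var amount, out var word))
        {
            Console.WriteLine(amount); // $1
            Console.WriteLine(word);   // per
        }
    }
}
```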

Another tip: if you plan to use the regex many times during your app's execution, create it once with RegexOptions.Compiled and reuse it:

var rx = new Regex(@"(\$\d+(?:\.\d+)?)\s*(\p{L}+)", RegexOptions.Compiled);
var per_str = rx.Match(str).Groups[2].Value;
var num_str = rx.Match(str).Groups[1].Value;
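One possible way to arrange this in a larger program is to keep the compiled regex in a static readonly field and call Match only once per input, reading both groups from the same Match object. This is a sketch under those assumptions; CompiledExample, Describe, and the second sample string are illustrative names and data, not from the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CompiledExample
{
    // Built once and reused for every call.
    private static readonly Regex Rx =
        new Regex(@"(\$\d+(?:\.\d+)?)\s*(\p{L}+)", RegexOptions.Compiled);

    // Hypothetical helper: returns "amount|word", or "" when there is no match.
    public static string Describe(string input)
    {
        Match m = Rx.Match(input); // a single Match call serves both groups
        return m.Success ? m.Groups[1].Value + "|" + m.Groups[2].Value : "";
    }

    public static void Main()
    {
        foreach (var line in new[] { "my text has $1 per Lap to someone. ", "costs $2.50 each" })
        {
            Console.WriteLine(Describe(line)); // $1|per, then $2.50|each
        }
    }
}
```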

If you need just the number after $, move the opening parenthesis after the $ in the regex: @"\$(\d+(?:\.\d+)?)\s*(\p{L}+)".

And to get all groups in one go, you can use

var groups = rx.Matches(str).Cast<Match>().Select(p => new { num = p.Groups[1].Value, per = p.Groups[2].Value }).ToList();

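As a rough, self-contained illustration of collecting every match, the sketch below uses the variant of the pattern that captures the number without the $ sign. The AllMatches class, the Extract helper, and the input with two amounts are assumptions for this example only.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class AllMatches
{
    // Hypothetical helper: one (number, word) pair per match, in order of appearance.
    public static (string Num, string Word)[] Extract(string input)
    {
        var rx = new Regex(@"\$(\d+(?:\.\d+)?)\s*(\p{L}+)");
        return rx.Matches(input)
                 .Cast<Match>()
                 .Select(m => (m.Groups[1].Value, m.Groups[2].Value))
                 .ToArray();
    }

    public static void Main()
    {
        foreach (var pair in Extract("pay $1 per Lap and $2.50 per Mile"))
        {
            Console.WriteLine(pair.Num + " " + pair.Word); // "1 per", then "2.50 per"
        }
    }
}
```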

EDIT:

If you just want to match per after the number, you can use @"(\$\d+(?:\.\d+)?)\s*(per)" or, case-insensitively and with a word boundary, @"(\$\d+(?:\.\d+)?)\s*((?i:per\b))"
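The two patterns differ in more than case handling: the \b in the second one also stops per from matching the start of a longer word. A small sketch of that difference follows; the PerMatch class, its Word helper, and the sample inputs are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PerMatch
{
    public const string Plain = @"(\$\d+(?:\.\d+)?)\s*(per)";
    public const string Bounded = @"(\$\d+(?:\.\d+)?)\s*((?i:per\b))";

    // Hypothetical helper: returns group 2, or "" when the pattern does not match.
    public static string Word(string input, string pattern)
    {
        return Regex.Match(input, pattern).Groups[2].Value;
    }

    public static void Main()
    {
        Console.WriteLine(Word("my text has $1 PER Lap", Bounded)); // PER
        Console.WriteLine(Word("it costs $1 person", Plain));       // per (prefix of "person")
        Console.WriteLine(Word("it costs $1 person", Bounded));     // empty line: \b rejects "person"
    }
}
```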

Since you said that per is a string, the following simple regex can do the job for you:

\$\d+\s([a-zA-Z]+)

But if per may contain digits, you can use \w, which matches word characters:

\$\d+\s(\w+)


Note that in this case per is in the first capture group, so you need to extract the first group.
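A quick sketch of how the two patterns above behave when the word contains a digit. The WordClasses class, the FirstGroup helper, and the sample text are hypothetical and only serve this comparison.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordClasses
{
    // Hypothetical helper: returns capture group 1, or "" when there is no match.
    public static string FirstGroup(string input, string pattern)
    {
        return Regex.Match(input, pattern).Groups[1].Value;
    }

    public static void Main()
    {
        const string sample = "area is $3 km2 wide"; // made-up input with a digit in the word

        Console.WriteLine(FirstGroup(sample, @"\$\d+\s([a-zA-Z]+)")); // km
        Console.WriteLine(FirstGroup(sample, @"\$\d+\s(\w+)"));       // km2
    }
}
```

Keep in mind that \w also matches underscores, so it is slightly broader than "letters and digits".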

You can also use a positive lookbehind if you don't want to use grouping:

(?<=\$\d+\s)[a-zA-Z]+
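With a lookbehind, the whole match is already the word, so no group index is needed. The sketch below assumes the .NET regex engine, which accepts variable-length patterns such as \d+ inside a lookbehind (several other regex engines do not). LookbehindExample and WordAfterAmount are illustrative names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookbehindExample
{
    // Hypothetical helper: the lookbehind requires "$", digits and one whitespace
    // character immediately before the match, without making them part of it.
    public static string WordAfterAmount(string input)
    {
        return Regex.Match(input, @"(?<=\$\d+\s)[a-zA-Z]+").Value;
    }

    public static void Main()
    {
        Console.WriteLine(WordAfterAmount("my text has $1 per Lap to someone. ")); // per
    }
}
```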

If per is a specific word, you can check for it with the following regex:

(?<=\$\d+\s)per

Something like:

var per_str = new Regex(@"(?<=\$\d+\s)per").Match(str).Groups[0].Value;
if (per_str != "")
{
    // do stuff
}

(?<=\$\d+(?:\.\d+)?\s+)\S+

This should do it for you.
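A short sketch of what this pattern returns, assuming the .NET regex engine. Because \S+ takes every non-whitespace character, trailing punctuation becomes part of the result. The NonSpaceAfterAmount class, its Token helper, and the second input are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NonSpaceAfterAmount
{
    // Hypothetical helper: returns the run of non-whitespace characters
    // that follows "$<number>" and at least one whitespace character.
    public static string Token(string input)
    {
        return Regex.Match(input, @"(?<=\$\d+(?:\.\d+)?\s+)\S+").Value;
    }

    public static void Main()
    {
        Console.WriteLine(Token("my text has $1 per Lap to someone. ")); // per
        Console.WriteLine(Token("fee: $2.50 per, lap"));                 // per, (comma included)
    }
}
```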

As @Sayse said, you don't need a regex here. I made two solutions without one.

Check the demo or read the code:

public static void Main()
{
    var s = "my text has $1 per Lap to someone.";

    Console.WriteLine(Test(s));
    Console.WriteLine(Test2(s));
}

static object Test(string s)
{           
    var tab = s.Remove(s.IndexOf(" Lap"))       // remove everything after " Lap" 
               .Substring(s.IndexOf(" $") + 2)  // remove everything before " $"
               .Split(' ');

    return new { Amount = tab[0], Per = tab[1] };
}

static object Test2(string s)
{
    var tab = s.Split(' ');
    var amount = tab.Single(t => t.StartsWith("$")).Substring(1);
    var per = tab[Array.FindIndex(tab, t => t.StartsWith("$")) + 1];

    return new { Amount = amount, Per = per };
}

Output:

{ Amount = 1, Per = per }
{ Amount = 1, Per = per }
