Match Two Regular Expressions in String

I have this text: "£24,250.00 (inc. VAT)"

I want a regex that will show ONLY "24250.00".

I've managed to get the last portion with:

( \(inc\. VAT\))

And separately I can get the £ and the , with:

[£,]

But I can't seem to work out how to combine both expressions to return just what I want.

Note that the number is dynamic, so it will change depending on the applicable costs on a website.

In theory I could just run it through two separate regexes in my C# code, each one trimming the part I don't want. But is there a way to do it with just one expression?

The reason is that I have a GetConvertedExtension method that takes an IWebElement and a string (the regex), and then converts the resulting string to Double, Int, etc.

I don't really want to change this extension method, or to stop using it and go down the route of multiple expressions followed by a parse statement.

I've used https://regexr.com/ to try to get a working solution, but with no luck, and I'm starting to struggle.

I'm using Visual Studio 2017 and C# with the Regex library.

If you want to use a single regex, you could use 2 capturing groups:

£(\d+),(\d+\.\d+) \(inc\. VAT\)

Then you could concatenate group 1 and group 2 to get your value.

If the decimal part after the dot can contain only 2 digits, replace the last \d+ with \d{2}.

For example:

string pattern = @"£(\d+),(\d+\.\d+) \(inc\. VAT\)";
string input = @"£24,250.00 (inc. VAT)";

foreach (Match m in Regex.Matches(input, pattern))
{
    Console.WriteLine(m.Groups[1].Value + m.Groups[2].Value);
}

Result:

24250.00

See a .NET regex demo | C# Demo
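Since the end goal is a numeric value, the two groups can be joined and parsed in one step. The following is only a minimal sketch: it assumes the text always contains exactly one thousands separator (which the pattern above requires) and that invariant-culture parsing suits the site's number format. The class name `GroupJoinDemo` is hypothetical and not part of the original answer.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class GroupJoinDemo
{
    public static void Main()
    {
        // Same pattern as above, with the decimal part limited to two digits.
        string pattern = @"£(\d+),(\d+\.\d{2}) \(inc\. VAT\)";
        string input = "£24,250.00 (inc. VAT)";

        Match m = Regex.Match(input, pattern);
        if (m.Success)
        {
            // Group 1 holds the digits before the comma, group 2 the rest.
            string joined = m.Groups[1].Value + m.Groups[2].Value;

            // Parse with the invariant culture so "." is always the decimal point.
            decimal amount = decimal.Parse(joined, CultureInfo.InvariantCulture);
            Console.WriteLine(amount.ToString(CultureInfo.InvariantCulture));
        }
        else
        {
            Console.WriteLine("No match");
        }
    }
}
```

A regex capture is always a contiguous slice of the input, so a single group cannot skip over the comma; joining the groups (or removing the comma afterwards) is needed in any case.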

(?<currency>[£$€])(?<value>[0-9]{1,3}(?:,[0-9]{3})*\.[0-9]{2})\s\(inc\.\sVAT\)

I would use something like this. I added a first capture group for the currency, as I thought it might be useful too. You'd just have to add the currency symbols you are interested in to the square brackets.

In Visual Studio you would write:

var regex = new Regex(@"(?<currency>[£$€])(?<value>[0-9]{1,3}(?:,[0-9]{3})*\.[0-9]{2})\s\(inc\.\sVAT\)");

Then you call regex.Match(data) or regex.Matches(data), or whatever you need to do.

Then, to access the number in your match, you read the value group: match.Groups["value"].Value, where match is the variable you assigned your regex match to.
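Putting these steps together, here is a minimal sketch of the named-group approach. Note the single backslashes: a verbatim string (`@"..."`) passes them to the regex engine unchanged, so `\.` means a literal dot. The class name `NamedGroupDemo`, the field name `PriceRegex` and the use of `decimal.Parse` with `NumberStyles` are assumptions added for illustration, not part of the original answer.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Named groups: "currency" for the symbol, "value" for the formatted number.
    private static readonly Regex PriceRegex = new Regex(
        @"(?<currency>[£$€])(?<value>[0-9]{1,3}(?:,[0-9]{3})*\.[0-9]{2})\s\(inc\.\sVAT\)");

    public static void Main()
    {
        string data = "£24,250.00 (inc. VAT)";

        Match match = PriceRegex.Match(data);
        if (!match.Success)
        {
            Console.WriteLine("No match");
            return;
        }

        string currency = match.Groups["currency"].Value;
        string value = match.Groups["value"].Value; // still contains the comma

        // AllowThousands lets the parser accept the comma separators directly.
        decimal amount = decimal.Parse(
            value,
            NumberStyles.AllowThousands | NumberStyles.AllowDecimalPoint,
            CultureInfo.InvariantCulture);

        Console.WriteLine(currency);
        Console.WriteLine(value.Replace(",", ""));
        Console.WriteLine(amount.ToString(CultureInfo.InvariantCulture));
    }
}
```

The `value` group keeps the thousands separators, so either strip them with `Replace(",", "")` or let the parser handle them as shown.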

Just to quickly run through the regex:

(?<currency>[£$€]) This is a named capture group that captures a literal £, $ or €.

(?<value>[0-9]{1,3}(?:,[0-9]{3})*\.[0-9]{2}) This is a named capture group that gets the number. Breaking it down further:

[0-9]{1,3} matches a digit from 0 to 9 between 1 and 3 (inclusive) times.
(?:,[0-9]{3})* matches the comma-separated groups of thousands, 0 or more times.
\.[0-9]{2} matches the decimal point and the two digits after it.

\s\(inc\.\sVAT\) This literally matches the "inc. VAT" bit after the number. I'm using \s instead of a literal space for the whitespace because I find it easier to read.

NOTE: this regex only works for this number format, with a comma after every group of thousands and with the decimal part always present.
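To make that limitation concrete, the sketch below runs the pattern against a few inputs. The sample strings other than the original one are made up for illustration, and the class name `FormatLimitsDemo` is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FormatLimitsDemo
{
    public static void Main()
    {
        var regex = new Regex(
            @"(?<currency>[£$€])(?<value>[0-9]{1,3}(?:,[0-9]{3})*\.[0-9]{2})\s\(inc\.\sVAT\)");

        // Hypothetical sample inputs; only the first comes from the question.
        string[] samples =
        {
            "£24,250.00 (inc. VAT)", // comma-grouped with decimals
            "£250.00 (inc. VAT)",    // below one thousand, no comma needed
            "£24250.00 (inc. VAT)",  // missing thousands separator
            "£24,250 (inc. VAT)"     // missing decimal part
        };

        foreach (string sample in samples)
        {
            // IsMatch reports whether the pattern is found anywhere in the input.
            Console.WriteLine(sample + " -> " + regex.IsMatch(sample));
        }
    }
}
```

The first two samples match and the last two do not, so a page that drops the separator or the decimals would need a looser pattern.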
