Regex to find string between two strings, excluding outer strings
I know this has been asked a thousand times before, but I could not get any of the previous solutions to work for my case. I'm trying to use a regex in JavaScript to parse a text file. The bit I'm trying to extract is the monetary figure, in a format like 55,555.00. The number of digits can vary throughout the text file. The boundary characters and the amount of whitespace can vary as well.
I wrote the following to extract what I need from the sample data below:
/((\w\s{10,20})([0-9]{8,}(?=.*[,.]))/g
Sample data:
23205 - Grants Current-County Operatin 4,425,327.00"
" 4 0000047387 Central Equatoria State 1003-1478 Sta Hosp Oper Oct 85,784.00"
" 4 0000047442 EASTERN EQUATORIA ST 1003-1479 Sta Hosp Oper Oct 93,137.00"
" 4 0000047485 JONGLEI STATE 1003-1519 Sta Hosp Oper Oct 144,608.00"
" 4 0000047501 Lakes State 1003-1482 Sta Hosp Oper Oct 93,137.00"
" 4 0000047528 Unity State 1003-1484 Sta Hosp Oper Oct 75,980.00"
" 4 0000047532 Northern Bahr-el State 1003-1483 Sta Hosp Oper Oct 58,824.00"
" 4 0000047615 Western E State 1003-1488 Sta Hosp Oper Oct 93,137.00"
" 4 0000047638 Warap State 1003-1486 Sta Hosp Oper Oct 51,471.00"
" 4 0000047680 Upper Nile State 1003-1485 Sta Hosp Oper Oct 102,941.00"
" 4 0000047703 Western BG State 1003-1487 Sta Hosp Oper Oct 34,314.00"
----------------------
" Total For Period 4 833,333.00"
----------------------------------------------------------------------------------------------------------------------------
Fiscal Year 2015/16 Republic Of South Sudan Date 2015/11/20
Period 5 Time 12:58:40
FreeBalance Financial Management System Page 7
----------------------------------------------------------------------------------------------------------------------------
Vendor Analysis Report
1091 Health (MOH)
Prd Voucher # Vendor Name Description Amount
--- ---------------- ------------------------------ ----------------------------- ----------------------
----------------------
"
Here's an example: https://regex101.com/r/nO8nM1/4
The issue is the leading boundary. I am able to exclude the closing boundary (the double quote), but I can't get rid of the leading boundary. I've gotten a couple of things sort of working, but they also matched the two figures outside the main table (in this case 4,425,327.00 and 833,333.00).
Any help would be much appreciated.
To match float values with an obligatory decimal fraction and a comma (,) as the digit grouping symbol, you can use
\d+(?:,\d{3})*\.\d+
Explanation:
\d+ - 1 or more digits
(?:,\d{3})* - 0 or more sequences of:
    , - a comma
    \d{3} - exactly 3 digits
\. - a literal period/dot
\d+ - 1 or more digits
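As a quick check of the pattern on its own, here is a minimal sketch that runs it over three lines adapted from the sample. The names `amountPattern`, `lines`, and `amounts` are mine, and the spacing inside the lines is illustrative rather than the report's real column padding. Note that the bare pattern also picks up the two figures outside the table, which is why a leading condition is needed.

```javascript
// Matches numbers with optional comma grouping and a required decimal part.
const amountPattern = /\d+(?:,\d{3})*\.\d+/g;

// Lines adapted from the sample report (spacing is illustrative, not the real padding).
const lines = [
  ' 23205 - Grants Current-County Operatin 4,425,327.00"',
  '" 4 0000047387 Central Equatoria State 1003-1478 Sta Hosp Oper Oct 85,784.00"',
  '" Total For Period 4 833,333.00"',
].join("\n");

// With the g flag, String.prototype.match returns every full match as an array.
const amounts = lines.match(amountPattern);
console.log(amounts);
```

The voucher numbers (0000047387) and codes such as 1003-1478 are skipped because they have no decimal part.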
To only get the values that appear after Oct, you can use a regex that combines the pattern above with yours:
\w\s{10,20}(\d+(?:,\d{3})*\.\d+)
See another demo
The \w\s{10,20} part matches one word character (\w: a letter, digit, or underscore) followed by 10 to 20 whitespace characters. Only after that does the pattern match the float value and capture it into Group 1.
See the JS snippet below (m[1] is where the float value resides):
var re = /\w\s{10,20}(\d+(?:,\d{3})*\.\d+)/gm;
// Sample text from the question. The report pads the Amount column with runs of
// spaces; that padding has to be present in str for \s{10,20} to match.
var str = ' 23205 - Grants Current-County Operatin 4,425,327.00"\n\n' +
    '" 4 0000047387 Central Equatoria State 1003-1478 Sta Hosp Oper Oct 85,784.00"\n' +
    '" 4 0000047442 EASTERN EQUATORIA ST 1003-1479 Sta Hosp Oper Oct 93,137.00"\n' +
    '" 4 0000047485 JONGLEI STATE 1003-1519 Sta Hosp Oper Oct 144,608.00"\n' +
    '" 4 0000047501 Lakes State 1003-1482 Sta Hosp Oper Oct 93,137.00"\n' +
    '" 4 0000047528 Unity State 1003-1484 Sta Hosp Oper Oct 75,980.00"\n' +
    '" 4 0000047532 Northern Bahr-el State 1003-1483 Sta Hosp Oper Oct 58,824.00"\n' +
    '" 4 0000047615 Western E State 1003-1488 Sta Hosp Oper Oct 93,137.00"\n' +
    '" 4 0000047638 Warap State 1003-1486 Sta Hosp Oper Oct 51,471.00"\n' +
    '" 4 0000047680 Upper Nile State 1003-1485 Sta Hosp Oper Oct 102,941.00"\n' +
    '" 4 0000047703 Western BG State 1003-1487 Sta Hosp Oper Oct 34,314.00"\n' +
    ' ----------------------\n' +
    '" Total For Period 4 833,333.00"\n' +
    ' ----------------------------------------------------------------------------------------------------------------------------\n' +
    ' Fiscal Year 2015/16 Republic Of South Sudan Date 2015/11/20\n' +
    ' Period 5 Time 12:58:40\n' +
    ' FreeBalance Financial Management System Page 7\n' +
    ' ----------------------------------------------------------------------------------------------------------------------------\n' +
    ' Vendor Analysis Report\n\n' +
    ' 1091 Health (MOH)\n' +
    ' Prd Voucher # Vendor Name Description Amount\n' +
    ' --- ---------------- ------------------------------ ----------------------------- ----------------------\n' +
    ' ----------------------\n' +
    '" ';
var m;
// With the g flag, each exec() call resumes at re.lastIndex, so the loop visits every match once.
while ((m = re.exec(str)) !== null) {
    // m[0] includes the leading word character and padding; m[1] is only the amount.
    document.getElementById("r").innerHTML += m[1] + "<br/>";
}
<div id="r"/>
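If you want the match itself to exclude the leading boundary, rather than reading the value out of a capture group, a lookbehind is one possible alternative. This is a different technique from the snippet above, and it assumes a JavaScript engine with lookbehind support (ES2018 or later). The `sample` string, the `pad` helper, and the padding widths (12 spaces after Oct, 30 elsewhere) are invented for illustration; the real report's column widths may differ.

```javascript
// Hypothetical helper that builds a run of n spaces to imitate column padding.
const pad = (n) => " ".repeat(n);

// Assumed layout: 12 spaces after "Oct" in table rows, 30 spaces on the other lines.
const sample = [
  ' 23205 - Grants Current-County Operatin' + pad(30) + '4,425,327.00"',
  '" 4 0000047387 Central Equatoria State 1003-1478 Sta Hosp Oper Oct' + pad(12) + '85,784.00"',
  '" 4 0000047442 EASTERN EQUATORIA ST 1003-1479 Sta Hosp Oper Oct' + pad(12) + '93,137.00"',
  '" Total For Period 4' + pad(30) + '833,333.00"',
].join("\n");

// (?<=...) asserts the prefix without consuming it, so m[0] is just the amount.
const tableRe = /(?<=\w\s{10,20})\d+(?:,\d{3})*\.\d+/g;
const tableAmounts = Array.from(sample.matchAll(tableRe), (m) => m[0]);
console.log(tableAmounts);
```

Under these assumed widths, only the two table rows match; the lines with 30 spaces of padding fall outside the 10 to 20 range and are skipped.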