Regex pattern match with PHP

I have the following data in a file, which is repeated multiple times:

Date:21 Month:03 Year:2017 Amount:50 Category:Grocery Account:bank Note:expensive

Now, I want to extract the value after "Amount:", i.e. "50".

I'm using the following code in PHP:

$result = preg_split("/Amount/", $contents);
$truncated = substr($printresult, 1, 2);
print_r($truncated);

The result I'm getting is this:

    Da50

Could you please help me figure out what exactly I'm doing wrong in this code?

Thank you.

[Edit: $contents contains all the string data]

This is the entire code: http://paste.ideaslabs.com/show/hwj7IiPUcd
Content of data.txt is this: http://paste.ideaslabs.com/show/5TxWH8MUX

You can try this:

$subject = "Date:21 Month:03 Year:2017 Amount:50 Category:Grocery Account:bank Note:expensive";
$pattern = "/Account/";
preg_match($pattern, $subject, $matches);
print_r($matches);

The "Da" comes from the "Date" at the start of your string. You need to use preg_match or preg_match_all to extract exact matches. preg_split splits the string on the found term, and index 0 is the part before it, which you don't care about. Try:

$arraynext = 'Date:21
Month:03
Year:2017
Amount:50
Category:Wow
Account:The
Note:This';
$endresult = preg_match("/\s*Amount:\s*(\d+)/", $arraynext, $match);
echo $match[1];

Regex demo: https://regex101.com/r/SA48sm/1/

PHP Demo: https://3v4l.org/6jaCV
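
To see why splitting on the term does not give the value directly, it may help to look at what preg_split returns next to what preg_match captures. This is only a minimal sketch, assuming a single line in the format from the question; $line, $parts and $amount are illustrative names, not taken from the asker's code.

```php
<?php
// Sample input, assumed to match the format shown in the question.
$line = 'Date:21 Month:03 Year:2017 Amount:50 Category:Grocery Account:bank Note:expensive';

// preg_split cuts the string at every "Amount" and returns the pieces around it.
$parts = preg_split('/Amount/', $line);

// $parts[0] is everything before "Amount", so it begins with "Date".
echo substr($parts[0], 0, 2), PHP_EOL; // Da
// $parts[1] is everything after "Amount", so it begins with ":50".
echo substr($parts[1], 1, 2), PHP_EOL; // 50

// preg_match with a capture group returns the digits themselves,
// with no dependence on how many digits the value has.
preg_match('/Amount:(\d+)/', $line, $match);
$amount = $match[1];
echo $amount, PHP_EOL; // 50
```

The substr approach only works here because the value happens to be two digits long; the capture group does not rely on that.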

If you have many matches, then you need to select all of them:

preg_match_all('/(?<=Amount:)[\d]{0,}/', $contents, $result);
foreach($result as $res) {
    print_r($res);
}
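
Note that preg_match_all fills $result with one subarray per group, and index 0 holds the full matches, so looping over $result itself visits that subarray rather than the individual values. A small sketch of reading the values one by one, assuming a two-line sample in place of the real file contents (the $contents string here is made-up test data):

```php
<?php
// Made-up sample standing in for the contents of data.txt.
$contents = "Date:21 Month:03 Year:2017 Amount:50 Category:Grocery\n"
          . "Date:1 Month:04 Year:2017 Amount:150 Category:Grocery";

// The lookbehind requires "Amount:" before the digits without
// including it in the match.
$count = preg_match_all('/(?<=Amount:)\d+/', $contents, $result);

// $result[0] is the list of full matches; iterate over it to get each amount.
foreach ($result[0] as $amount) {
    echo $amount, PHP_EOL;
}
```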

You can use the following regex pattern:

(?<=Amount:)\d+

See the regex demo.

PHP (demo):

$regex = '/(?<=Amount:)\d+/';
$arraynext = file_get_contents('data.txt');
preg_match_all($regex, $arraynext, $result);
print_r($result);

Use this pattern: /Amount:\K\d+/
It will accurately extract the full numeric value that follows each Amount: without using the much less efficient "lookarounds".

My web filter software does not let me visit your pastelabs links, so I cannot see your actual input. (This is one of many reasons why you should post your input samples directly into your question.) You state that you have several lines which you must extract from, so this is the sample input that I have tested with:

Date:21 Month:03 Year:2017 Amount:50 Category:Grocery Account:bank Note:expensive
Date:1 Month:04 Year:2017 Amount:150 Category:Grocery Account:bank Note:expensive
Date:14 Month:04 Year:2017 Amount:5 Category:Grocery Account:bank Note:expensive
Date:28 Month:04 Year:2017 Amount:5935 Category:Grocery Account:bank Note:expensive

My pattern captures the desired results in just 48 steps. (Pattern Demo)
The pattern uses \K, which means "keep the characters starting from this point": everything matched before \K is dropped from the reported match, so there is no need for a capture group or a "lookbehind".
If your actual input data has optional spaces between Amount: and the number value, then just add " ?" (a space followed by a question mark) to the pattern after the colon.
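
As a hedged illustration of the optional-space variant, the sketch below assumes input where some lines have a space after Amount: and some do not. The sample string is invented for the example and is not the asker's data.

```php
<?php
// Invented sample: the second line has a space between "Amount:" and the number.
$in = "Date:21 Month:03 Year:2017 Amount:50 Category:Grocery\n"
    . "Date:1 Month:04 Year:2017 Amount: 150 Category:Grocery";

// " ?" allows zero or one space; \K then drops everything matched so far
// from the reported match, leaving only the digits.
preg_match_all('/Amount: ?\K\d+/', $in, $out);

var_export($out[0]);
```

If more than one space can appear, " *" or "\s*" in place of " ?" would be the natural adjustment.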

When used with preg_match_all(), the output array is as small as preg_match_all() can make: an array containing 1 subarray with 4 elements. I cut directly to the subarray in my code to follow:

Code: (Demo)

$in='Date:21 Month:03 Year:2017 Amount:50 Category:Grocery Account:bank Note:expensive
Date:1 Month:04 Year:2017 Amount:150 Category:Grocery Account:bank Note:expensive
Date:14 Month:04 Year:2017 Amount:5 Category:Grocery Account:bank Note:expensive
Date:28 Month:04 Year:2017 Amount:5935 Category:Grocery Account:bank Note:expensive';

var_export(preg_match_all('/Amount:\K\d+/',$in,$out)?$out[0]:[]);

Output:

array (
  0 => '50',
  1 => '150',
  2 => '5',
  3 => '5935',
)

As for the other answers on this page, they all process my test data in over 600 steps (more than 12x slower / less efficient than my pattern). At the time of this post, one of them is completely wrong, and some use sloppy regex syntax and shouldn't be learned from.
