Regex with several capture groups

Question

I have a working regex that extracts several pieces of information. The PHP code is the following:

<?php

$re = "/(\\d{2}.\\d{2}.\\d{2}).+(\\w{3}).+\\w{3}.+(\\w{2}\\s\\d{4}).+(\\d{2}:\\d{2}\\n).+(\\d{2}.\\d{2}.\\d{2}).+(\\w{2}\\s\\d{4}).+(\\d{2}:\\d{2}\\n).+((FNC|PXO)\\d{3})/"; 
$str = "***NEUBUCHUNG ***\n 24.01.15  TXL  FNC  AB 2306  11:40   15:20\n 31.01.15  FNC  TXL  AB 2307  16:05\n FNC044  RESIDENCIAL VILA LUSITANI    9000-120 FUNCHAL\n  1  DOPPELZIMMER                     FRUEHSTUECK\n SPO1101\n INKL. REISELEITUNG UND TRANSFER AB/BIS\n FLUGHAFEN\n F368966  HERR EIDAM, KLAUS               54\n F368966  FRAU EIDAM, SONJA               54"; 

$str2 = "***ÄNDERUNG ***\nNEU:11.04.15 DUS  AB 2646  13:15   16:25\n    18.04.15 FNC  DUS  AB 2647  17:15\n   FNC027    PESTANA CARLTON MADEIRA   9004-531 FUNCHAL\n 1  DO-MEERBLICK                       F\nF365474 HERR   PETERS, HANS                                O 03.01.15\nLANGZEITERMÄSSIGUNG 10%\nSPO-JAN_SALES 20%\nFRÜHBUCHER 10%\nINKL. REISELEITUNG UND TRANSFER AB/BIS\nFLUGHAFEN\nZimmer in ruhiger Lage\n(unverbindlicher Kundenwunsch)\nNEU:\nF365474 FRAU   PETERS, ULRIKE                              O 03.01.15"; 

preg_match($re, $str, $matches);
print_r($matches);
?>

https://ideone.com/UdIaA7

Regex with str: https://regex101.com/r/rF0uP7/5

Regex with str2: https://regex101.com/r/cV6iF9/1

It works perfectly for str, but it does not match str2, and I can't find the reason why.

Answer 1

"However it works perfectly for str it does not match in str2, and I can't find the reason why"

Here is the culprit expression: (\\w{3}).+\\w{3}

It requires two three-letter codes after the date. In $str you had: 24.01.15 TXL FNC AB

But in $str2 you had only one: 11.04.15 DUS AB
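The difference can be reproduced in isolation. The following is only a sketch: it assumes a shortened fragment of the original pattern (date, culprit, flight number) and uses the first line of each input; the variable names are made up for illustration.

```php
<?php
// Shortened fragment of the original pattern: date, culprit, flight number.
$culprit = '/\d{2}.\d{2}.\d{2}.+(\w{3}).+\w{3}.+(\w{2}\s\d{4})/';

$line1 = ' 24.01.15  TXL  FNC  AB 2306  11:40   15:20'; // two airport codes
$line2 = 'NEU:11.04.15 DUS  AB 2646  13:15   16:25';    // one airport code

// Between the date and the flight number the fragment needs at least
// 1 + 3 + 1 + 3 + 1 = 9 characters; $line2 only has 6 (" DUS  ").
var_dump(preg_match($culprit, $line1)); // int(1)
var_dump(preg_match($culprit, $line2)); // int(0)
```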

Your regex could read better like so:

$re = "#(\d{2}\.\d{2}\.\d{2})(?:\s+(\w{3}))?\s+\w{3}\s+(\w{2}\s\d{4}).+(\d{2}:\d{2}\n)\s+(\d{2}\.\d{2}\.\d{2}).+(\w{2}\s\d{4})\s+(\d{2}:\d{2}\n).+((FNC|PXO)\d{3})#si";

Quick-Test.

Answer 2

The .+(\\w{3}) in the beginning must be optional. Wrap it with (?:.+(\\w{3}))? .

See the regex demo

Also, you have too many .+ ; in most places you meant to match only whitespace, so they are better turned into \\s+ . Dots that are meant to match literal dots must also be escaped.

Use a more optimized pattern:
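To see why the escaping matters, here is a small sketch using a made-up input string (it is not from the question) where the separators are letters rather than dots:

```php
<?php
// Hypothetical input: digits separated by letters instead of dots.
$notADate = '24x01y15';

$unescaped = '/\d{2}.\d{2}.\d{2}/';   // "." matches any character except a newline
$escaped   = '/\d{2}\.\d{2}\.\d{2}/'; // "\." matches only a literal dot

var_dump(preg_match($unescaped, $notADate)); // int(1)
var_dump(preg_match($escaped, $notADate));   // int(0)
var_dump(preg_match($escaped, '24.01.15'));  // int(1)
```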

(\d{2}\.\d{2}\.\d{2})(?:\s+(\w{3}))?\s+\w{3}\s+(\w{2}\s\d{4}).+(\d{2}:\d{2}\n)\s+(\d{2}\.\d{2}\.\d{2}).+(\w{2}\s\d{4})\s+(\d{2}:\d{2}\n).+((FNC|PXO)\d{3})

See this regex demo
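As a rough check, the pattern can be run against both inputs. This sketch makes a few assumptions: the inputs are abbreviated to the three lines the pattern actually spans, the pattern is written as a single-quoted string with / delimiters and no flags, and extractBooking and its array keys are hypothetical names chosen for illustration.

```php
<?php
// Pattern from the answer; single quotes pass the backslashes to PCRE unchanged.
$re = '/(\d{2}\.\d{2}\.\d{2})(?:\s+(\w{3}))?\s+\w{3}\s+(\w{2}\s\d{4}).+(\d{2}:\d{2}\n)\s+(\d{2}\.\d{2}\.\d{2}).+(\w{2}\s\d{4})\s+(\d{2}:\d{2}\n).+((FNC|PXO)\d{3})/';

// Abbreviated inputs: only the two flight lines and the hotel line.
$str  = " 24.01.15  TXL  FNC  AB 2306  11:40   15:20\n"
      . " 31.01.15  FNC  TXL  AB 2307  16:05\n"
      . " FNC044  RESIDENCIAL VILA LUSITANI    9000-120 FUNCHAL\n";
$str2 = "NEU:11.04.15 DUS  AB 2646  13:15   16:25\n"
      . "    18.04.15 FNC  DUS  AB 2647  17:15\n"
      . "   FNC027    PESTANA CARLTON MADEIRA   9004-531 FUNCHAL\n";

// Hypothetical helper: returns the captured groups under readable keys,
// or null when the pattern does not match.
function extractBooking(string $re, string $text): ?array
{
    if (preg_match($re, $text, $m) !== 1) {
        return null;
    }
    return [
        'date1'   => $m[1],
        'code'    => $m[2],        // empty string when the optional group is skipped
        'flight1' => $m[3],
        'time1'   => trim($m[4]),  // the capture includes the trailing "\n"
        'date2'   => $m[5],
        'flight2' => $m[6],
        'time2'   => trim($m[7]),
        'hotel'   => $m[8],
    ];
}

print_r(extractBooking($re, $str));
print_r(extractBooking($re, $str2));
```

Both calls return an array rather than null. For the second input the optional group does not participate, so preg_match reports it as an empty string. Groups 4 and 7 contain the newline because \n sits inside the parentheses, hence the trim() calls.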


2 answers

Answer 1
2 2016-09-15 13:44:02

Answer 2
1 Accepted 2016-09-15 13:44:28
