Improving a working regex to match multiple lines
I'm trying to match users from an old DOS dump so they can be migrated to something new. Each user begins with a % sign and ends with a ]. Some are on one line and others span many lines.
https://regex101.com/r/0h5ndW/1
My regex %([^\%]*)] works, but is there a better way to select each user from the % to the ] (including the % and the ]) so I can put them through preg_replace and manipulate them later?
I'm a little skeptical about the multi-line part.
Expected Output
%user:100 [ type=admin, added=10/12/1997, last-login:10/20/1997, total-logins:45, status:1 ]
%user:111 [ type=user, added=10/12/1997, last-login:10/27/1997, total-logins:145, status:1 ]
%user:112 [ type=viewer, added=10/12/1997, last-login:10/23/1997, total-logins:6, status:1 ]
%user:113 [ type=viewer, added=10/12/1997, last-login:10/14/1997, total-logins:2, status:1]
%user:114 [ type=viewer, added=10/12/1997, last-login:10/14/1997, total-logins:1, status:1]
%user:115 [ type=viewer, added=10/12/1997, last-login:10/12/1997, total-logins:1, status:1 ]
Raw Data
%user:100 [
type=admin,
added=10/12/1997,
last-login:10/20/1997,
total-logins:45,
status:1
]
%user:111 [
type=user,
added=10/12/1997,
last-login:10/27/1997,
total-logins:145,
status:1
]
%user:112 [ type=viewer, added=10/12/1997,
last-login:10/23/1997,
total-logins:6,
status:1
]
%user:113 [ type=viewer, added=10/12/1997, last-login:10/14/1997, total-logins:2, status:1]
%user:114 [ type=viewer, added=10/12/1997, last-login:10/14/1997, total-logins:1,
status:1]
%user:115 [ type=viewer, added=10/12/1997, last-login:10/12/1997, total-logins:1,
status:1
]
You can use this regex for search:
((?:^%|(?!\A)\G).*)\R(?=[^][]*])
and replace it with:
$1
Updated RegEx Demo
PHP Code:
$repl = preg_replace('/((?:^%|(?!\A)\G).*)\R(?=[^][]*])/m', '$1', $str);
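To try this outside regex101, here is a minimal self-contained sketch of the same call. The sample string is made up for illustration (two shortened records with "\n" line endings); only the pattern and the replacement come from the answer above.

```php
<?php
// Hypothetical sample: two shortened records in the question's format.
$str = "%user:100 [\ntype=admin,\nstatus:1\n]\n%user:113 [ type=viewer, status:1]\n";

// Same pattern and replacement as above: remove each line break that sits
// inside a record, and keep the line breaks between records.
$repl = preg_replace('/((?:^%|(?!\A)\G).*)\R(?=[^][]*])/m', '$1', $str);

echo $repl;
// %user:100 [type=admin,status:1]
// %user:113 [ type=viewer, status:1]
```

Note that the line breaks are removed rather than replaced, so the fields of a multi-line record are joined without a space between them.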
RegEx Details:
( : Start capture group #1
(?:^%|(?!\A)\G) : Match % at line start or restart matching from the end of the previous match
\G : Asserts position at the end of the previous match or the start of the string for the first match
.* : Match everything in the same line
) : End capture group #1
\R : Match any kind of newline sequence
(?=[^][]*]) : Make sure we have a ] ahead without matching [ or ] in between
Another option is to use a variant of the pattern that you tried, with a negated character class, to match the % and then everything from the opening [ till the closing ].
Then remove the newlines from each match.
^%[^][]*\[[^][]*]$
Explanation
^ : Start of string (start of a line when the m flag is set)
%[^][]* : Match % and 0+ times any char other than [ or ]
\[[^][]*] : Match from [ till the closing ]
$ : Assert end of string (end of a line when the m flag is set)
Regex demo | PHP demo
For example
$result = preg_replace_callback("/^%[^][]*\[[^][]*]$/m", function($m) {
return str_replace(PHP_EOL, "", $m[0]);
}, $data);
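A hedged variant of the same idea, in case the dump uses "\r\n" line endings (an assumption; the question does not say). As far as I can tell, under PCRE's default newline setting $ with the m flag only matches right before "\n", so a ] followed by "\r\n" would not match. This sketch normalises the line endings first, and replaces each inner line break with a space instead of an empty string, which seems to line up with the Expected Output as long as the lines are not indented. The sample string is made up.

```php
<?php
// Hypothetical sample in the question's format, with assumed "\r\n" endings.
$data = "%user:112 [ type=viewer,\r\nstatus:1\r\n]\r\n%user:113 [ type=viewer, status:1]\r\n";

// Normalise every newline sequence to "\n" so that $ (with the m flag)
// can match directly after the closing ].
$data = preg_replace('/\R/', "\n", $data);

$result = preg_replace_callback('/^%[^][]*\[[^][]*]$/m', function ($m) {
    // Within one record, turn each line break into a single space.
    return str_replace("\n", ' ', $m[0]);
}, $data);

echo $result;
// %user:112 [ type=viewer, status:1 ]
// %user:113 [ type=viewer, status:1]
```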
As an alternative to regex, this just splits the data on the ]. It then trims each chunk, puts the ] back, and replaces newlines (using PHP_EOL) with a space...
$output = explode("]", $data);
array_pop($output);
array_walk($output, function(&$data) {
$data = str_replace(PHP_EOL, " ", trim($data)."]");
});
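For a quick check, here is a self-contained sketch of the same split-and-trim idea. It swaps in array_map and \R for array_walk and PHP_EOL, so the result does not depend on the platform's line ending; that swap and the sample string are my own assumptions, not part of the answer above.

```php
<?php
// Hypothetical sample: two shortened records in the question's format.
$data = "%user:100 [\ntype=admin,\nstatus:1\n]\n%user:113 [ type=viewer, status:1]\n";

// Split on ], then drop the leftover piece after the last ].
$output = explode("]", $data);
array_pop($output);

// Trim each chunk, put its ] back, and turn any newline sequence into a space.
$output = array_map(function ($chunk) {
    return preg_replace('/\R/', ' ', trim($chunk) . "]");
}, $output);

print_r($output);
// [0] => %user:100 [ type=admin, status:1]
// [1] => %user:113 [ type=viewer, status:1]
```

The result is an array with one entry per user; implode("\n", $output) would give a single string again. This only works as long as ] never appears inside a record.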