Assign regex pattern as key to an array
I have an array of regular expressions and am trying to loop through a text document to find the first pattern, assign the match as a key of an array, then continue on to find the second pattern and assign its matches as the value. Whenever I come across pattern 1, I want it to be assigned as a key, and all pattern 2 matches that follow, until I come across a new key, should be assigned to that key as values.
Text document structure:
Subject: sometext
Email: someemail@email.com
source: www.google.com www.stackoverflow.com www.reddit.com
So I have an array of expressions:
$expressions = array(
    'email' => '(\b[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b)',
    'url' => '([A-Za-z][A-Za-z0-9+.-]{1,120}:[A-Za-z0-9/](([A-Za-z0-9$_.+!*,;/?:@&~=-])|%[A-Fa-f0-9]{2}){1,333}(#([a-zA-Z0-9][a-zA-Z0-9$_.+!*,;/?:@&~=%-]{0,1000}))?)'
);
I want to loop through my text document, match the email address and assign it as a key of an array, then assign all URLs that follow as its values, so the output for the above text would be:
array(
    'someemail@email.com' => array(
        0 => 'www.google.com',
        1 => 'www.stackoverflow.com',
        2 => 'www.reddit.com'
    )
)
One way to do such a thing:
$parts = preg_split("/(emailexpr)/", $txt, -1, PREG_SPLIT_DELIM_CAPTURE);
$res = array();
// note: $parts[0] will be everything preceding the first emailexpr match
for ( $i = 1; isset($parts[$i]); $i += 2 )
{
    $email = $parts[$i];
    $chunk = $parts[$i+1];
    if ( preg_match_all("/domainexpr/", $chunk, $match) )
    {
        $res[$email] = $match[0];
    }
}
Replace emailexpr and domainexpr with your regexp gibberish.
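To make the preg_split approach concrete, here is a minimal runnable sketch. The function name groupUrlsByEmail and both patterns are assumptions made for illustration: they are simplified stand-ins chosen to fit the sample text (bare www. host names), not the expressions from the question.

```php
<?php
// Hypothetical helper: groups every URL-like token under the email that precedes it.
function groupUrlsByEmail(string $txt): array
{
    // Assumed, simplified patterns (neither contains "/", so "/" is a safe delimiter).
    $emailExpr  = '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}';
    $domainExpr = '\bwww\.[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)+\b';

    // The capture group makes preg_split keep each matched email in the result.
    $parts = preg_split('/(' . $emailExpr . ')/', $txt, -1, PREG_SPLIT_DELIM_CAPTURE);

    $res = array();
    // Odd indexes hold emails; the next even index holds the text that follows.
    for ($i = 1; isset($parts[$i]); $i += 2) {
        if (preg_match_all('/' . $domainExpr . '/', $parts[$i + 1], $match)) {
            $res[$parts[$i]] = $match[0];
        }
    }
    return $res;
}

$txt = "Subject: sometext\n"
     . "Email: someemail@email.com\n"
     . "source: www.google.com www.stackoverflow.com www.reddit.com\n";
print_r(groupUrlsByEmail($txt));
```

Note that an email with no URLs after it gets no entry at all, and if the same email appears twice the later chunk overwrites the earlier one; whether that is acceptable depends on your data.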
I would do:
// FILE_SKIP_EMPTY_LINES only takes effect together with FILE_IGNORE_NEW_LINES
$lines = file('input_file', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$array = array();
$email = ''; // no current key until a valid Email: line is seen
foreach($lines as $line) {
    if(preg_match('/^Subject:/', $line)) {
        // a new record starts: forget the previous email
        $email = '';
    } elseif(preg_match('/^Email: (.*)$/', $line, $m)) {
        if(preg_match($expressions['email'], $m[1])) {
            $email = $m[1];
        }
    } elseif(preg_match('/^source: (.*)$/', $line, $m) && $email) {
        foreach(explode(' ', $m[1]) as $url) {
            if(preg_match($expressions['url'], $url)) {
                $array[$email][] = $url;
            }
        }
    }
}
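One thing worth checking before running this against the sample text: the url expression from the question requires a scheme followed by a colon, so bare host names such as www.google.com are rejected and nothing is stored for them. A small check, with $urlExpr as an assumed variable name holding that same pattern:

```php
<?php
// The 'url' expression from the question, copied unchanged.
// Its outer parentheses act as the regex delimiters here.
$urlExpr = '([A-Za-z][A-Za-z0-9+.-]{1,120}:[A-Za-z0-9/](([A-Za-z0-9$_.+!*,;/?:@&~=-])|%[A-Fa-f0-9]{2}){1,333}(#([a-zA-Z0-9][a-zA-Z0-9$_.+!*,;/?:@&~=%-]{0,1000}))?)';

var_dump(preg_match($urlExpr, 'www.google.com'));        // int(0): no "scheme:" part
var_dump(preg_match($urlExpr, 'http://www.google.com')); // int(1)
```

If the source lines really contain scheme-less hosts, the pattern would need to be relaxed, or the hosts prefixed with a scheme, before the loop collects anything.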