I have a json and I need to match all "text" keys as well as the "html" keys.
For example, the json could be like below:
[{
"layout":12,
"text":"Lorem",
"html":"<div>Ipsum</div>"
}]
Or it could be like below:
[{
"layout":12,
"settings":{
"text":"Lorem",
"atts":{
"html":"<div>Ipsum</div>"
}
}
}]
The JSON does not always use the same structure, so I have to match the keys and get their values using preg_match_all. I have tried the following to get the value of the "text" key:
preg_match_all('|"text":"([^"]*)"|',$json,$match_txt,PREG_SET_ORDER);
The above works fine for matching a single key. When it comes to matching a second key ("html" in this case) it just doesn't work. I have tried the following:
preg_match_all('|"text|html":"([^"]*)"|',$json,$match_txt,PREG_SET_ORDER);
Can you please give me some hints why the OR operator (text|html) doesn't work? Strangely, the above (multi-pattern) regex works fine when I test it in an online tester but it doesn't work in my php files.
You should add text|html to a group. Alternation has the lowest precedence, so without the group the pattern looks for either "text or html":"([^"]*)".
|"(text|html)":"([^"]*)"|
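This won't work with your current delimiters, though, because the pipe ( | ) also appears inside the expression: PHP treats the second pipe as the closing delimiter and reads everything after it as modifiers. You should change your delimiters to something else; here I've used /.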
/"(text|html)":"([^"]*)"/
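To check the difference between the two delimiters, here is a minimal sketch. It assumes the first sample JSON from the question; the variable names $broken and $pairs are made up for illustration.

```php
<?php
// Sample input taken from the question.
$json = '[{"layout":12,"text":"Lorem","html":"<div>Ipsum</div>"}]';

// Pipe delimiters: the pattern ends at the second pipe, the rest is read
// as modifiers, and the call fails (@ only hides the warning here).
$broken = @preg_match_all('|"(text|html)":"([^"]*)"|', $json, $m, PREG_SET_ORDER);
var_dump($broken); // bool(false)

// Slash delimiters: group 1 holds the key, group 2 holds the value.
preg_match_all('/"(text|html)":"([^"]*)"/', $json, $matches, PREG_SET_ORDER);

$pairs = [];
foreach ($matches as $match) {
    $pairs[$match[1]] = $match[2];
}
print_r($pairs);
```

With PREG_SET_ORDER each entry of $matches is one complete match, so the key and its value sit side by side in the same entry.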
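Escaping the pipe inside the expression does not help if you keep the pipe as your delimiter. An escaped delimiter is matched as a literal character, so the following looks for the literal string text|html instead of treating the pipe as an OR:
|"(text\|html)":"([^"]*)"|
preg_quote() is no shortcut here either: it escapes every regex metacharacter (parentheses, brackets, the pipe), so the whole expression would be matched literally. Use it only for literal text that goes into a pattern, such as the key names:
$keys = implode('|', array_map(function ($k) { return preg_quote($k, '/'); }, ['text', 'html']));
preg_match_all('/"(' . $keys . ')":"([^"]*)"/', $json, $match_txt, PREG_SET_ORDER);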
Although that regex will work, it will need additional parsing and it makes more sense to use a recursive function for this.
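To illustrate where the regex needs extra care, here is a small sketch with a made-up input (not from the question) that is still valid JSON but has whitespace after a colon and escaped quotes inside a value. The variable names are hypothetical.

```php
<?php
// Valid JSON: a space after the first colon and escaped quotes in the values.
$json = '{"text": "say \"hi\"", "html":"<p>a \"b\"</p>"}';

// The regex expects no whitespace after the colon and stops at the first
// double quote, even when that quote is escaped.
preg_match_all('/"(text|html)":"([^"]*)"/', $json, $matches, PREG_SET_ORDER);
$regexValues = array_column($matches, 2, 1); // key => captured value

// json_decode() handles both cases.
$data = json_decode($json, true);

var_dump($regexValues);
var_dump($data);
```

Here the regex misses "text" entirely and cuts the "html" value short at the backslash, while json_decode() returns both values intact.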
json_decode() decodes a JSON string into the corresponding PHP data types. In the example below I've passed true as an additional argument, which means I get an associative array where I would normally get an object.
Once findKeyData() is called, it recursively calls itself and works through all of the data until it finds the specified key, then returns that key's value. If the key is not found, it returns null.
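// Recursively search $data and return the first value stored under $key, or null.
function findKeyData($data, $key) {
    foreach ($data as $k => $v) {
        // Strict comparison: with ==, a numeric key such as 0 equals any
        // non-numeric string before PHP 8.
        if ($k === $key) {
            return $v;
        }
        if (is_array($v)) {
            // Keep the result in its own variable instead of overwriting $data.
            $result = findKeyData($v, $key);
            if (! is_null($result)) {
                return $result;
            }
        }
    }
    return null;
}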
$json1 = json_decode('[{
"layout":12,
"text":"Lorem",
"html":"<div>Ipsum</div>"
}]', true);
$json2 = json_decode('[{
"layout":12,
"settings":{
"text":"Lorem",
"atts":{
"html":"<div>Ipsum</div>"
}
}
}]', true);
var_dump(findKeyData($json1, 'text')); // string(5) "Lorem"
var_dump(findKeyData($json1, 'html')); // string(16) "<div>Ipsum</div>"
var_dump(findKeyData($json2, 'text')); // string(5) "Lorem"
var_dump(findKeyData($json2, 'html')); // string(16) "<div>Ipsum</div>"
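Since the question asks for all "text" and "html" values, a variant that keeps going after the first hit may be closer to what is needed. The sketch below is only one way to do it; findAllKeyData() is a hypothetical helper, and the sample input is adapted from the question so that "text" appears twice.

```php
<?php
// Hypothetical helper: collects every value stored under one of $keys,
// at any nesting depth, grouped by key name.
function findAllKeyData(array $data, array $keys): array
{
    $found = [];
    foreach ($data as $k => $v) {
        // Strict check so numeric list indexes never match a string key.
        if (in_array($k, $keys, true)) {
            $found[$k][] = $v;
        }
        if (is_array($v)) {
            // Merge whatever the nested level found into this level's result.
            foreach (findAllKeyData($v, $keys) as $key => $values) {
                $found[$key] = array_merge($found[$key] ?? [], $values);
            }
        }
    }
    return $found;
}

$json = '[{"layout":12,"text":"Lorem","settings":{"text":"Dolor","atts":{"html":"<div>Ipsum</div>"}}}]';
$result = findAllKeyData(json_decode($json, true), ['text', 'html']);
print_r($result);
```

Grouping by key name keeps the structure-independent lookup while still returning every occurrence.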
preg_match_all('/"(?:text|html)":"([^"]*)"/',$json,$match_txt,PREG_SET_ORDER);
print $match_txt[0][0]." with group 1: ".$match_txt[0][1]."\n";
print $match_txt[1][0]." with group 1: ".$match_txt[1][1]."\n";
returns:
$ php -f test.php
"text":"Lorem" with group 1: Lorem
"html":"<div>Ipsum</div>" with group 1: <div>Ipsum</div>
The enclosing parentheses are needed: (?:text|html). Without them the alternation splits the whole pattern in two, and I couldn't get it to work on https://regex101.com either. ?: means the content of the parentheses will not be captured (i.e., not available in the results).
I also replaced the pipe ( | ) delimiter with forward slashes since you also have a pipe inside the regex. Escaping the pipe inside the regex is not an alternative: an escaped delimiter is matched as a literal pipe character, so |"(?:text\|html)":"([^"]*)"| would no longer act as an OR.
I don't see any reason to use a regex to parse a valid json string:
$data = json_decode($json, true); // array_walk_recursive() takes its array by reference, so pass a variable
array_walk_recursive($data, function ($v, $k) {
    if (in_array($k, ['text', 'html'], true)) {
        echo "$k -> $v\n";
    }
});
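If the values are needed later rather than printed, the callback can collect them instead. This is a sketch under the assumption that the second sample JSON from the question is the input; $found is a made-up name.

```php
<?php
// Second sample structure from the question.
$json = '[{"layout":12,"settings":{"text":"Lorem","atts":{"html":"<div>Ipsum</div>"}}}]';
$data = json_decode($json, true);

// Collect [key, value] pairs; $found is captured by reference so the
// closure can append to it.
$found = [];
array_walk_recursive($data, function ($v, $k) use (&$found) {
    if (in_array($k, ['text', 'html'], true)) {
        $found[] = [$k, $v];
    }
});
print_r($found);
```

Note that array_walk_recursive() only visits leaf values, so a "text" or "html" key whose value is itself an array or object would not be reported.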
You use the pipe | character as the delimiter, which breaks your regexp because the pattern itself contains a pipe. Does it work with another delimiter and the alternation grouped, like
preg_match_all('#"(?:text|html)":"([^"]*)"#',$json,$match_txt,PREG_SET_ORDER);
?