Parsing XML with PHP regular expressions

Question

How can I use a regular expression to parse XML?

Suppose we have the following:

$string = '<z>1a<z>2b</z>3c<z>4d</z>5e</z>';
preg_match_all('/<z>(.+)<\/z>/', $string, $result_a);
preg_match_all('/<z>(.+)<\/z>/U', $string, $result_b);
preg_match_all($regex, $string, $result_x);

If I run that, $result_a will contain this string (among the items of the array):

'1a<z>2b</z>3c<z>4d</z>5e'

In addition, $result_b will contain these strings (among the items of the array):

'1a<z>2b'
'4d'

Now I want $result_x to contain '2b' and '4d' as separate items of the array.

What should $regex look like?

Thanks in advance!

Answer 1

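A non-greedy quantifier alone is not enough here. It behaves like the U modifier, so the following still returns '1a<z>2b' and '4d':

'/<z>(.+?)<\/z>/'
     ___^

Instead, replace the dot with a negated character class, so the match cannot run across another tag: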

'/<z>([^z]+)<\/z>/'

or, so that text containing the letter z can still match:

'/<z>([^<>]+?)<\/z>/'
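
To see how these patterns differ, here is a minimal sketch that runs each of them against the string from the question. The labels and the $results array are names made up for this illustration only.

```php
<?php
// Sample input from the question.
$string = '<z>1a<z>2b</z>3c<z>4d</z>5e</z>';

// The three patterns from this answer; the labels are illustrative only.
$patterns = [
    'lazy dot'     => '/<z>(.+?)<\/z>/',
    'not z'        => '/<z>([^z]+)<\/z>/',
    'not brackets' => '/<z>([^<>]+?)<\/z>/',
];

$results = [];
foreach ($patterns as $label => $regex) {
    preg_match_all($regex, $string, $matches);
    // $matches[1] holds the text captured by the first group.
    $results[$label] = $matches[1];
    echo $label, ': ', implode(', ', $matches[1]), PHP_EOL;
}
```

On this input the lazy dot still captures '1a<z>2b' and '4d', while both character-class patterns capture '2b' and '4d'.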

or, much more conveniently, use an XML parser.
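
As one possible sketch of the parser route, assuming the SimpleXML extension is available and the input is well-formed XML, an XPath query can select the z elements that have no z child. The choice of SimpleXML and of this XPath expression is an assumption for illustration; the answer does not name a specific parser.

```php
<?php
// Sample input from the question.
$string = '<z>1a<z>2b</z>3c<z>4d</z>5e</z>';

$xml = simplexml_load_string($string);
if ($xml === false) {
    throw new RuntimeException('Input is not well-formed XML');
}

// Select every <z> element that does not contain another <z> element.
$leaves = $xml->xpath('//z[not(z)]') ?: [];

// Cast each element to a string to get its text content.
$result_x = array_map(function ($element) {
    return (string) $element;
}, $leaves);

print_r($result_x);
```

Unlike the regex patterns, this approach also copes with attributes and whitespace inside the tags.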

Answer 2

In this case you can use this alternative regex, which does not rely on a non-greedy quantifier:

'/<z>([^<]+)<\/z>/'

[^<] matches any single character except <, so the captured text cannot contain a nested tag.
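
A minimal sketch of this pattern in use follows. The helper innermost_text() is a hypothetical name introduced here, and it assumes tags without attributes, as in the question.

```php
<?php
// Hypothetical helper: returns the text of every <tag>...</tag> pair
// whose content contains no '<', that is, the innermost elements.
function innermost_text(string $tag, string $input): array
{
    // Escape the tag name so it is safe inside the pattern.
    $t = preg_quote($tag, '/');
    preg_match_all('/<' . $t . '>([^<]+)<\/' . $t . '>/', $input, $matches);
    return $matches[1];
}

$string = '<z>1a<z>2b</z>3c<z>4d</z>5e</z>';
$result_x = innermost_text('z', $string);
print_r($result_x);
```

Because [^<]+ requires at least one character, an empty element such as <z></z> is not matched by this sketch.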

2 answers

Answer 1
3 2011-09-16 11:23:47

Answer 2
3 2011-09-16 11:30:43
