PHP preg_match_all on XML/GML output on multiple lines
I am trying to match multiple lines of XML/GML output from a WFS service with preg_match_all(). The data comes from a public server that anyone can use.
I tried the s and m flags, but with little luck. The data I receive looks like this:
<zwr:resultaat>
<zwr:objectBeginTijd>2012-09-18</zwr:objectBeginTijd>
<zwr:resultaatHistorie>
<zwr:datumInvoeren>2012-10-31</zwr:datumInvoeren>
<zwr:invoerder>
<zwr:voornaam>Joep</zwr:voornaam>
<zwr:achternaam>Koning, de</zwr:achternaam>
<zwr:email>jdekoning@hhdelfland.nl</zwr:email>
<zwr:telefoon>015-2608166</zwr:telefoon>
<zwr:organisatie>
<zwr:bedrijfsnaam>Hoogheemraadschap van Delfland</zwr:bedrijfsnaam>
<zwr:adres>
<zwr:huisnummer>32</zwr:huisnummer>
<zwr:postcode>2611AL</zwr:postcode>
<zwr:straat>Phoenixstraat</zwr:straat>
<zwr:woonplaats>DELFT</zwr:woonplaats>
</zwr:adres>
<zwr:email>info@hhdelfland.nl</zwr:email>
<zwr:telefoon>(015) 260 81 08</zwr:telefoon>
<zwr:website>http://www.hhdelfland.nl/</zwr:website>
</zwr:organisatie>
</zwr:invoerder>
</zwr:resultaatHistorie>
<zwr:risicoNiveau>false</zwr:risicoNiveau>
<zwr:numeriekeWaarde>0.02</zwr:numeriekeWaarde>
<zwr:eenheid>kubieke millimeter per liter</zwr:eenheid>
<zwr:hoedanigheid>niet van toepassing</zwr:hoedanigheid>
<zwr:kwaliteitsOordeel>Normale waarde</zwr:kwaliteitsOordeel>
<zwr:parameterGrootheid>
<zwr:grootheid>Biovolume per volume eenheid</zwr:grootheid>
<zwr:object>Microcystis</zwr:object>
</zwr:parameterGrootheid>
<zwr:analyseProces>
<zwr:analyserendeInstantie>AQUON</zwr:analyserendeInstantie>
</zwr:analyseProces>
</zwr:resultaat>
An example of the data can also be found at: http://212.159.219.98/zwr-ogc/services?SERVICE=WFS&VERSION=1.1.0&REQUEST=GetGmlObject&OUTPUTFORMAT=text%2Fxml%3B+subtype%3Dgml%2F3.1.1&TRAVERSEXLINKDEPTH=0&GMLOBJECTID=ZWR_MONSTERPUNT_304427
It is all in Dutch, but that should not matter for the question. I would like to search across multiple lines of this output and get the values between the tags.
I also tried reading each tag separately, which worked, but because the combination of tags varies (a tag is sometimes present and sometimes not), the values get mixed up and the fetched data loses its structure.
I thought it would be a good idea to match a whole set of tags at once so that the related values stay together. My current preg_match_all() code is:
preg_match_all("/<zwr:risicoNiveau>(.*)<\/zwr:risicoNiveau><zwr:numeriekeWaarde>(.*)<\/zwr:numeriekeWaarde><zwr:eenheid>(.*)<\/zwr:eenheid><zwr:hoedanigheid>(.*)<\/zwr:hoedanigheid>
<zwr:kwaliteitsOordeel>(.*)<\/zwr:kwaliteitsOordeel><zwr:parameterGrootheid><zwr:object>(.*)<\/zwr:object><zwr:grootheid>(.*)<\/zwr:grootheid><\/zwr:parameterGrootheid>/m", $content, $stof);
As you can see, I would like to read multiple values with one preg_match_all() call, which should give me an array containing multiple arrays.
How do I match several consecutive tags that are on different lines? When I var_dump() the result, I get a multidimensional array with no data in it.
The s and m flags do not work for me. Am I doing something wrong? Other methods in PHP are welcome!
1.) You need to allow for whitespace between the tags with \s*, since each tag sits on its own line:
<\/zwr:risicoNiveau>\s*<zwr:numeriekeWaarde>...
2.) Use .*? inside your capture groups for non-greedy matching:
<zwr:risicoNiveau>(.*?)<\/zwr:risicoNiveau>
3.) Improve the readability of the regex with the x flag (free-spacing mode).
Regex demo at regex101
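The three points above can be tried on a small input. The following is a minimal sketch, not the answerer's own code: the sample strings and variable names are made up for illustration, and only the tag names come from the question.

```php
<?php
// Hypothetical two-line sample in the same shape as the WFS output.
$sample = "<zwr:risicoNiveau>false</zwr:risicoNiveau>\n"
        . "  <zwr:numeriekeWaarde>0.02</zwr:numeriekeWaarde>";

// Without \s* the two tags must be directly adjacent, so the newline
// and indentation between them prevent any match.
$strict = '~<zwr:risicoNiveau>(.*?)</zwr:risicoNiveau><zwr:numeriekeWaarde>(.*?)</zwr:numeriekeWaarde>~s';

// With \s* between the tags. The x flag lets the pattern itself be
// spread over several lines; whitespace inside it is ignored.
$tolerant = '~
    <zwr:risicoNiveau>(.*?)</zwr:risicoNiveau> \s*
    <zwr:numeriekeWaarde>(.*?)</zwr:numeriekeWaarde>
~sx';

$strictCount   = preg_match_all($strict, $sample, $strictOut, PREG_SET_ORDER);
$tolerantCount = preg_match_all($tolerant, $sample, $tolerantOut, PREG_SET_ORDER);

var_dump($strictCount, $tolerantCount);
print_r($tolerantOut);

// Greedy versus non-greedy on two sibling elements (hypothetical values).
$two = "<zwr:eenheid>a</zwr:eenheid>\n<zwr:eenheid>b</zwr:eenheid>";
preg_match_all('~<zwr:eenheid>(.*)</zwr:eenheid>~s', $two, $greedy);
preg_match_all('~<zwr:eenheid>(.*?)</zwr:eenheid>~s', $two, $lazy);

print_r($greedy[1]); // one capture that swallows both elements
print_r($lazy[1]);   // one capture per element
```

With the s flag the greedy (.*) runs to the last closing tag in the input, which is why a single match ends up holding several values.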
Note: Use the exclusion ([^<]*?) rather than (.*?) to enforce the format, so that a capture group cannot run across a tag boundary. To match the remaining tags, put the optional quantifier ? on tags that may be absent, for example an optional <zwr:object> element.
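As a rough illustration of the note, the sketch below wraps the <zwr:object> element in an optional non-capturing group and uses [^<]* for the values. The lazy ? is left off because a negated class already stops at the next <. The sample data and the $rows layout are assumptions for illustration, and the second block's value is invented.

```php
<?php
// Hypothetical sample: the first block has <zwr:object>, the second does not.
$sample = <<<XML
<zwr:parameterGrootheid>
<zwr:grootheid>Biovolume per volume eenheid</zwr:grootheid>
<zwr:object>Microcystis</zwr:object>
</zwr:parameterGrootheid>
<zwr:parameterGrootheid>
<zwr:grootheid>Temperatuur</zwr:grootheid>
</zwr:parameterGrootheid>
XML;

// [^<]* cannot cross a tag boundary; (?: ... )? makes the whole
// <zwr:object> element optional.
$pattern = '~
    <zwr:parameterGrootheid> \s*
    <zwr:grootheid>([^<]*)</zwr:grootheid> \s*
    (?: <zwr:object>([^<]*)</zwr:object> \s* )?
    </zwr:parameterGrootheid>
~x';

$rows = [];
if (preg_match_all($pattern, $sample, $out, PREG_SET_ORDER) > 0) {
    foreach ($out as $m) {
        $rows[] = [
            'grootheid' => $m[1],
            // The optional group may be missing from the match set.
            'object'    => $m[2] ?? null,
        ];
    }
}
print_r($rows);
```

Each row keeps the values of one block together, which addresses the mixed-up data described in the question.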
$pattern = '~
<zwr:risicoNiveau>(.*?)</zwr:risicoNiveau>\s*
<zwr:numeriekeWaarde>(.*?)</zwr:numeriekeWaarde>\s*
<zwr:eenheid>(.*?)</zwr:eenheid>\s*
<zwr:hoedanigheid>(.*?)</zwr:hoedanigheid>\s*
<zwr:kwaliteitsOordeel>(.*?)</zwr:kwaliteitsOordeel>\s*
<zwr:parameterGrootheid>\s*
<zwr:grootheid>(.*?)</zwr:grootheid>\s*
<zwr:object>(.*?)</zwr:object>\s*
</zwr:parameterGrootheid>
~sx';
PREG_SET_ORDER orders results so that $matches[0] is an array of the first set of matches, $matches[1] is an array of the second set of matches, and so on. Read more in the PHP manual.
if(preg_match_all($pattern, $str, $out, PREG_SET_ORDER) > 0)
print_r($out);
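To see what PREG_SET_ORDER changes, the sketch below runs the same pattern with both orderings. The input string, the second value pair, and the variable names are assumptions made for illustration.

```php
<?php
// Hypothetical input with two value/unit pairs.
$str = "<zwr:numeriekeWaarde>0.02</zwr:numeriekeWaarde>\n"
     . "<zwr:eenheid>kubieke millimeter per liter</zwr:eenheid>\n"
     . "<zwr:numeriekeWaarde>1.5</zwr:numeriekeWaarde>\n"
     . "<zwr:eenheid>milligram per liter</zwr:eenheid>";

$pattern = '~<zwr:numeriekeWaarde>([^<]*)</zwr:numeriekeWaarde>\s*<zwr:eenheid>([^<]*)</zwr:eenheid>~';

// Default PREG_PATTERN_ORDER: one array per capture group.
preg_match_all($pattern, $str, $byGroup);

// PREG_SET_ORDER: one array per match, so related values stay together.
preg_match_all($pattern, $str, $bySet, PREG_SET_ORDER);

foreach ($bySet as $set) {
    // Index 0 is the full match; 1 and 2 are the capture groups.
    [, $waarde, $eenheid] = $set;
    printf("%s %s\n", $waarde, $eenheid);
}
```

Iterating over sets is usually the more convenient shape when each match represents one record.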