
simplexml php get xml


If I have a document like this:

<!-- in doc.xml -->
<a>
  <b>
    greetings?
    <c>hello</c>
    <d>goodbye</d>
  </b>
</a>

Is there any way to use SimpleXML (or any PHP built-in, really) to get a string containing:

greetings?
<c>hello</c>
<d>goodbye</d>

Whitespace and such doesn't matter.

Thanks!

I must admit this wasn't as simple as one would think. This is what I came up with:

$xml = new DOMDocument;
$xml->load('doc.xml');

// find just the <b> node(s)
$xpath = new DOMXPath($xml);
$results = $xpath->query('/a/b');

// get entire <b> node as text
$node = $results->item(0);
$text = $xml->saveXML($node);

// remove encapsulating <b></b> tags
$text = preg_replace('#^<b>#', '', $text);
$text = preg_replace('#</b>$#', '', $text);

echo $text;

Regarding the XPath query

The query returns all matching nodes, so if there are multiple matching <b> tags, you can loop through $results to get them all.
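Looping over the result list might look like the sketch below. The inline sample document is an assumption made so the snippet runs without doc.xml; it is not the document from the question.

```php
<?php
// Sample document inlined for illustration (assumed, not the question's doc.xml).
$xml = new DOMDocument;
$xml->loadXML('<a><b>one<c>1</c></b><b>two<c>2</c></b></a>');

$xpath = new DOMXPath($xml);
$results = $xpath->query('/a/b');

// DOMNodeList is traversable, so every matching <b> can be serialized in turn
// instead of only item(0).
$serialized = [];
foreach ($results as $node) {
    $serialized[] = $xml->saveXML($node);
}

print_r($serialized);
```

Each entry still carries its enclosing <b> tags, so the same stripping step would apply to every element of the array.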

My query for '/a/b' assumes that <a> is the root and <b> is its direct child. You could alter it for different scenarios (see any XPath reference for the full syntax). Some adjustments might include:

  • '//a/b': <b> is a child of <a>, but <a> can be anywhere, not just at the root
  • '//a//b': <b> is a descendant of <a> at any depth, not just a direct child
  • '//b': all <b> nodes anywhere in the document

Regarding the method of obtaining the string contents

I tried using $node->nodeValue or $node->textContent, but both of them strip out the <c> and <d> tags, leaving just their text contents. I also tried casting it as a DOMText object, but that didn't directly work and was more trouble than it was worth.
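The difference is easy to reproduce. In this minimal sketch the document is inlined without whitespace (an assumption, to keep the results easy to read):

```php
<?php
// Same structure as doc.xml, inlined and without whitespace for readability.
$doc = new DOMDocument;
$doc->loadXML('<a><b>greetings?<c>hello</c><d>goodbye</d></b></a>');
$node = $doc->getElementsByTagName('b')->item(0);

// textContent (and nodeValue) concatenate the text of all descendants,
// so the <c> and <d> markup is lost.
$textOnly = $node->textContent;

// saveXML() with a node argument serializes that node with its markup intact.
$withTags = $doc->saveXML($node);

echo $textOnly, "\n", $withTags, "\n";
```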

Regarding the use of regular expressions

It could be done without regex, but I found it easiest to use them. I wanted to make sure that I only stripped the <b> and </b> at the very beginning and end of the string, just in case there were other <b> nodes within the contents.
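A quick way to check the anchoring is to run the two patterns on a string that contains a nested <b>. The sample string and the str_replace() comparison are assumptions added for illustration. Note that the opening pattern only matches a bare <b> tag; an opening tag with attributes would need a looser pattern.

```php
<?php
// Hypothetical serialized node that itself contains a nested <b>.
$original = '<b>outer <b>inner</b> tail</b>';

// Anchored patterns: ^ and $ restrict the match to the very start and end.
$anchored = preg_replace('#^<b>#', '', $original);
$anchored = preg_replace('#</b>$#', '', $anchored);

// Unanchored replacement, for comparison: it removes every <b> and </b>.
$naive = str_replace(['<b>', '</b>'], '', $original);

echo $anchored, "\n", $naive, "\n";
```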

How about this? Since you already know the XML format:

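<?php
$xml = simplexml_load_file('doc.xml');

// Text directly inside <b>; child elements are not included in the cast.
$str = (string) $xml->b;

// Rebuild the known child elements by hand.
$str .= "<c>".$xml->b->c."</c>";
$str .= "<d>".$xml->b->d."</d>";

echo $str;
?>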

Here's an alternative using DOM (to balance the SimpleXML answers!) that outputs all of the contents of the first <b> element.

$doc = new DOMDocument;
$doc->load('doc.xml');
$bee = $doc->getElementsByTagName('b')->item(0);

$innerxml = '';
foreach ($bee->childNodes as $node) {
    $innerxml .= $doc->saveXML($node);
}
echo $innerxml;
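If the rest of the code already works with SimpleXML, the same loop can be wrapped in a function and reached through dom_import_simplexml(). This is only a sketch: innerXml() is a hypothetical helper name, not something from the answers above, and the document is inlined so the snippet runs on its own.

```php
<?php
// Hypothetical helper: concatenates the serialized child nodes of an element,
// which includes text nodes as well as child elements.
function innerXml(DOMNode $element): string
{
    $out = '';
    foreach ($element->childNodes as $child) {
        $out .= $element->ownerDocument->saveXML($child);
    }
    return $out;
}

// Inlined copy of the document (assumed; whitespace removed for readability).
$sx = simplexml_load_string('<a><b>greetings?<c>hello</c><d>goodbye</d></b></a>');

// dom_import_simplexml() exposes the SimpleXML element as a DOMElement.
$bee = dom_import_simplexml($sx->b);
$inner = innerXml($bee);

echo $inner;
```

Going through DOM matters here because iterating a SimpleXML element's children yields only child elements, so the leading "greetings?" text would be skipped.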
