How can I retrieve data from a nested XML node using PHP?

I am new to XML and data retrieval, and I have a problem with this code.

XML code:

<?xml version="1.0" encoding="UTF-8"?>
<site>
    <page>
        <content>
            <P>
                <FONT size="2" face="Tahoma">
                    <STRONG>text...</STRONG>
                </FONT>
            </P>
            <P>
                <FONT size="2" face="Tahoma">text....</FONT>
            </P>

            <P align="center">
                <IMG style="WIDTH: 530px" border="1" alt="" src="http://www.alkul.com/online/2014/5/6/child%20disorder.jpg">
            </P>
            <P>
                <STRONG>
                    <FONT size="2" face="Tahoma">text3</FONT>
                </STRONG>
            </P>
            <P>
                <STRONG>
                    <FONT size="2" face="Tahoma">text1</FONT>
                </STRONG>
            </P>
        </content>
    </page>
</site>

PHP code:

<?php
$html = "";
$url  = "Data.xml";
$xml  = simplexml_load_file($url);    

for ($i = 0; $i<10; $i++) {     
    $title = $xml->page[$i]->content->P->FONT;
    $html .= "<p>$title</p>";
}

echo $html;

I just need to display the contents of the content node, but the output is empty.

First of all, the provided XML is not well-formed, so you should receive the following error:

Warning: simplexml_load_string(): Entity: line 8: parser error : Opening and ending tag mismatch: IMG line 8 and P

In XML the IMG element needs to be closed like this:

<IMG style="WIDTH: 530px" border="1" alt="" src="http://www.alkul.com/online/2014/5/6/child%20disorder.jpg"/>

Note the forward slash at the end of the element.
If you do not see that error, please look in your error log or enable error reporting in PHP.
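
If the warning does not show up anywhere, one option is to ask libxml to collect the parse errors so you can print them yourself. This is only a minimal sketch: the shortened $broken string is a made-up sample that reproduces the unclosed IMG element, not your actual file.

```php
<?php
// Collect libxml parse errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical sample: the IMG element is never closed, as in the question.
$broken = '<site><page><content><P><IMG src="a.jpg"></P></content></page></site>';

$xml = simplexml_load_string($broken);

$messages = [];
if ($xml === false) {
    foreach (libxml_get_errors() as $error) {
        // Each entry is a LibXMLError object with line, column and message.
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }
    libxml_clear_errors();
}

echo implode(PHP_EOL, $messages), PHP_EOL;
```

The same check works with simplexml_load_file(), which also returns false when the document cannot be parsed.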

Now the XML can be parsed by SimpleXML. I ended up with this:

$pList = $xml->xpath('./page/content/P');
foreach ($pList as $pElement) {
    $text = strip_tags($pElement->asXML());
    echo $text . "<br>";
}

It selects all the P elements into $pList and iterates over the list. For each element it takes the XML and strips all tags from it, leaving you with just the "inner text" for each element.
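
If you would rather not serialise to a string and strip tags, a possible alternative (a different technique from the snippet above) is to read textContent through the DOM extension. The sketch below assumes the DOM extension is enabled and uses a shortened copy of your XML with the IMG element closed; the sample data and the $texts variable are only for illustration.

```php
<?php
// Sample data: the question's structure, shortened, with IMG self-closed.
$source = <<<XML
<site>
    <page>
        <content>
            <P><FONT size="2" face="Tahoma"><STRONG>text...</STRONG></FONT></P>
            <P align="center"><IMG alt="" src="child.jpg"/></P>
            <P><STRONG><FONT size="2" face="Tahoma">text3</FONT></STRONG></P>
        </content>
    </page>
</site>
XML;

$xml = simplexml_load_string($source);

$texts = [];
foreach ($xml->xpath('./page/content/P') as $pElement) {
    // textContent concatenates all descendant text nodes of the element.
    $text = trim(dom_import_simplexml($pElement)->textContent);
    if ($text !== '') {
        // Skip paragraphs that contain no text, such as the image-only one.
        $texts[] = $text;
    }
}

echo implode("<br>", $texts);
```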

Lastly, I'd suggest you use the PHP Simple HTML DOM Parser as it is quite easy to use and more tailored towards scraping data from HTML.

If you only want to display what is in the content node, here is the code:

<?php
$html = "";
$url  = "data.xml";
$xml  = simplexml_load_file($url);

$title = $xml->page->content->asXML();
$html  .= "<p>$title</p>";

echo $html;
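
Keep in mind that asXML() called on the content node also returns the surrounding <content> tags. If you only want the markup inside it, one way is to serialise each child element separately. This is a rough sketch with made-up sample data; it assumes content holds only child elements, since any text placed directly inside content would be skipped.

```php
<?php
// Hypothetical, shortened sample document.
$source = '<site><page><content><P>one</P><P><STRONG>two</STRONG></P></content></page></site>';
$xml = simplexml_load_string($source);

// asXML() on the node itself includes the <content> wrapper...
$withWrapper = $xml->page->content->asXML();

// ...so serialise each child element on its own to get only the inner markup.
$inner = '';
foreach ($xml->page->content->children() as $child) {
    $inner .= $child->asXML();
}

echo $inner;
```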

You have HTML inside an XML node. It needs to be XML-encoded, which is normally done with a CDATA block. You can then just use the $xml->page->content element with echo or by casting it to a string.

XML (take note of the <![CDATA[ ... ]]> part):

<?xml version="1.0" encoding="UTF-8"?>
<site>
    <page>
        <content><![CDATA[
            <P>
                <FONT size="2" face="Tahoma">
                    <STRONG>text...</STRONG>
                </FONT>
            </P>
            <P>
                <FONT size="2" face="Tahoma">text....</FONT>
            </P>

            <P align="center">
                <IMG style="WIDTH: 530px" border="1" alt="" src="http://www.alkul.com/online/2014/5/6/child%20disorder.jpg">
            </P>
            <P>
                <STRONG>
                    <FONT size="2" face="Tahoma">text3</FONT>
                </STRONG>
            </P>
            <P>
                <STRONG>
                    <FONT size="2" face="Tahoma">text1</FONT>
                </STRONG>
            </P>
        ]]></content>
    </page>
</site>

PHP:

$xml = simplexml_load_file($url);

$firstTenPages = new LimitIterator(new IteratorIterator($xml->page), 0, 10);

foreach ($firstTenPages as $page)
{
    echo $page->content;
}
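
To illustrate the string cast on a single page, here is a small self-contained sketch. The one-line CDATA sample is made up for the example; note that the IMG tag does not need to be closed here, because the parser treats everything inside CDATA as plain text.

```php
<?php
// Hypothetical sample: HTML wrapped in CDATA, with an unclosed IMG tag.
$source = <<<XML
<site>
    <page>
        <content><![CDATA[<P align="center"><IMG alt="" src="child.jpg"></P>]]></content>
    </page>
</site>
XML;

$xml = simplexml_load_string($source);

// Casting the element to string returns the CDATA text verbatim.
$markup = (string) $xml->page->content;

echo $markup;
```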
