How to read only part of an XML file with PHP XMLReader

Question

I have an RSS XML file that is fairly large, with more than 700 nodes. I am using the XMLReader Iterator library to parse it and display the results 10 per page.

This is my sample code for parsing the XML:

<?php
require('xmlreader-iterators.php');

$xmlFile = 'http://www.example.com/rss.xml';
$reader = new XMLReader();
$reader->open($xmlFile);

$itemIterator = new XMLElementIterator($reader, 'item');
$items = array();

foreach ($itemIterator as $item) {
    $xml     = $item->asSimpleXML();
    $items[] = array(
        'title'     => (string)$xml->title,
        'link'      => (string)$xml->link
    );
}

// Logic for displaying the array values, based on the current page. 
// page = 1 means $items[0] to $items[9]

for($i = 0; $i <= 9; $i++)
{       
    echo '<a href="'.$items[$i]['link'].'">'.$items[$i]['title'].'</a><br>';      
}
?>

The problem is that, for every page, I parse the entire XML file and then display only the results for that page: for page 1 I display nodes 1 to 10, and for page 5 I display nodes 41 to 50.

This delays the display of the data. Is it possible to read just the nodes that correspond to the requested page? For the first page, I would like to read only the nodes at positions 1 to 10, instead of parsing the whole XML file and then displaying the first 10 nodes. In other words, can I apply a limit while parsing an XML file?

I came across this answer by Gordon that addresses a similar question, but it uses SimpleXML, which is not recommended for parsing large XML files.

Answer 1

Use array_slice to extract the portion of the array that belongs to the current page:

require ('xmlreader-iterators.php');

$xmlFile = 'http://www.example.com/rss.xml';
$reader = new XMLReader();
$reader->open($xmlFile);

$itemIterator = new XMLElementIterator($reader, 'item');
$items = array();

// Default to page 1 when the parameter is missing, zero, or not a number
$curr_page = isset($_GET['page']) ? max(1, (int) $_GET['page']) : 1;

$pages = 0;

$max = 10;

foreach ($itemIterator as $item) {
   $xml = $item->asSimpleXML();
   $items[] = array(
       'title' => (string) $xml->title,
       'link' => (string) $xml->link
  );
}

// Take the length of the array
$len = count($items);

// Get the number of pages
 $pages = ceil($len / $max);

// Calculate the starting point
$start = ceil(($curr_page - 1) * $max);

// return the portion of results
$arrayItem = array_slice($items, $start, $max);

// The last page may hold fewer than $max items, so iterate over the slice itself
foreach ($arrayItem as $entry) {
    echo '<a href="' . $entry['link'] . '">' . $entry['title'] . '</a><br>';
 }

 // pagination links
 $str = array();

 for ($i = 1; $i <= $pages; $i ++) {

   if ($i === (int) $curr_page) {
       // current page

       $str[] = sprintf('<span style="color:red">%d</span>', $i);
   } else {

      $str[] = sprintf('<a href="?page=%d" style="color:green">%d</a>', $i, $i);
  }
}
  echo implode('', $str);
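
The code above still parses every item before slicing. As a rough sketch of applying the limit during parsing instead, the following uses plain XMLReader (without the iterator library) to skip the items before the requested page and stop reading once the page is full. The helper name readItemsPage() and the in-memory demo feed are assumptions made for illustration; they do not come from the question or from any library.

```php
<?php
// Hypothetical helper (not part of XMLReader or the iterator library):
// returns the title/link pairs for one page and stops reading afterwards.
function readItemsPage(XMLReader $reader, $page, $perPage = 10)
{
    $offset = (max(1, (int) $page) - 1) * $perPage; // items to skip
    $items  = array();
    $index  = 0;

    // Move the cursor to the first <item> element.
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'item') {
            break;
        }
    }

    // Step from <item> to <item>; next() skips over each item's subtree.
    while ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'item') {
        if ($index >= $offset) {
            $xml     = simplexml_load_string($reader->readOuterXml());
            $items[] = array(
                'title' => (string) $xml->title,
                'link'  => (string) $xml->link,
            );
            if (count($items) >= $perPage) {
                break; // page is full, no need to read the rest
            }
        }
        $index++;
        if (!$reader->next('item')) {
            break; // no further <item> elements
        }
    }

    return $items;
}

// Demo on an in-memory feed with 25 items (a stand-in for the real RSS URL).
$feed = '<rss><channel><title>Demo</title>';
for ($n = 1; $n <= 25; $n++) {
    $feed .= "<item><title>Item $n</title><link>http://www.example.com/$n</link></item>";
}
$feed .= '</channel></rss>';

$reader = new XMLReader();
$reader->XML($feed);
$pageItems = readItemsPage($reader, 3);
$reader->close();

foreach ($pageItems as $item) {
    echo $item['title'], "\n";
}
```

Two limits of this approach: later pages still have to read past all earlier items, and the total number of pages is not known unless the whole file is read at least once.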

Answer 2

In this case, use a cache, because you cannot partially parse XML.
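
One possible shape for such a cache, as a minimal sketch: parse the feed once, store the resulting array in a local file, and serve pages from that file until it is older than a chosen lifetime. The helper name cachedItems(), the JSON file in the temp directory, and the 300-second lifetime are all assumptions for illustration; the demo loader stands in for the full XMLReader parse from the question.

```php
<?php
// Hypothetical helper: returns the cached items, rebuilding them via $loader
// when the cache file is missing, unreadable, or older than $ttl seconds.
function cachedItems($cacheFile, $ttl, $loader)
{
    clearstatcache(true, $cacheFile); // avoid stale file metadata
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttl) {
        $items = json_decode(file_get_contents($cacheFile), true);
        if (is_array($items)) {
            return $items;
        }
    }

    $items = call_user_func($loader); // the expensive full parse
    file_put_contents($cacheFile, json_encode($items), LOCK_EX);

    return $items;
}

// Demo loader standing in for the XMLReader loop from the question.
$calls  = 0;
$loader = function () use (&$calls) {
    $calls++;
    $items = array();
    for ($n = 1; $n <= 25; $n++) {
        $items[] = array('title' => "Item $n", 'link' => "http://www.example.com/$n");
    }
    return $items;
};

$cacheFile = sys_get_temp_dir() . '/rss-items-demo.json';
@unlink($cacheFile); // start the demo from a cold cache

$first  = cachedItems($cacheFile, 300, $loader); // parses and writes the cache
$second = cachedItems($cacheFile, 300, $loader); // served from the cache file
$page2  = array_slice($second, 10, 10);          // items 11 to 20

echo count($page2), " items on page 2, loader ran $calls time(s)\n";
```

With this arrangement only the first request after the cache expires pays for the full parse; every other page view reads the small cached array.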

Answer 3

Check this

<?php
// Zero-based page index; defaults to the first page.
$startPage = (isset($_GET['page']) && $_GET['page'] != "") ? max(0, (int) $_GET['page'] - 1) : 0;
$perPage = 10;
$currentRecord = 0;
$xml = new SimpleXMLElement('http://sports.yahoo.com/mlb/teams/bos/rss.xml', 0, true);

foreach ($xml->channel->item as $value) {
    $currentRecord += 1;

    // Print only the records that fall on the requested page.
    if ($currentRecord > ($startPage * $perPage) && $currentRecord <= ($startPage * $perPage + $perPage)) {
        echo "<a href=\"$value->link\">$value->title</a>";
        echo "<br>";
    }
}

// Pagination links; ceil() keeps the last, partially filled page.
for ($i = 1; $i <= ceil($currentRecord / $perPage); $i++) {
    echo("<a href='xmlpagination.php?page=".$i."'>".$i."</a>");
}
?>

Updated

Check this Link

http://www.phpclasses.org/package/5667-PHP-Parse-XML-documents-and-return-arrays-of-elements.html

Answer 4

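You can use DOM and XPath. It should be much faster, since XPath lets you select nodes by their position in a list, so you only loop over the items for the requested page. Note that DOMDocument still loads the whole document first.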

<?php  
$string = file_get_contents("http://oar.icrisat.org/cgi/exportview/subjects/s1=2E2/RSS2/s1=2E2.xml");


$dom = new DOMDocument('1.0', 'utf-8');
$dom->loadXML($string); 
$string = "";

$xpath = new DOMXPath($dom);

$channel = $dom->getElementsByTagName('channel')->item(0);

$numItems = $xpath->evaluate("count(item)", $channel); 
// get your paging logic; positions are 1-based and inclusive,
// so 11 to 20 selects the ten items of page 2

$start = 11;
$end = 20;

$items = $xpath->evaluate("item[position() >= $start and not(position() > $end)]", $channel);
$count = $start;
foreach($items as $item) {
    print_r("\r\n_____Node number $count ");
    print_r( $item->nodeName);
    $childNodes = $item->childNodes;
    foreach($childNodes as $childNode) { 
        print_r($childNode->nodeValue);
    }
    $count ++;
}
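
The paging logic left open in the snippet above could look roughly like this. It is a sketch under assumptions: pageBounds() is a hypothetical helper name, and the in-memory feed stands in for the downloaded RSS file. It turns a page number into the 1-based, inclusive position range that the XPath predicate expects.

```php
<?php
// Hypothetical helper: 1-based, inclusive XPath position range for a page,
// plus the total number of pages. The page number is clamped to a valid value.
function pageBounds($page, $perPage, $numItems)
{
    $pages = max(1, (int) ceil($numItems / $perPage));
    $page  = min(max(1, (int) $page), $pages);
    $start = ($page - 1) * $perPage + 1;
    $end   = min($page * $perPage, $numItems);

    return array($start, $end, $pages);
}

// Demo feed with 25 items (a stand-in for the real RSS document).
$feed = '<rss><channel><title>Demo</title>';
for ($n = 1; $n <= 25; $n++) {
    $feed .= "<item><title>Item $n</title><link>http://www.example.com/$n</link></item>";
}
$feed .= '</channel></rss>';

$dom = new DOMDocument();
$dom->loadXML($feed);
$xpath   = new DOMXPath($dom);
$channel = $dom->getElementsByTagName('channel')->item(0);

// count() yields a float in XPath, so cast it before doing integer math.
$numItems = (int) $xpath->evaluate('count(item)', $channel);

list($start, $end, $pages) = pageBounds(3, 10, $numItems);
$items = $xpath->query("item[position() >= $start and position() <= $end]", $channel);

foreach ($items as $item) {
    echo $item->getElementsByTagName('title')->item(0)->nodeValue, "\n";
}
```

Clamping the page number keeps an out-of-range request, such as page 99 of a three-page feed, on the last page instead of returning an empty result.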

4 answers

solution1
2 ACCEPTED 2013-09-14 10:03:54

solution2
1 2013-09-04 15:07:34

solution3
1 2013-09-06 05:00:52

solution4
1 2013-09-09 11:34:51
