I'm generating XML from InDesign and would like to parse the XML in PHP. Below is a sample of the XML that InDesign is generating:
<?xml version="1.0" encoding="UTF-8"?>
<Root>
    <page title="About Us">
        About Us
        <page>Overiew</page>
        <page>Where We Started</page>
        <page>Help</page>
    </page>
    <page>
        Automobiles
        <page>
            Cars
            <page>Small</page>
            <page>Medium</page>
            <page>Large</page>
        </page>
        <page>
            Trucks
            <page>Flatbet</page>
            <page>
                Pickup
                <page>Dodge</page>
                <page>Nissan</page>
            </page>
        </page>
    </page>
</Root>
I'm using the following PHP code to parse the XML recursively.
header('Content-type: text/plain');

function parse_recursive(SimpleXMLElement $element, $level = 0)
{
    $indent = str_repeat("\t", $level); // determine how much we'll indent
    $value = trim((string) $element); // get the value and trim any whitespace from the start and end
    $attributes = $element->attributes(); // get all attributes
    $children = $element->children(); // get all children

    echo "{$indent}Parsing '{$element->getName()}'...".PHP_EOL;

    if(count($children) == 0 && !empty($value)) // only show value if there is any and if there aren't any children
    {
        echo "{$indent}Value: {$element}".PHP_EOL;
    }

    // only show attributes if there are any
    if(count($attributes) > 0)
    {
        echo $indent.'Has '.count($attributes).' attribute(s):'.PHP_EOL;
        foreach($attributes as $attribute)
        {
            echo "{$indent}- {$attribute->getName()}: {$attribute}".PHP_EOL;
        }
    }

    // only show children if there are any
    if(count($children))
    {
        echo $indent.'Has '.count($children).' child(ren):'.PHP_EOL;
        foreach($children as $child)
        {
            parse_recursive($child, $level+1); // recursion :)
        }
    }

    echo $indent.PHP_EOL; // just to make it "cleaner"
}

$xml = new SimpleXMLElement('data.xml', null, true);
parse_recursive($xml);
The issue is that when I parse the XML, I don't get the text value of a page node unless that node contains only text and no child page nodes. So, for example, I have no way of reading "About Us" unless I look at the title attribute (if it exists). The same applies to "Automobiles", "Cars", and "Trucks".
Again, this is generated XML from InDesign. I could ask designers to add attributes to nodes, etc. but I'm trying to minimize the amount of data entry.
I believe the XML is well formed. Any help would be greatly appreciated.
Your code ignores an element's text value whenever the element has any children. To change that, replace:
if(count($children) == 0 && !empty($value)) // only show value if there is any and if there aren't any children
{
    echo "{$indent}Value: {$element}".PHP_EOL;
}
with
if(!empty($value)) // only show value if there is any
{
    echo "{$indent}Value: {$value}".PHP_EOL;
}
The result with the sample data is then:
Parsing 'Root'...
Has 2 child(ren):
    Parsing 'page'...
    Value: About Us
    Has 1 attribute(s):
    - title: About Us
    Has 3 child(ren):
        Parsing 'page'...
        Value: Overiew
        Parsing 'page'...
        Value: Where We Started
        Parsing 'page'...
        Value: Help
    Parsing 'page'...
    Value: Automobiles
    Has 2 child(ren):
        Parsing 'page'...
        Value: Cars
        Has 3 child(ren):
            Parsing 'page'...
            Value: Small
            Parsing 'page'...
            Value: Medium
            Parsing 'page'...
            Value: Large
        Parsing 'page'...
        Value: Trucks
        Has 2 child(ren):
            Parsing 'page'...
            Value: Flatbet
            Parsing 'page'...
            Value: Pickup
            Has 2 child(ren):
                Parsing 'page'...
                Value: Dodge
                Parsing 'page'...
                Value: Nissan
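The change works because casting a SimpleXMLElement to a string yields only the text sitting directly inside that element, not the text of its child elements. Here is a minimal sketch of that behavior, using an inline XML string instead of data.xml. The helper name own_text and the sample string are made up for illustration. It also compares against '' instead of calling empty(), since empty('0') is true in PHP and a page whose text is "0" would otherwise be skipped:

```php
<?php
// Hypothetical helper: returns the text directly inside $element,
// ignoring text that belongs to child elements.
function own_text(SimpleXMLElement $element): string
{
    return trim((string) $element);
}

// Illustrative data only; not the InDesign export from the question.
$xml = new SimpleXMLElement(
    '<Root><page title="About Us">About Us<page>Overview</page><page>0</page></page></Root>'
);

$about = $xml->page[0];
echo own_text($about), PHP_EOL; // text of the outer page only

foreach ($about->page as $child) {
    $text = own_text($child);
    // Strict comparison: empty('0') would be true and hide this value.
    if ($text !== '') {
        echo "- {$text}", PHP_EOL;
    }
}
```

If page text can never be "0", the !empty($value) check from the answer behaves the same way.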
Of course, I struggled with this, but as soon as I asked the question I found the answer. Anyway, this approach worked (top answer):
How to get a specific node text using php DOM
I'm wondering if there's any other way, though.
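One other way, sketched here with PHP's DOM extension, is to walk an element's child nodes and keep only the text nodes. This is not necessarily what the linked answer does. The helper name direct_text and the inline XML are assumptions made for illustration:

```php
<?php
// Hypothetical helper: concatenates the text (and CDATA) nodes that are
// direct children of $node, skipping text inside nested elements.
function direct_text(DOMNode $node): string
{
    $text = '';
    foreach ($node->childNodes as $child) {
        if ($child->nodeType === XML_TEXT_NODE
            || $child->nodeType === XML_CDATA_SECTION_NODE) {
            $text .= $child->nodeValue;
        }
    }
    return trim($text);
}

// Illustrative data only; swap in $doc->load('data.xml') for a real file.
$doc = new DOMDocument();
$doc->loadXML(
    '<Root><page>Automobiles<page>Cars</page><page>Trucks</page></page></Root>'
);

// Visit every page element in document order and print its own text.
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//page') as $page) {
    echo direct_text($page), PHP_EOL;
}
```

Compared with the SimpleXML string cast, this makes the "direct text only" rule explicit, at the cost of a little more code.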