Perl XML to CSV Parse

I have been trying to figure out how to parse an XML data source into a CSV file, and it's driving me a little crazy. I also need to include the node ID (the id attribute of each record) as a column. Here is what I have:

        #!/usr/bin/perl
        use warnings;
        use strict;
        use XML::XPath;

        #Name of the CSV File
        my $filename = "parse.csv";

        #Create the file.
        open(INPUT,">$filename") or die "Cannot create file";

        #Collect the XML and set nodes
        my($xp) = XML::XPath->new( join('', <DATA>) );
        my(@records) = $xp->findnodes( '/CATALOG/CD' );
        my($firstTime) = 0;

        #Loop through each record
        foreach my $record ( @records ) {
            my(@fields) = $xp->find( './child::*', $record )->get_nodelist();
            unless ( $firstTime++ ) {
                #Print Headers
                print( join( ',', map { $_->getName() } @fields ), "\n");
            }
            #Print Content
            print( join( ',', map { $_->string_value() } @fields ), "\n");
        }
        #Close the file.
        close(INPUT);


        __DATA__
        <FOOD>
            <ITEM id='1'>
                <Color>Brown</Color>
                <Name>Steak</Name>
            </ITEM>
            <ITEM id='2'>
                <Color>Blue</Color>
                <Name>Blueberries</Name>
            </ITEM>
            <ITEM id='3'>
                <Color>Red</Color>
                <Name>Apple</Name>
            </ITEM>
        </FOOD>

It creates a CSV file, but it's empty, and I think it's because of the print lines in the foreach loop.

Any help would be greatly appreciated!

You are printing your headers and content to standard output, not to your output file. You need to pass the file handle as the first argument to print, with no comma between it and the list you want to print. Something like: print FILE join(',', ...), "\n";

I would also recommend not naming the file handle INPUT when you are writing to it; it makes the code confusing to read.
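To illustrate the point about passing the handle to print, here is a minimal sketch that uses a lexical file handle and the three-argument form of open. The helper name write_rows and the handle name $out are made up for this example; the file name and sample rows come from the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): writes a header row
# and any number of data rows to a comma-separated file.
sub write_rows {
    my ($filename, $header, @rows) = @_;

    # Three-argument open with a lexical handle instead of a bareword like INPUT.
    open(my $out, '>', $filename) or die "Cannot create $filename: $!";

    # The handle comes first, and there is NO comma after it.
    print {$out} join(',', @$header), "\n";
    print {$out} join(',', @$_), "\n" for @rows;

    close($out) or die "Cannot close $filename: $!";
    return;
}

write_rows('parse.csv', ['Color', 'Name'], ['Brown', 'Steak'], ['Blue', 'Blueberries']);
```

Wrapping the handle in braces (print {$out} ...) is optional here, but it makes it obvious which argument is the handle and which is the data.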

Given the simplicity of the XML schema, this is easier to do with AnyData.

For instance:

#!/usr/bin/perl
# This script converts an XML file to CSV format.
use strict;
use warnings;

# Load AnyData and the XML modules its XML format relies on
use XML::Parser;
use XML::Twig;
use AnyData;

my $input_xml  = "test.xml";
my $output_csv = "test.csv";

# Treat each <ITEM> element as one record (one CSV row)
my $flags = { record_tag => 'ITEM' };
adConvert( 'XML', $input_xml, 'CSV', $output_csv, $flags );

This would convert your XML data into:

id,Color,Name
1,Brown,Steak
2,Blue,Blueberries
3,Red,Apple

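Your XPath expression is /CATALOG/CD, which does not match your data: the root element is FOOD and the records are ITEM elements, so findnodes returns nothing. You also need to print to the file handle. Use something like: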

my(@records) = $xp->findnodes( '/FOOD/ITEM' );
....
...
...
print INPUT ( join( ',', map { $_->getName() } @fields ), "\n" );
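Putting the fixes from the answers together, the following is a rough sketch of a corrected script that also adds the id attribute as the first column, which the question asked for. It assumes XML::XPath is installed; the helper name items_to_csv and the inlined $xml string are choices made for this example, not part of the original code. It also assumes every ITEM has the same child elements in the same order.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::XPath;

# Sample data from the question, inlined so the sketch is self-contained.
my $xml = <<'XML';
<FOOD>
    <ITEM id='1'>
        <Color>Brown</Color>
        <Name>Steak</Name>
    </ITEM>
    <ITEM id='2'>
        <Color>Blue</Color>
        <Name>Blueberries</Name>
    </ITEM>
    <ITEM id='3'>
        <Color>Red</Color>
        <Name>Apple</Name>
    </ITEM>
</FOOD>
XML

# Hypothetical helper: builds the CSV text for every /FOOD/ITEM record.
sub items_to_csv {
    my ($xml_text) = @_;
    my $xp          = XML::XPath->new( xml => $xml_text );
    my $csv         = '';
    my $header_done = 0;

    foreach my $record ( $xp->findnodes('/FOOD/ITEM') ) {
        # Child elements of this ITEM, in document order.
        my @fields = $xp->findnodes( './*', $record );

        # Emit the header once, with "id" as the first column.
        unless ( $header_done++ ) {
            $csv .= join( ',', 'id', map { $_->getName() } @fields ) . "\n";
        }

        # The id is an attribute of ITEM, not a child element.
        my $id = $record->getAttribute('id');
        $id = '' unless defined $id;
        $csv .= join( ',', $id, map { $_->string_value() } @fields ) . "\n";
    }
    return $csv;
}

my $filename = 'parse.csv';
open( my $out, '>', $filename ) or die "Cannot create $filename: $!";
print {$out} items_to_csv($xml);
close($out) or die "Cannot close $filename: $!";
```

A plain join(',') does no quoting, so a value that itself contains a comma or a quote would produce a malformed row. For data like that, a CSV module such as Text::CSV is the safer choice.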
