
Perl XML to CSV Parse

I have been trying to figure out how to parse an XML data source into a CSV file, and it's driving me a little crazy. I have a data source that I need to parse and turn into a CSV. I also need to be able to include the node ID as a column. Here is what I have:

        #!/usr/bin/perl
        use warnings;
        use strict;
        use XML::XPath;

        #Name of the CSV File
        my $filename = "parse.csv";

        #Create the file.
        open(INPUT,">$filename") or die "Cannot create file";

        #Collect the XML and set nodes
        my($xp) = XML::XPath->new( join('', <DATA>) );
        my(@records) = $xp->findnodes( '/CATALOG/CD' );
        my($firstTime) = 0;

        #Loop through each record
        foreach my $record ( @records ) {
            my(@fields) = $xp->find( './child::*', $record )->get_nodelist();
            unless ( $firstTime++ ) {
            #Print Headers
                print( join( ',', map { $_->getName() } @fields ), "\n");
            }
            #Print Content
                print( join( ',', map { $_->string_value() } @fields ), "\n");
        }
        #Close the file.
        close(INPUT);


        __DATA__
        <FOOD>
            <ITEM id='1'>
                <Color>Brown</Color>
                <Name>Steak</Name>
            </ITEM>
            <ITEM id='2'>
                <Color>Blue</Color>
                <Name>Blueberries</Name>
            </ITEM>
            <ITEM id='3'>
                <Color>Red</Color>
                <Name>Apple</Name>
            </ITEM>
        </FOOD>

It creates a CSV, but it's empty, and I think it's because of the print lines in the foreach loop.

Any help would be greatly appreciated!

You are printing your headers and content to standard output, not to your output file. You need to pass the file handle as the first argument to print, with no comma between it and what you want to print. Something like: print FILE join(',', ...), "\n";

I would also recommend not using INPUT as the name of the file handle you are writing to - it makes the code confusing to read.
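
To make the point about the file handle concrete, here is a minimal sketch using only core Perl. The lexical handle name $out and the sample rows are placeholders of mine, not part of the original script, and the three-argument open is a stylistic choice rather than something the fix requires.

```perl
use strict;
use warnings;

# File name taken from the question.
my $filename = 'parse.csv';

# Three-argument open with a lexical handle; $! explains why open failed.
open( my $out, '>', $filename ) or die "Cannot create $filename: $!";

# Note: no comma between the handle and the list being printed.
# The rows here are placeholder data for illustration only.
print $out join( ',', 'Color', 'Name' ),  "\n";
print $out join( ',', 'Brown', 'Steak' ), "\n";

close($out) or die "Cannot close $filename: $!";
```

If you write print( join(...), "\n" ) with no handle, the output goes to STDOUT, which is why the file ends up empty.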

Given the simplicity of the XML schema, this is easier to do with AnyData.

For instance:

#!/usr/bin/perl
# This script converts an XML file to CSV format.

# Load the AnyData XML to CSV conversion modules
use XML::Parser;
use XML::Twig;
use AnyData;

my $input_xml = "test.xml";
my $output_csv = "test.csv";


my $flags = { record_tag => 'ITEM' };
adConvert( 'XML', $input_xml, 'CSV', $output_csv, $flags );

This would convert your data structure (XML) into:

id,Color,Name
1,Brown,Steak
2,Blue,Blueberries
3,Red,Apple

In your case, the XPath is /CATALOG/CD, which does not match your data: the root element is FOOD and the records are ITEM elements. Please use something like

my(@records) = $xp->findnodes( '/FOOD/ITEM' );
....
...
...
print INPUT ( join( ',', map { $_->getName() } @fields ), "\n" );
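
Putting the suggestions from the answers together, a complete version could look roughly like the sketch below. Treat it as one possible arrangement rather than the definitive fix. Things I have assumed that are not in the original: the helper name xml_to_csv_lines, reading the id attribute with findvalue('@id', ...), inlining the XML in a heredoc instead of __DATA__, and that every ITEM has the same children in the same order with no commas, quotes, or newlines in the values (no CSV escaping is done).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::XPath;

# Hypothetical helper: returns the CSV lines (header first, no newlines)
# for the FOOD/ITEM structure shown in the question.
sub xml_to_csv_lines {
    my ($xml) = @_;
    my $xp = XML::XPath->new( xml => $xml );
    my @lines;

    foreach my $record ( $xp->findnodes('/FOOD/ITEM') ) {
        my @fields = $xp->find( './child::*', $record )->get_nodelist();

        # Emit the header once, before the first data row.
        push @lines, join( ',', 'id', map { $_->getName() } @fields )
            unless @lines;

        # The id attribute becomes the first column of each row.
        my $id = $xp->findvalue( '@id', $record );
        push @lines, join( ',', "$id", map { $_->string_value() } @fields );
    }
    return @lines;
}

# Sample data copied from the question.
my $xml = <<'XML';
<FOOD>
    <ITEM id='1'>
        <Color>Brown</Color>
        <Name>Steak</Name>
    </ITEM>
    <ITEM id='2'>
        <Color>Blue</Color>
        <Name>Blueberries</Name>
    </ITEM>
    <ITEM id='3'>
        <Color>Red</Color>
        <Name>Apple</Name>
    </ITEM>
</FOOD>
XML

my $filename = 'parse.csv';
open( my $out, '>', $filename ) or die "Cannot create $filename: $!";
print $out "$_\n" for xml_to_csv_lines($xml);
close($out) or die "Cannot close $filename: $!";
```

If your real data can contain commas or quotes, you would want a proper CSV writer instead of a plain join; that is outside what this sketch covers.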
