Problems parsing XML and collecting children with xml-stream

I am trying to parse very large XML files with xml-stream in a Node.js script.

xml-stream can be found here: https://github.com/assistunion/xml-stream

<?xml version="1.0"?>
<Products xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="en-us" version="0.96" versionTimestamp="2012-02-07T03:00:00Z" fooKey="6402420af51e08">
    <Product>
        <id>296834</id>
        <name>Thing</name>
        <Photos>
            <Photo>
                <MediaURL>http://url.com/to/image/file</MediaURL>
            </Photo>
            <Photo>
                <MediaURL>http://url.com/to/image/secondfile</MediaURL>
            </Photo>
            <Photo>
                <MediaURL>http://url.com/to/image/thirdfile</MediaURL>
            </Photo>
        </Photos>
    </Product>
</Products>

And my Node.js code looks like this:

var fs        = require('fs')
, path      = require('path')
, XmlStream = require('xml-stream')
;

// Create a file stream and pass it to XmlStream
var stream = fs.createReadStream(path.join(__dirname, 'samplekirby.xml'));
var xml = new XmlStream(stream);

xml.preserve('Product', true);
xml.collect('Photos');
xml.on('endElement: Product', function(item) {
   console.log(item);
});

The output:

{ '$children': 
[ { '$children': [Object], '$text': '296834', '$name': 'id' },
 { '$children': [Object], '$text': 'Thing', '$name': 'name' },
 { '$children': [Object], Photo: [Object], '$name': 'Photos' } ],
id: { '$children': [ '296834' ], '$text': '296834', '$name': 'id' },
name: { '$children': [ 'Thing' ], '$text': 'Thing', '$name': 'name' },
Photos: 
{ '$children': [ [Object], [Object], [Object] ],
 Photo: { '$children': [Object], MediaURL: [Object], '$name': 'Photo' },
 '$name': 'Photos' },
'$name': 'Product' }

How do I get the image URLs?

I have tried .collect() and .preserve() on various nodes in various orders. There don't seem to be many more complicated usage examples for this library. My XML files are very large, and xml2js couldn't handle them. I would be happy with this library if I could figure out how to make it go deeper into the tree.

If you just want to get the URLs, collect the repeated Photo element rather than its Photos wrapper, and drop the preserve() call:

var fs = require('fs'),
    path = require('path'),
    XmlStream = require('xml-stream');

// Create a file stream and pass it to XmlStream
var stream = fs.createReadStream(path.join(__dirname, 'sample.xml'));
var xml = new XmlStream(stream);

xml.collect('Photo');
xml.on('endElement: Product', function(product) {
    console.log(JSON.stringify(product, null, 2));
});

Output:

{
  "id": "296834",
  "name": "Thing",
  "Photos": {
    "Photo": [
      {
        "MediaURL": "http://url.com/to/image/file"
      },
      {
        "MediaURL": "http://url.com/to/image/secondfile"
      },
      {
        "MediaURL": "http://url.com/to/image/thirdfile"
      }
    ]
  }
}
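
To go from that object to a flat list of URLs, a small helper along these lines might be enough. This is only a sketch: extractMediaUrls is a hypothetical name (it is not part of xml-stream), and it assumes the product object has the shape shown in the output above. The sample object is copied from that output so the snippet runs on its own, without the XML file.

```javascript
// Hypothetical helper (not part of xml-stream): flattens a parsed Product
// object of the shape shown above into a plain array of URL strings.
function extractMediaUrls(product) {
  // Photos or Photo may be missing on products that have no images.
  var photos = (product && product.Photos && product.Photos.Photo) || [];

  // Defensive: wrap a lone object so the function also works if Photo
  // was not collected into an array.
  if (!Array.isArray(photos)) {
    photos = [photos];
  }

  return photos
    .map(function (photo) { return photo.MediaURL; })
    .filter(function (url) { return typeof url === 'string'; });
}

// Sample object copied from the output above.
var sample = {
  id: '296834',
  name: 'Thing',
  Photos: {
    Photo: [
      { MediaURL: 'http://url.com/to/image/file' },
      { MediaURL: 'http://url.com/to/image/secondfile' },
      { MediaURL: 'http://url.com/to/image/thirdfile' }
    ]
  }
};

console.log(extractMediaUrls(sample));
```

Inside the 'endElement: Product' handler, the same call would be extractMediaUrls(product) in place of the JSON.stringify line.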
