
Parsing a large XML file in Node.js

I have an XML file that is larger than 70 MB. I would like to parse this data in Node.js so I can eventually build data visualizations from it. To start, I thought it would be best to use JSON instead of XML, because Node.js is better suited to working with JSON. So I planned to use the xml2json node module to convert the XML into JSON, but I can't seem to read the XML file into a variable because it's so large. I attempted to do this with the following code.

var fs = require('fs');


fs.readFile(__dirname + '/xml/ipg140114.xml', 'utf8', function(err, data, parseXml) {
    if(err) {
        return console.log(err);
    } 
});

I receive a stack trace error. What's a better way to get this file converted into JSON so I can parse it with Node? I am pretty new to Node, so let me know if my approach is wrong. Thanks in advance!

xml2json requires you to load the entire file into memory. You could allocate more memory, but I would recommend parsing the XML directly from the file instead.

There are other libraries on npm, such as xml-stream, that will allow you to parse the XML directly from the file without loading it all into memory.

My personal issue with xml-stream is that it relies on GYP, which can be a hassle if you're a Windows user. I added a very basic parser called no-gyp-xml-stream to npm; it only depends on sax. It's a bit rudimentary, though, and may not suit your needs.
I am, however, willing to improve it if anyone needs anything: https://www.npmjs.com/package/no-gyp-xml-stream
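
To make "parsing directly from the file" a bit more concrete, here is a rough, dependency-free sketch of the general idea. It is not the API of xml-stream, sax, or no-gyp-xml-stream. It reads the file in chunks with fs.createReadStream and hands each complete record to a callback, so only one record is held in memory at a time. The helper name forEachRecord, the tag name you pass in, and the chunkSize parameter are all made up for illustration, and it assumes the records are non-nested elements with an explicit closing tag (no self-closing records, no CDATA containing the closing tag). A real streaming parser handles those cases properly.

```javascript
'use strict';
const fs = require('fs');

// Hypothetical helper (illustration only): streams filePath in chunks and
// calls onRecord(xmlString) for every complete <tagName ...>...</tagName>.
// Assumes records of this name are not nested inside each other.
// Resolves with the number of records seen.
function forEachRecord(filePath, tagName, onRecord, chunkSize = 64 * 1024) {
  // Escape regex metacharacters, since XML names may contain "." or "-".
  const escaped = tagName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // Opening tag must be followed by whitespace or ">" so that
  // "<record" does not match "<records>".
  const openRe = new RegExp('<' + escaped + '[\\s>]');
  const closeTag = '</' + tagName + '>';

  return new Promise((resolve, reject) => {
    let buffer = '';
    let count = 0;
    // With an encoding set, the stream decodes multi-byte characters
    // correctly even when they straddle a chunk boundary.
    const stream = fs.createReadStream(filePath, {
      encoding: 'utf8',
      highWaterMark: chunkSize,
    });

    stream.on('data', (chunk) => {
      buffer += chunk;
      let end;
      // Emit every record that is now complete in the buffer.
      while ((end = buffer.indexOf(closeTag)) !== -1) {
        const stop = end + closeTag.length;
        const start = buffer.search(openRe);
        if (start !== -1 && start < end) {
          onRecord(buffer.slice(start, stop));
          count += 1;
        }
        // Drop everything up to and including this record.
        buffer = buffer.slice(stop);
      }
    });
    stream.on('error', reject);
    stream.on('end', () => resolve(count));
  });
}
```

With the file from the question, a call would look something like forEachRecord(__dirname + '/xml/ipg140114.xml', 'record', handler), where 'record' is a placeholder for whatever element actually wraps one entry in your data. Each string passed to the handler is small enough to convert to JSON on its own.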


 