
Reading XML file in Node.js

I'm learning how to use Node.js. At this time, I have an XML file that looks like this:

sitemap.xml

<?xml version="1.0" encoding="utf-8"?>

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"   xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
  <url>
    <loc>http://www.example.com</loc>
    <lastmod>2015-10-01</lastmod>
    <changefreq>monthly</changefreq>
  </url>

  <url>
    <loc>http://www.example.com/about</loc>
    <lastmod>2015-10-01</lastmod>
    <changefreq>never</changefreq>
  </url>

  <url>
    <loc>http://www.example.com/articles/tips-and-tricks</loc>
    <lastmod>2015-10-01</lastmod>
    <changefreq>never</changefreq>
    <article:title>Tips and Tricks</article:title>
    <article:description>Learn some of the tips-and-tricks of the trade</article:description>
  </url>
</urlset>

I am trying to load this XML in my Node app. When loaded, I want to get only the url elements that use the <article: elements. At this time, I'm stuck though. Right now, I'm using XML2JS via the following:

var parser = new xml2js.Parser();
fs.readFile(__dirname + '/../public/sitemap.xml', function(err, data) {
    if (!err) {
        console.log(JSON.stringify(data));
    }
});

When the console.log statement is executed, I just see a bunch of numbers in the console window. Something like this:

{"type":"Buffer","data":[60,63,120, ...]}

What am I missing?

Use xml2json:

https://www.npmjs.com/package/xml2json

var fs = require('fs');
var parser = require('xml2json');

fs.readFile('./data.xml', function(err, data) {
    var json = parser.toJson(data);
    console.log("to json ->", json);
});

From the documentation:

The callback is passed two arguments (err, data), where data is the contents of the file.

If no encoding is specified, then the raw buffer is returned.

If options is a string, then it specifies the encoding. Example:

 fs.readFile('/etc/passwd', 'utf8', callback);

You didn't specify an encoding, so you get the raw buffer.
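To see the difference concretely, here is a minimal sketch that reads the same file with and without an encoding. The temp file name is made up for the demo, and only Node's built-in modules are used. The first three bytes match the numbers shown in the question's output.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical temp file standing in for sitemap.xml
const file = path.join(os.tmpdir(), 'readfile-encoding-demo.xml');
fs.writeFileSync(file, '<?xml version="1.0" encoding="utf-8"?><urlset/>');

const raw = fs.readFileSync(file);          // no encoding given: a Buffer
const text = fs.readFileSync(file, 'utf8'); // encoding given: a string

console.log(Buffer.isBuffer(raw));            // true
console.log(typeof text);                     // string
console.log(Array.from(raw.subarray(0, 3)));  // [ 60, 63, 120 ], i.e. "<?x"
console.log(raw.toString('utf8') === text);   // true

fs.unlinkSync(file); // clean up the demo file
```

So if you already have a Buffer, calling toString('utf8') on it gives the same string as passing 'utf8' to the read call.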

@Sandburg mentioned xml-js in a comment and it worked best for me (several years after this question was asked). The others I tried were xml2json, which required some Windows SDK that I did not want to deal with, and xml2js, which did not provide an easy enough out-of-the-box way to search through attributes.

I had to pull out a specific attribute in an XML file 3 nodes deep, and xml-js did it with ease.

https://www.npmjs.com/package/xml-js

With the following example file stats.xml

<stats>
  <runs>
    <latest date="2019-12-12" success="100" fail="2" />
    <latest date="2019-12-11" success="99" fail="3" />
    <latest date="2019-12-10" success="102" fail="0" />
    <latest date="2019-12-09" success="102" fail="0" />
  </runs>
</stats>

I used xml-js to find the element /stats/runs/latest with attribute @date='2019-12-12' like so:

const convert = require('xml-js');
const fs = require('fs');

// read file
const xmlFile = fs.readFileSync('stats.xml', 'utf8');

// parse xml file as a json object
const jsonData = JSON.parse(convert.xml2json(xmlFile, {compact: true, spaces: 2}));

const targetNode = 

    // element '/stats/runs/latest'
    jsonData.stats.runs.latest

    .find(x => 

        // attribute '@date'
        x._attributes.date === '2019-12-12'
    );

// targetNode has the 'latest' node we want
// now output the 'fail' attribute from that node
console.log(targetNode._attributes.fail);  // outputs: 2

fs.readFile has an optional second parameter, which can be the encoding. If you do not include this parameter, it will return a Buffer object.

https://nodejs.org/api/fs.html#fs_fs_readfile_filename_options_callback

If you know the encoding, just use:

fs.readFile(__dirname + '/../public/sitemap.xml', 'utf8', function(err, data) {
    if (!err) {
        console.log(data);
    }
});

You can try this:

npm install express-xml-bodyparser --save

On the client side:

 $scope.getResp = function(){
     var posting = $http({
           method: 'POST',
           dataType: 'XML',
           url: '/getResp/'+$scope.user.BindData,//other bind variable
           data: $scope.project.XmlData,//xmlData passed by user
           headers: {
              "Content-Type" :'application/xml'
            },
           processData: true
           });
       posting.success(function(response){
       $scope.resp1 =  response;
       });
   };

On the server side:

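const xmlparser = require('express-xml-bodyparser');
app.use(xmlparser());
app.post('/getResp/:BindData', function(req, res, next) {
  var tid = req.params.BindData;
  var reqs = req.rawBody; // raw XML string as sent by the client
  console.log('Your XML ' + reqs);
  res.sendStatus(200); // end the request so the client is not left waiting
});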

You can also use a regex before parsing to remove elements that do not match your conditions:

var parser = new xml2js.Parser();
fs.readFile(__dirname + '/../public/sitemap.xml', "utf8",function(err, data) {
    // handle err...

    var re = new RegExp("<url>(?:(?!<article)[\\s\\S])*</url>", "gmi")
    data = data.replace(re, ""); // remove node not containing article node
    console.log(data);
    //... parse data ...



});

Example:

   var str = "<data><url><hello>abc</hello><moto>abc</moto></url><url><hello>bcd</hello></url><url><hello>efd</hello><moto>poi</moto></url></data>";
   var re = new RegExp("<url>(?:(?!<moto>)[\\s\\S])*</url>", "gmi")
   str = str.replace(re, "")

   // "<data><url><hello>abc</hello><moto>abc</moto></url><url><hello>efd</hello><moto>poi</moto></url></data>"
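If the goal is to keep the matching elements rather than delete the others, a related sketch is to match every <url> block lazily and filter on the presence of an <article: tag. This is a variation, not the approach shown above. The inline XML is a shortened stand-in for the question's sitemap, and like any regex over XML it assumes <url> elements are never nested and never appear inside comments or CDATA.

```javascript
// Inline sample modelled on the question's sitemap (shortened for illustration)
const xml = [
  '<urlset>',
  '  <url><loc>http://www.example.com</loc></url>',
  '  <url><loc>http://www.example.com/about</loc></url>',
  '  <url><loc>http://www.example.com/articles/tips-and-tricks</loc>',
  '    <article:title>Tips and Tricks</article:title></url>',
  '</urlset>'
].join('\n');

// Lazily match each <url>...</url> block ([\s\S] also matches newlines)
const urlBlocks = xml.match(/<url>[\s\S]*?<\/url>/g) || [];

// Keep only the blocks that contain an <article:...> child
const articleBlocks = urlBlocks.filter(block => block.includes('<article:'));

console.log(urlBlocks.length);     // 3
console.log(articleBlocks.length); // 1
```

Each kept block is still a string, so it can be handed to whichever parser you prefer afterwards.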

Step 1: npm install xml2js --save

const xml2js = require('xml2js');
const fs = require('fs');
const parser = new xml2js.Parser({ attrkey: "ATTR" });

// this example reads the file synchronously
// you can read it asynchronously also
let xml_string = fs.readFileSync("data.xml", "utf8");

parser.parseString(xml_string, function(error, result) {
  if (error === null) {
    console.log(result);
  }
  else {
    console.log(error);
  }
});

For an Express server:

  app.get('/api/rss/', (_request: Request, response: Response) => {
    const rssFile = fs.readFileSync(__dirname + '/rssFeeds/guardian.xml', { encoding: 'utf8' })

    console.log('FILE', rssFile)

    response.set('Content-Type', 'text/xml')
    response.send(rssFile)
  })
  • Take request
  • Read file
  • Set XML header
  • Return file

To read an XML file in Node, I like the XML2JS package. It lets me easily work with the XML in JavaScript.

var parser = new xml2js.Parser();
// fileData is the XML file's contents as a string (e.g. read with fs.readFile and 'utf8')
parser.parseString(fileData, function (err, result) {
  var json = JSON.stringify(result);
});

Coming late to this thread, just to add one simple tip: if you plan to use the parsed data in JS or save it as a JSON file, be sure to set explicitArray to false. The output will be more JS-friendly.

So it will look like:
let parser = new xml2js.Parser({explicitArray: false})

Ref: https://github.com/Leonidas-from-XIV/node-xml2js
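As a rough sketch of what that buys you for the original question: once the sitemap has been parsed, picking out the url entries that carry article: children is a plain array filter. The object below is hand-written to stand in for the parser output with explicitArray set to false (attributes omitted), so treat its exact shape as an assumption. toArray is a hypothetical helper, needed because with this option an element that occurs only once is not wrapped in an array.

```javascript
// Assumed shape of the parsed sitemap with explicitArray: false
// (hand-written stand-in for parser output, attributes omitted)
const result = {
  urlset: {
    url: [
      { loc: 'http://www.example.com', changefreq: 'monthly' },
      { loc: 'http://www.example.com/about', changefreq: 'never' },
      {
        loc: 'http://www.example.com/articles/tips-and-tricks',
        changefreq: 'never',
        'article:title': 'Tips and Tricks'
      }
    ]
  }
};

// Hypothetical helper: a sitemap with a single <url> would give an object,
// not a one-element array, so normalise before filtering.
function toArray(value) {
  if (value === undefined) return [];
  return Array.isArray(value) ? value : [value];
}

// Keep the url entries that have at least one key starting with "article:"
const articleUrls = toArray(result.urlset.url).filter(url =>
  Object.keys(url).some(key => key.startsWith('article:'))
);

console.log(articleUrls.map(url => url.loc));
// [ 'http://www.example.com/articles/tips-and-tricks' ]
```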
