Parsing an HTML document using JavaScript (Node)

I'm attempting to parse an HTML document, but I have no idea where to start.

Let's say I have <div><p>Hello world</p></div>

Is there a way to parse this so that I get something like the following?

{ name: div,
  children: p
}

This shouldn't be hard to find through Google. Here's the link: https://www.npmjs.com/package/html-to-json

htmlToJson.parse(html, filter, [callback]) -> promise

The parse() method takes a string of HTML and a filter, and responds with the filtered data. It supports both callbacks and promises.

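var htmlToJson = require('html-to-json');

// Each key in the filter maps to a function that receives the parsed document.
var promise = htmlToJson.parse('<div>content</div>', {
  'text': function ($doc) {
    return $doc.find('div').text();
  }
}, function (err, result) {
  // Callback style
  console.log(result);
});

// Promise style
promise.done(function (result) {
  // Works as well
});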

Use Cheerio to parse the HTML content into JSON. Try this link: Help Link


 