
Load specific HTML using cheerio in Node.js?

I need to get the URLs of all <a> tags on a given webpage, but I need to skip any <a> tags inside the <header> and <footer> tags. I am trying to load the body HTML without the header tag. Here is my code, but it doesn't work.

var $ = cheerio.load(html);
$ = cheerio.load($('body').not('header'));

var links = $("a']");
links.each(function() {
    console.log($(this).attr('href'));
});

If the above code is wrong, please suggest how to do this.

Cheerio works just like jQuery.

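var $ = cheerio.load(html);
// .not('header') would only filter the matched set (the body itself), so it
// cannot exclude links inside <header>; filter the links by ancestor instead.
var links = $('body a').filter(function() {
    return $(this).closest('header, footer').length === 0;
});

links.each(function() {
    // Cheerio nodes are not browser DOM elements, so read href via .attr().
    console.log($(this).attr('href'));
});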

I think the error occurred because your second load was not given HTML: you passed the Cheerio object for the body rather than an HTML string. You should be able to do it this way:

var $ = cheerio.load(html);
$ = cheerio.load($('body').html());

$('header').remove();

console.log($.html());

I did it like this and now it works fine. Can anyone tell me whether this is the right way to do it?

var $ = cheerio.load(body);
var t = $('body');
t.children('header').remove();
t.children('footer').remove();
var t = $.html(t);
var $ = cheerio.load(t);
var links = $("a");
links.each(function() {
    console.log($(this).attr('href'));
});

Thanks,
