How to scrape followers of an Instagram account with Node.js, cheerio and InstAuto/Puppeteer
I'm trying to write a program that lists the accounts a given user follows and the accounts that follow them. After the Instagram Graph API shut down, this became a hard task. I got to the point where I have the correct div selected, but the JavaScript command somehow doesn't work. The exact same command entered in the browser console returns a nice array, but here I get undefined, no matter which method I use: cheerio with jQuery-style selectors, or vanilla JS with document.querySelectorAll. Can you help me out? Code:
//scrape followers
await page.goto('https://www.instagram.com/fabiawdizlu/followers/');
await waitFor(5000);
const html2 = await page.content();
await waitFor(5000);
const $2 = cheerio.load(html2);
const followersList2 = $2('._aacl._aaco._aacw._adda._aacx._aad7._aade').eq(0).text();
console.log(followersList2);
// Read the names inside the page. DOM elements cannot be passed back to Node,
// so return plain strings. Each class name in the selector needs a leading dot.
const follow3 = await page.evaluate(() => {
  const nodes = document.querySelectorAll('._aacl._aaco._aacw._adda._aacx._aad7._aade');
  return Array.from(nodes, (node) => node.innerText);
});
console.log(follow3);
The above is one of many approaches I tried. The for loop doesn't work, jQuery/cheerio's eq(i) doesn't work (it returns the user at a particular index, but doesn't give me the array I want), and page.evaluate doesn't work. Maybe I'm doing something wrong; this is my second Node project.
Thanks for your time, cheers, Maciej
With cheerio you can't execute JavaScript. I think you should use Playwright, which executes the page's JavaScript and loads the data dynamically.
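As a rough illustration of that suggestion, here is a minimal sketch, not a tested Instagram scraper. It assumes Playwright is installed, that the browser session is already allowed to see the followers list, and that the class names from the question still match (they are generated and may change). The names classListToSelector and scrapeTexts are hypothetical helpers made up for this example.

```javascript
// Turn a class attribute value such as "a b c" into the CSS selector ".a.b.c".
// A bare "a b c" passed to querySelectorAll would look for <a>, <b>, <c> tags instead.
function classListToSelector(classAttr) {
  return classAttr
    .trim()
    .split(/\s+/)
    .filter(Boolean)
    .map((name) => '.' + name)
    .join('');
}

// Hypothetical helper: open the page in a real browser and return the text of every match.
async function scrapeTexts(url, classAttr) {
  // Required lazily so the selector helper above works without Playwright installed.
  const { chromium } = require('playwright');
  const browser = await chromium.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url);
    const selector = classListToSelector(classAttr);
    // Wait until the client-side script has rendered at least one matching element.
    await page.waitForSelector(selector);
    // The callback runs in the browser; only plain strings travel back to Node.
    return await page.$$eval(selector, (nodes) => nodes.map((n) => n.innerText));
  } finally {
    await browser.close();
  }
}
```

Under those assumptions, calling scrapeTexts('https://www.instagram.com/fabiawdizlu/followers/', '_aacl _aaco _aacw _adda _aacx _aad7 _aade') would resolve to an array of strings rather than a single element or undefined.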