A lot of requests at the same moment
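I want to scrape a lot of pages (350/1500). I tried to build this with cheerio plus a second function that loops over all the links (350/1500) to make the requests and scrape each page, but I only get the "current div" (body) for the first five; the rest come back empty or 0. How can I write a function that waits for each page to load and be ready before downloading and extracting the items?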
This answer is very high level because I don't know your use case. The idea is to fetch and parse the pages one at a time, awaiting each step before moving on to the next URL:
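const urls = []; // populate it with page urls

async function start() {
  try {
    for (const url of urls) {
      // wait for each page to download before parsing it
      const html = await getPageHtml(url);
      const scrappedData = await getScrappedData(html);
      // use scrappedData here (store it, log it, ...)
    }
  } catch (err) {
    // the first failure ends the loop
    console.log(err);
  }
}

async function getPageHtml(url) {
  // use request library promised version for fetching data and await for the response
}

async function getScrappedData(html) {
  return new Promise((resolve, reject) => {
    try {
      // extract the items from html here (for example with cheerio)
      const data = {}; // placeholder for the extracted items
      // call resolve with data like this
      resolve(data);
    } catch (err) {
      // if faced any error then call reject like this
      reject(err);
    }
  });
}

start();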
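Fetching 350 to 1500 pages strictly one after another can be slow, while firing them all at once is a plausible cause of the empty results described in the question. A middle ground is to cap how many requests are in flight. The following is only a rough sketch of that idea, not part of the original answer: `mapWithLimit` is a hypothetical helper name, and the limit value is something you would need to tune for the target site.

```javascript
// Hypothetical helper: run `worker` over `items` with at most `limit`
// calls in flight at once. Results keep the same order as `items`.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to hand out

  // Each runner keeps claiming the next unclaimed index until none are left.
  async function runner() {
    while (next < items.length) {
      const i = next++;
      try {
        results[i] = { ok: true, value: await worker(items[i], i) };
      } catch (err) {
        // Record the failure instead of aborting the whole batch.
        results[i] = { ok: false, error: err };
      }
    }
  }

  // Start `limit` runners (at least one, at most one per item).
  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, runner));
  return results;
}
```

With the functions from the answer, it could be called inside an async function as `const results = await mapWithLimit(urls, 5, async (url) => getScrappedData(await getPageHtml(url)));`. Each entry is either `{ ok: true, value }` or `{ ok: false, error }`, so one failed page does not discard the others.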