Using an async function inside a loop
I am using a JavaScript library called puppeteer, and I have an async function that searches for all the iframes inside a page (among other things), like this:
function check_page(web_page){
  (async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto(web_page);
    /* my code */
    await browser.close();
  })();
}
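I have another function that reads a CSV file containing a list of the most popular websites. For each site, I have to call the previous function with the site's string as the argument: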
function readCSV(csv){
  var lines=csv.split("\n");
  var result = [];
  var headers=lines[0].split(",");
  for(var i=0;i<lines.length;i++){
    //console.log("lines: "+lines[i])
    var obj = {};
    var currentline=lines[i].split(",");
    console.log("currentline: "+currentline[1])
    check_page("https://www."+currentline[1]); // pass the site to the function like: https://www.itsname...
  }
}
But this doesn't work. It sometimes works for the last website in the list, but usually it fails with the following error:
UnhandledPromiseRejectionWarning: Error: Protocol not supported.
at exports.XMLHttpRequest.send (/Users/francesco/node_modules/xmlhttprequest/lib/XMLHttpRequest.js:299:15)
at processTicksAndRejections (internal/process/task_queues.js:97:5)
(node:1228) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
(node:1228) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
My file.js has the following structure:
const puppeteer = require('puppeteer');
const fs = require('fs')
const fileContents = fs.readFileSync('./popular_website.csv').toString()
function readCSV(csv){
// previous code
}
function check_page(web_page){
// previous code
}
readCSV(fileContents)
Edit: I changed my function, but it still doesn't work; it only runs for the last website. Here is the whole function:
async function check_page(web_page){
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto(web_page)
  /* I search for every iframe tag inside the web page, then send a request for each one to read the CSP and X-Frame-Options from the header */
  for (const frame of page.mainFrame().childFrames()){
    if(frame.url().toString() == "about:blank"){
      console.log("blank")
    }
    else{
      /* For every iframe I send an HTTP request to retrieve the policies from the HTTP header */
      var XMLHttpRequest = require("xmlhttprequest").XMLHttpRequest;
      var req = new XMLHttpRequest();
      req.open('GET', frame.url(), false);
      //req.send(null)
      var headers = req.getAllResponseHeaders().toLowerCase();
      var arr = headers.trim().split(/[\r\n]+/);
      // Create a map of header names to values
      var headerMap = {};
      arr.forEach(function (line) {
        var parts = line.split(': ');
        var header = parts.shift();
        var value = parts.join(': ');
        headerMap[header] = value;
      });
      console.log("policy of:"+frame.url());
      console.log("CSP: "+headerMap["content-security-policy"]);
      console.log("x-frame-options: "+headerMap["x-frame-options"]);
      console.log("-----------------------------------------------------------------")
    }
  }
  await browser.close();
}
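Calling promises inside a for loop is not always a good idea.
Because nothing waits for those promises, you have no control over the timing of your application, and you expose yourself to strange side effects.
Try grouping your calls in Promise.all():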
async function check_page(web_page){
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto(web_page);
  /* my code */
  await browser.close();
}
function readCSV(csv){
  var lines=csv.split("\n");
  var result = [];
  var headers=lines[0].split(",");
  // Start one check_page call per line and wait for all of them
  return Promise.all(
    lines.map(line => {
      var obj = {};
      var currentline = line.split(",");
      console.log("currentline: "+currentline[1])
      return check_page("https://www."+currentline[1])
    })
  ).then(() => console.log('It worked'))
   .catch(err => console.error(err)); // catch any error in Promise.all
}
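One thing to keep in mind with Promise.all() is that it starts every call at once (here, one browser launch per CSV line) and rejects as soon as any single call fails. If that turns out to be too heavy for a long list, a different technique is to await the calls one at a time in a for...of loop. The following is only a minimal sketch of that alternative, not the approach shown above: `checkAllSequentially` and `fakeCheckPage` are hypothetical names, and `fakeCheckPage` is a stand-in that launches no browser so the example runs on its own.

```javascript
// Hypothetical stand-in for check_page: resolves after a short delay,
// or rejects when the URL contains "bad". No browser is launched here.
function fakeCheckPage(url) {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      if (url.includes('bad')) {
        reject(new Error('failed: ' + url));
      } else {
        resolve('checked: ' + url);
      }
    }, 10);
  });
}

// Visit the sites one at a time. Each call is awaited before the next
// one starts, and a failure is recorded instead of stopping the loop.
async function checkAllSequentially(urls, checkPage) {
  const results = [];
  for (const url of urls) {
    try {
      const value = await checkPage(url);
      results.push({ url, ok: true, value });
    } catch (err) {
      results.push({ url, ok: false, error: err.message });
    }
  }
  return results;
}
```

With the real function, the call would look like `checkAllSequentially(urls, check_page)`, where `urls` is the list built from the CSV lines. This trades speed for predictability: only one browser is open at a time, and one bad site does not hide the results of the others.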
Also check your check_page function. There seems to be a protocol error that prevents your promise from resolving.
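The stack trace in the question shows "Protocol not supported" being thrown from XMLHttpRequest.send. One plausible cause, which is an assumption and not something the question confirms, is a frame whose URL is not http or https (for example a data: or about: URL other than "about:blank"). A rough sketch of a guard for that case follows; `isHttpUrl` and `fetchableFrameUrls` are hypothetical helper names, and only the built-in `URL` class is used.

```javascript
// Returns true only for absolute URLs with an http: or https: scheme.
function isHttpUrl(rawUrl) {
  try {
    const { protocol } = new URL(rawUrl);
    return protocol === 'http:' || protocol === 'https:';
  } catch (err) {
    // new URL() throws on strings that are not parseable absolute URLs
    return false;
  }
}

// Keeps only the frame URLs that a plain HTTP request can fetch.
function fetchableFrameUrls(frameUrls) {
  return frameUrls.filter(isHttpUrl);
}
```

Inside check_page, this could replace the "about:blank" comparison, for instance by skipping a frame when `isHttpUrl(frame.url())` is false before opening the request.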