
Store HTML data from multiple URLs using the Fetch API as key-value pairs in a JavaScript dictionary object

I am trying to fetch HTML data from multiple URLs using the Fetch API inside a for() loop. Initially my pos_links_info looks like this:

link 0 : contains URL of first webpage
link 1 : contains URL of second webpage...//and so on

Here is my code:

var sl_no = 0;
for(const [key,value] of Object.entries(pos_links_info)){

   //Here value is the URL of that website

   fetch('https://api.codetabs.com/v1/proxy?quest='+value)
      .then(
          function (response) {
             if (response.status !== 200) {
                 console.log('Unable to fetch data from url link: ' +response.status);
                 return;
             }
             response.text().then(function(data) {
                var date = new Date();
                var parser = new DOMParser();
                var htmldoc = parser.parseFromString(data, "text/html");
                var title;
                if(htmldoc.getElementsByTagName("h1")[0] != null)
                   title = htmldoc.getElementsByTagName("h1")[0].innerText;
                else
                   title = htmldoc.getElementsByTagName("title")[0].innerText;
                if(title != ""){
                   pos_links_info["title"+" "+sl_no]=title.trim();
                }
                if(htmldoc.getElementsByTagName("img") != null){
                    for(var l=0; l<htmldoc.getElementsByTagName("img").length; l++){ 
                       if(htmldoc.getElementsByTagName("img")[l].src.includes(value.split("https://")[1].split("/")[0]) && 
                           htmldoc.getElementsByTagName("img")[l].src.includes(date.getFullYear().toString())){
                              pos_links_info["image"+" "+sl_no] = htmldoc.getElementsByTagName("img")[l].src.trim();
                              break;
                       }
                    }
                }
                if(htmldoc.getElementsByTagName("p") != null){
                    pos_count_body = htmldoc.getElementsByTagName("p").length-1;
                    for(var l=0; l<htmldoc.getElementsByTagName("p").length-1; l++){
                       pos_links_info["body"+" "+sl_no+" "+l] =  htmldoc.getElementsByTagName("p")[l].innerText.trim();
                    }
                }
           })
        }
     )
     .catch(function(err) {
         console.log('Fetch Error :-S', err);
     });

     sl_no += 1;  
 }

The pos_links_info object should contain all the HTML data fetched from the URLs as key-value pairs. This is the expected format:

/*----------DATA OF THE FIRST WEBPAGE-----------------------------------------*/
link 0 : contains the URL of the first webpage
title 0 : contains the title of first webpage
image 0 : contains an image present in first webpage
body 0 0 : contains first <p> of first webpage
body 0 1 : contains second <p> of first webpage
body 0 2 : contains third <p> of first webpage....// and so on for all the <p>
/*----------------------------------------------------------------------------*/

/*----------DATA OF THE SECOND WEBPAGE----------------------------------------*/
link 1 : contains the URL of the second webpage
title 1 : contains the title of second webpage
image 1 : contains an image present in second webpage
body 1 0 : contains first <p> of second webpage
body 1 1 : contains second <p> of second webpage
body 1 2 : contains third <p> of second webpage....// and so on for all the <p>
/*----------------------------------------------------------------------------*/

//And the format should be like this for all the web URLs

But my output looks like this:

link 0 : contains URL of the first webpage //OK
link 1 : contains URL of the second webpage //OK
title 0 : contains title of the second webpage //it should be of the first webpage
image 0 : contains an image present in first webpage
body 0 0 : contains first <p> of second webpage //it should be of the first webpage
body 0 1 : contains second <p> of second webpage //it should be of the first webpage

//and it continues like this

As you can see, the fetched data is stored under the wrong keys. How can I fix this?

Use Promise.all() to send the requests for the multiple URLs.
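In the question's loop, `sl_no` is a single shared variable that is incremented synchronously, while the `.then` callbacks run later, so a callback does not see the value that belonged to its own iteration. A minimal sketch of one way around this follows, assuming each page's position comes from the `index` argument of `Array.prototype.map` and that `Promise.all` is used to wait for every request (it resolves with results in input order, whatever order the responses arrive in). The helper names `collectPages`, `toDictionary`, `fetchText` and `extract` are hypothetical, not part of the original code. The image lookup is left out for brevity, and unlike the original loop this sketch keeps every `<p>`, including the last one.

```javascript
// Hypothetical helper: request every URL in parallel and resolve with one
// result object per URL, in the same order as the input array.
async function collectPages(urls, fetchText, extract) {
  return Promise.all(
    urls.map(async (url, index) => {
      try {
        const html = await fetchText(url);
        // `index` belongs to this iteration only, unlike the shared sl_no.
        return { index, url, ...extract(html, url) };
      } catch (err) {
        // A failed page is recorded instead of rejecting the whole batch.
        return { index, url, error: String(err) };
      }
    })
  );
}

// Hypothetical helper: flatten the results into the "title 0", "body 0 1"
// key layout described in the question.
function toDictionary(pages) {
  const info = {};
  for (const page of pages) {
    info[`link ${page.index}`] = page.url;
    if (page.error) continue;
    if (page.title) info[`title ${page.index}`] = page.title;
    page.paragraphs.forEach((text, p) => {
      info[`body ${page.index} ${p}`] = text;
    });
  }
  return info;
}

// Browser-only: fetch through the proxy used in the question.
async function fetchText(url) {
  const response = await fetch('https://api.codetabs.com/v1/proxy?quest=' + url);
  if (!response.ok) throw new Error('HTTP ' + response.status);
  return response.text();
}

// Browser-only: pull the title and paragraph texts out of the HTML.
function extract(html) {
  const doc = new DOMParser().parseFromString(html, 'text/html');
  const heading = doc.querySelector('h1') || doc.querySelector('title');
  return {
    title: heading ? heading.textContent.trim() : '',
    paragraphs: Array.from(doc.querySelectorAll('p'), (p) => p.textContent.trim()),
  };
}

// Usage in the browser, assuming pos_links_info initially holds only the links:
//   const urls = Object.values(pos_links_info);
//   collectPages(urls, fetchText, extract)
//     .then((pages) => Object.assign(pos_links_info, toDictionary(pages)));
```

Passing `fetchText` and `extract` in as arguments is a design choice made for this sketch: it keeps the ordering logic independent of the network and the DOM, so it can be exercised with stand-in functions.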
