Fetch HTML from multiple URLs with the Fetch API and store it as key-value pairs in a JavaScript object

I am trying to fetch the HTML of multiple URLs inside a for() loop using the Fetch API. Initially, my pos_links_info object looks like this:

link 0 : contains URL of first webpage
link 1 : contains URL of second webpage...//and so on

This is my code:

var sl_no = 0;
for(const [key,value] of Object.entries(pos_links_info)){

   //Here value is the URL of that website

   fetch('https://api.codetabs.com/v1/proxy?quest='+value)
      .then(
          function (response) {
             if (response.status !== 200) {
                 console.log('Unable to fetch data from url link: ' +response.status);
                 return;
             }
             response.text().then(function(data) {
                var date = new Date();
                var parser = new DOMParser();
                var htmldoc = parser.parseFromString(data, "text/html");
                var title;
                if(htmldoc.getElementsByTagName("h1")[0] != null)
                   title = htmldoc.getElementsByTagName("h1")[0].innerText;
                else
                   title = htmldoc.getElementsByTagName("title")[0].innerText;
                if(title != ""){
                   pos_links_info["title"+" "+sl_no]=title.trim();
                }
                if(htmldoc.getElementsByTagName("img") != null){
                    for(var l=0; l<htmldoc.getElementsByTagName("img").length; l++){ 
                       if(htmldoc.getElementsByTagName("img")[l].src.includes(value.split("https://")[1].split("/")[0]) && 
                           htmldoc.getElementsByTagName("img")[l].src.includes(date.getFullYear().toString())){
                              pos_links_info["image"+" "+sl_no] = htmldoc.getElementsByTagName("img")[l].src.trim();
                              break;
                       }
                    }
                }
                if(htmldoc.getElementsByTagName("p") != null){
                    pos_count_body = htmldoc.getElementsByTagName("p").length-1;
                    for(var l=0; l<htmldoc.getElementsByTagName("p").length-1; l++){
                       pos_links_info["body"+" "+sl_no+" "+l] =  htmldoc.getElementsByTagName("p")[l].innerText.trim();
                    }
                }
           })
        }
     )
     .catch(function(err) {
         console.log('Fetch Error :-S', err);
     });

     sl_no += 1;  
 }

The pos_links_info object should contain all the HTML data fetched from the URLs in key, value pairs. This should be the format:

/*----------DATA OF THE FIRST WEBPAGE-----------------------------------------*/
link 0 : contains the URL of the first webpage
title 0 : contains the title of first webpage
image 0 : contains an image present in first webpage
body 0 0 : contains first <p> of first webpage
body 0 1 : contains second <p> of first webpage
body 0 2 : contains third <p> of first webpage....// and so on for all the <p>
/*----------------------------------------------------------------------------*/

/*----------DATA OF THE SECOND WEBPAGE----------------------------------------*/
link 1 : contains the URL of the second webpage
title 1 : contains the title of second webpage
image 1 : contains an image present in second webpage
body 1 0 : contains first <p> of second webpage
body 1 1 : contains second <p> of second webpage
body 1 2 : contains third <p> of second webpage....// and so on for all the <p>
/*----------------------------------------------------------------------------*/

//And the format should be like this for all the web URLs

But my output is like this:

link 0 : contains URL of the first webpage //OK
link 1 : contains URL of the second webpage //OK
title 0 : contains title of the second webpage //it should be of the first webpage
image 0 : contains an image present in first webpage
body 0 0 : contains first <p> of second webpage //it should be of the first webpage
body 0 1 : contains second <p> of second webpage //it should be of the first webpage

//and it continues like this

As you can see, the fetched data is stored under the wrong keys. How do I fix this?

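Use Promise.all() to send the requests for all of the URLs, and write to pos_links_info only after every response has arrived. In the code above, each callback reads the shared sl_no counter when its response arrives, which is after the loop has already finished incrementing it, so the keys no longer match the URL that was fetched. Promise.all() resolves with the results in the same order as the input array, so each result can be matched back to its link.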
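A minimal sketch of that approach, not a drop-in replacement. The names fetchHtml, extractPage and fillLinksInfo are hypothetical helpers introduced here. It assumes the link keys have the form "link N" as in the question, reuses the same proxy URL, and leaves out the image lookup for brevity. Unlike the original loop, it keeps every <p>, including the last one.

```javascript
// Assumption: the same CORS proxy that the question uses.
const PROXY = 'https://api.codetabs.com/v1/proxy?quest=';

// Fetch one page and resolve with its HTML text; rejects on a non-2xx status.
async function fetchHtml(url) {
  const response = await fetch(PROXY + url);
  if (!response.ok) {
    throw new Error('Unable to fetch ' + url + ': ' + response.status);
  }
  return response.text();
}

// Extract the title and paragraph texts from an HTML string.
// Browser-only: relies on DOMParser.
function extractPage(html) {
  const doc = new DOMParser().parseFromString(html, 'text/html');
  const heading = doc.querySelector('h1') || doc.querySelector('title');
  const title = heading ? heading.textContent.trim() : '';
  const paragraphs = Array.from(doc.querySelectorAll('p'), p => p.textContent.trim());
  return { title, paragraphs };
}

// Fill posLinksInfo with "title N" and "body N M" keys for every "link N" key.
// fetchPage and extract are parameters so they can be swapped out, e.g. in tests.
async function fillLinksInfo(posLinksInfo, fetchPage = fetchHtml, extract = extractPage) {
  // Snapshot the links first, because new keys are added to the same object below.
  const links = Object.entries(posLinksInfo)
    .filter(([key]) => key.startsWith('link '))
    .map(([key, url]) => ({ id: key.slice('link '.length), url }));

  // Start all requests at once. Promise.all keeps the results in input order,
  // no matter which response arrives first.
  const pages = await Promise.all(links.map(async ({ url }) => {
    try {
      return extract(await fetchPage(url));
    } catch (err) {
      console.log('Fetch Error :-S', err);
      return null; // a failed page is skipped instead of rejecting the whole batch
    }
  }));

  // Every result is paired with the id taken from its own "link N" key.
  pages.forEach((page, i) => {
    if (!page) return;
    const id = links[i].id;
    if (page.title !== '') posLinksInfo['title ' + id] = page.title;
    page.paragraphs.forEach((text, l) => {
      posLinksInfo['body ' + id + ' ' + l] = text;
    });
  });
  return posLinksInfo;
}
```

In the browser this could be called as fillLinksInfo(pos_links_info).then(info => console.log(info)). A smaller change to the original loop would be to copy sl_no into a const at the top of each iteration and use that copy inside the callbacks, but collecting the results with Promise.all() also tells you when all pages are done.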
