
JavaScript - populate object constructor from multiple arrays

I'm collecting multiple innerText properties from a website that repeats elements on its page (24 university profiles with name, average rating, number of programs and so on).

I tested my little program with one university, using querySelector() to collect the 4-5 innerText values I wanted, brought them together with var u = await Promise.allSettled([arr1, arr2, arr3, arr4, arr5]), and passed them to a constructor I defined at the top: var currUniv = new University(...myArrayOfFacts). So far so good (at least the result...).

Since the page offers 24 university items at once on one page (all in the same structure), I now want to use querySelectorAll() to grab 5 arrays with 24 elements each in one go. If I stick to var u = await Promise.allSettled([arr1, arr2, arr3, arr4, arr5]), I end up with an array of 5 arrays, and I don't know (and can't seem to google successfully) how to feed one element of each array at a time to my constructor.

Should I avoid stuffing everything in one large array in the first place? I do this because I think I need to await all Promises to resolve... Or at what point should I start looping over the arrays?

Everything is async. I shortened the code a bit. As I wrote further up, it worked fine for one set of DOM elements, i.e. for one university.

Many thanks for any tips pointing me in the right direction!

const puppeteer = require('puppeteer');

const startUrl = "https://www.studycheck.de/hochschulen/";

// constructor - shortened
function HSMain(name, ...){
      this.nameHS = name;
      this...
}

const hsfPageVisits = async () => {

  try{
    const browser = await puppeteer.launch({headless: true});
    const page = await browser.newPage();
    await page.goto(startUrl, {waitUntil: 'domcontentloaded'});

    // get first element (name)
    var nameHS = await page.evaluate(() => {
      let name = Array.from(document.querySelectorAll('div .title a')).map(node => node.innerText);
      return name;
    });
    // get second element (rating)
    var rating = await page.evaluate(() => {
      let rate = Array.from(document.querySelectorAll('div .rating-container > div .rating-value')).map(node => node.innerText.trim());
      return rate;
    });
[...more DOM - elements...]

// wait for all promises to resolve
var univArr = await Promise.allSettled([nameHS, rating, ..., ..., ...]);

// spread the array into the object constructor
var myObj = await new HSMain(...univArr);

  await browser.close();
  }
  catch(e){
    console.log("error", e);
  }
};
hsfPageVisits();

So what you have is an array of names, another of ratings, and so on. Each index in those arrays corresponds to the same university, so just map over one of the arrays and use the index that map provides to pick the matching value from each of the others. Unfortunately, you can't use the spread syntax here:

let universities = nameHS.map((name, i) =>
    new University(name, rating[i], theNextArray[i], theArrayAfterThat[i], ...)
);
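
To make the index-based approach concrete, here is a minimal sketch that runs without a browser. The University constructor, its three fields, and the sample values are assumptions for illustration only; the real constructor in the question takes more arguments.

```javascript
// Hypothetical constructor: the field names are assumptions, not the asker's real ones.
function University(name, rating, programs) {
  this.name = name;
  this.rating = rating;
  this.programs = programs;
}

// Stand-ins for the arrays returned by the page.evaluate() calls; values are invented.
const nameHS = ['Uni A', 'Uni B', 'Uni C'];
const rating = ['4.1', '3.8', '4.5'];
const programs = ['120', '85', '60'];

// Index i refers to the same university in every array.
const universities = nameHS.map(
  (name, i) => new University(name, rating[i], programs[i])
);

console.log(universities[1].name, universities[1].rating);
```

Because every page.evaluate() call in the question is already awaited, nameHS, rating and the rest are plain arrays at this point, so no extra Promise.allSettled step is needed before mapping. The approach relies on all arrays having the same length and order: if one profile on the page lacks, say, a rating, the ratings shift against the names.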

My approach would be to get all the "facts" for each university at once instead of collecting them in separate arrays. Each university then has its facts grouped in an object or array, which shortens the code drastically:

try {
    const browser = await puppeteer.launch({ headless: true });
    const page = await browser.newPage();
    await page.goto(startUrl, { waitUntil: 'domcontentloaded' });

    var universitiesFacts = await page.evaluate(() => {
        // First get all universities (each one's info is inside an element with the class 'institute-item')
        let universities = Array.from(document.querySelectorAll(".institute-item"));

        return universities.map(university => [                              // for each .institute-item element
            university.querySelector(".title a").textContent.trim(),         // get the name (querySelector scoped to this element)
            university.querySelector(".rating-value").textContent.trim(),    // get the rating
            // ... the rest of the facts for the current university
        ]);
    });

    let universities = universitiesFacts.map(facts => new University(...facts));     // now we can use the spread syntax
  
    await browser.close();
} catch (e) {
    console.log("error", e);
}
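
The final mapping step can be tried on its own. In this sketch, made-up rows stand in for what page.evaluate() would return; the University constructor and its fields are again hypothetical. The only real requirement is that the order of values in each row matches the constructor's parameter order.

```javascript
// Hypothetical constructor: the field names are assumptions for illustration.
function University(name, rating, programs) {
  this.name = name;
  this.rating = rating;
  this.programs = programs;
}

// Invented sample rows, one inner array per university,
// shaped like the result of the page.evaluate() call above.
const universitiesFacts = [
  ['Uni A', '4.1', '120'],
  ['Uni B', '3.8', '85'],
];

// Each row is spread into the constructor's parameters in order.
const universities = universitiesFacts.map(facts => new University(...facts));

console.log(universities[0].name, universities[0].programs);
```

Since each row is read from a single .institute-item element, a missing value can no longer shift the data of one university onto another. If an element might be absent, querySelector returns null and reading .textContent throws, so a guard such as university.querySelector(".rating-value")?.textContent.trim() ?? null may be worth adding.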


 