
Playwright & Node.js - Read CSV and push data to an array

I am using the Playwright library for web scraping, and the URLs are stored in a CSV file. I am trying to read the CSV file and collect the URLs into an array to use in the scraping code.

Here is the code I wrote.

// Dependencies
const csv = require('csv-parser');
const fs = require('fs');

// Array to store the URL.
var urls = [];
// This prints an empty array.
console.log(urls);

fs.createReadStream('sample.csv')
  .pipe(csv())
  .on('data', (row) => {
    // Trying to push the URL into the array
    urls.push(row);

    // This prints the values of URLs
    console.log(urls);
  })
  .on('end', () => {
    console.log('CSV file successfully processed');
  });
// Here I don't see the URLs but an empty array.
console.log("URLS:" + urls);  

Inside the .on('data') handler the value gets pushed to the array and the console prints it. However, when I read the array after the stream setup (the last console.log), it is empty.
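The empty array is a timing issue rather than a failed push: the stream is read asynchronously, so the final console.log runs before any 'data' event has fired. One way around it is to wait until the whole file has been read before touching the array. Below is a minimal sketch, not the asker's code: it swaps csv-parser for Node's built-in readline module so it has no dependencies, and it assumes one URL per line with no header row. The function name readUrls is hypothetical.

```javascript
const fs = require('fs');
const readline = require('readline');

// Hypothetical helper: reads one URL per line and resolves only after
// the whole file has been consumed. Assumes no header row.
async function readUrls(csvPath) {
  const urls = [];
  const rl = readline.createInterface({
    input: fs.createReadStream(csvPath),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });

  // for await pulls lines as the stream delivers them
  for await (const line of rl) {
    const url = line.trim();
    if (url) urls.push(url); // skip blank lines
  }
  return urls;
}
```

A caller would then use the array only after the promise resolves, for example readUrls('sample.csv').then((urls) => console.log(urls)). With the original csv-parser code, the equivalent change would be to move whatever uses urls into the 'end' handler.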

Hey there, I hope this helps. Note that this assumes the links are the only thing in your CSV file.

const fs = require("fs");
const { test } = require("@playwright/test");

// Read the CSV file synchronously and split it into one link per line
const links = fs.readFileSync('Path/to/csv')
    .toString() // convert Buffer to string
    .split('\n') // split string into lines
    .map(e => e.trim()) // remove surrounding whitespace from each line
    .filter(e => e.length > 0); // skip blank lines, e.g. after a trailing newline

// Loop through the links and register one test per link
for (const link of links) {
    // Normal Playwright test setup; the link is part of the title to avoid a duplicate test name error
    test('test for ' + link, async ({ page }) => {
        // Print the current link to the console
        console.log(link);
        // Go to that link
        await page.goto(link);

        // Do whatever else you need

    });
}
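If the CSV has a header row and more than one column, splitting on newlines alone would treat each whole row as a link. A rough sketch of pulling out a single column is shown below. The helper name extractColumn and the column names name and url are made up for illustration, and the naive comma split assumes no quoted fields containing commas or newlines; a real CSV parser is the safer choice for anything messier.

```javascript
// Hypothetical helper: pulls one named column out of simple CSV text.
// Assumes a header row and no quoted fields containing commas or newlines.
function extractColumn(csvText, columnName) {
  const lines = csvText
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  if (lines.length === 0) return [];

  // Locate the requested column in the header row
  const header = lines[0].split(',').map((name) => name.trim());
  const index = header.indexOf(columnName);
  if (index === -1) {
    throw new Error(`Column "${columnName}" not found in header`);
  }

  // Take that column from every data row, dropping empty cells
  return lines
    .slice(1)
    .map((line) => (line.split(',')[index] || '').trim())
    .filter((value) => value.length > 0);
}

// Example with inline data; 'name' and 'url' are made-up column names.
const sample = 'name,url\nExample,https://example.com\nDocs,https://example.org/docs\n';
console.log(extractColumn(sample, 'url'));
```

The returned array can be looped over in the same way as links in the answer above.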


 