How to loop through multiple elements based on class in Playwright JS test?

I am trying to write a Playwright JS test to scrape some values from a website.

Here is the HTML of the page I am trying to scrape:

<div class="pl-it">
   <div class="i_d">
      <dl>
         <dt class>Series:
         <dd>
            <a href=".....">Province A</a>
         </dd>
         </dt>
         <dt class>Catalog Codes:
         <dd>
            <strong>Mi</strong>
            <strong>CA 1x,</strong>
            "ca 17"
         </dd>
         </dt>
         <dt class>Variants:
         <dd><strong><a>Click to see variants</a></strong></dd>
         </dt>
      </dl>
   </div>
   <div class="i_d">
      <dl>
         <dt class>Series:
         <dd>
            <a href=".....">Province B</a>
         </dd>
         </dt>
         <dt class>Catalog Codes:
         <dd>
            <strong>Fu</strong>
            <strong>DE 2x,</strong>
            "pa 21"
         </dd>
         </dt>
         <dt class>Variants:
         <dd><strong><a>Click to see variants</a></strong></dd>
         </dt>
      </dl>
   </div>
</div>

As you can see, there are multiple divs that have class i_d, and each of them contains a dl tag.

Inside each dl tag, there are several pairs of dt & dd tags.

Basically, what I am trying to do is log each dt value & each corresponding dd value to the console.

The final outcome should look something like this in the logs:

Series: Province A
Catalog Codes: Mi CA 1x, ca17
Variants: Click to see variants

Series: Province B
Catalog Codes: Fu DE 2x, pa21
Variants: Click to see variants

Below is my current output:

[
  {
    label: 'Series:',
    name: 'Province A'
  },
  {
    label: 'Series:',
    name: 'Province B'
  },
]

As you can see, it is only printing out the first dt & dd values, not the remaining ones (i.e. Catalog Codes, etc.).

Here is my current Playwright JS code:

const { test, expect } = require('@playwright/test');

test('homepage has Playwright in title and get started link linking to the intro page', async ({ page }) => {
  await page.goto('https://colnect.com/en/stamps/list/country/38-Canada');

  await expect(page.locator('div#pageContent h1')).toContainText('Stamp catalog › Canada › Stamps')


  const books = await page.$$eval('div.i_d', all_items => {
    const data =[];
    all_items.forEach(book => {
      const label = book.querySelector('dt')?.innerText;
      const name = book.querySelector('dd')?.innerText;
      data.push({ label, name});
    })
    return data;
  });  
  console.log(books);
});

Can someone please tell me how I can access each dt & dd rather than just the first one in each group?

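The DOM is a little complicated, so I had to use nested loops to get the format you are looking for. Your code only returns the first pair because querySelector returns just the first matching element; querySelectorAll('dt') returns every dt inside a div, and each dt's next sibling is its dd.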

const { test, expect } = require('@playwright/test');

test.describe('Scrape', () => {
  test('Stamps', async ({ page }) => {
    await page.goto('https://colnect.com/en/stamps/list/country/38-Canada');
    await page.waitForLoadState('networkidle');
    await expect(page.locator('div#pageContent h1')).toContainText('Stamp catalog › Canada › Stamps');

    // Runs in the browser: build one object per div.i_d, keyed by dt label.
    const scrapedStampData = await page.$$eval('div.i_d', (stamps) =>
      stamps.map((stamp) => {
        const stampObject = {};
        // Inner loop: pair every dt with the dd that follows it.
        stamp.querySelectorAll('dt').forEach((row) => {
          const rowLabel = row.innerText;
          const rowValue = row.nextElementSibling?.innerText ?? '';
          stampObject[rowLabel] = rowValue;
        });
        return stampObject;
      })
    );

    // Print each stamp's label/value pairs under a numbered banner.
    scrapedStampData.forEach((stampData, ind) => {
      console.log(`\n**************Stamp: ${ind + 1}*****************\n`);
      for (const [key, value] of Object.entries(stampData)) {
        console.log(`${key} ${value}`);
      }
    });
  });
});

Output:

**************Stamp: 1*****************

Series: Province of Canada Pence Issue (imperforate)
Catalog codes: Mi:CA 1x, Sn:CA 8, Yt:CA 4, Sg:CA 17
Variants: Click to see variants
Themes: Crowns and Coronets | Famous People | Heads of State | Queens | Royalty | Women
Issued on: 1857-08-01
Colors: Rose
Printers: Rawdon, Wright, Hatch & Edson
Format: Stamp
Emission: Definitive
Perforation: Imperforate
Printing: Recess
Paper: machine-made medium to thick wove
Face value: ½ d - Canadian penny
Print run: 2,600,000
Score: 95% Accuracy: High
Buy Now: Find similar items on eBay
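
If you would rather get the layout from the question (one "label value" line per pair, with a blank line between stamps and no banner), the array returned by page.$$eval can be formatted in plain Node code. This is only a minimal sketch: it assumes each stamp is an object keyed by its dt text, as in the answer's code, and formatStamps and the sample data are hypothetical names introduced here for illustration, not part of Playwright.

```javascript
// Hypothetical helper: turn scraped stamp objects into the log layout
// requested in the question (blank line between stamps).
function formatStamps(stamps) {
  return stamps
    .map((stamp) =>
      Object.entries(stamp)
        // Collapse runs of whitespace, since innerText may contain line breaks.
        .map(([label, value]) => `${label} ${value}`.replace(/\s+/g, ' ').trim())
        .join('\n')
    )
    .join('\n\n');
}

// Sample data shaped like the question's HTML (assumed, not scraped).
const sample = [
  { 'Series:': 'Province A', 'Catalog Codes:': 'Mi CA 1x, ca 17' },
  { 'Series:': 'Province B', 'Catalog Codes:': 'Fu DE 2x, pa 21' },
];

console.log(formatStamps(sample));
```

Inside the test you would pass the result of page.$$eval to formatStamps and log the returned string once, instead of logging inside the loop.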
