
How to loop through multiple elements based on class in Playwright JS test?

I'm trying to write a Playwright JS test that scrapes some values from a website.

This is the HTML of the page I want to scrape:

<div class="pl-it">
   <div class="i_d">
      <dl>
         <dt class>Series:
         <dd>
            <a href=".....">Province A</a>
         </dd>
         </dt>
         <dt class>Catalog Codes:
         <dd>
            <strong>Mi</strong>
            <strong>CA 1x,</strong>
            "ca 17"
         </dd>
         </dt>
         <dt class>Variants:
         <dd><strong><a>Click to see variants</a></strong></dd>
         </dt>
      </dl>
   </div>
   <div class="i_d">
      <dl>
         <dt class>Series:
         <dd>
            <a href=".....">Province B</a>
         </dd>
         </dt>
         <dt class>Catalog Codes:
         <dd>
            <strong>Fu</strong>
            <strong>DE 2x,</strong>
            "pa 21"
         </dd>
         </dt>
         <dt class>Variants:
         <dd><strong><a>Click to see variants</a></strong></dd>
         </dt>
      </dl>
   </div>
</div>

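As you can see, there are multiple divs with the class i_d, each containing a dl tag.

Inside each dl tag there are pairs of dt and dd tags.

Basically, I want to log every dt value and its corresponding dd value to the console.

The end result in the log should look like this: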

Series: Province A
Catalog Codes: Mi CA 1x, ca 17
Variants: Click to see variants

Series: Province B
Catalog Codes: Fu DE 2x, pa 21
Variants: Click to see variants

Below is my current output:

[
  {
    label: 'Series:',
    name: 'Province A'
  },
  {
    label: 'Series:',
    name: 'Province B'
  }
]

As you can see, it only prints the first dt/dd pair of each group and none of the rest (i.e. Catalog Codes, etc.).

This is my current Playwright JS code:

const { test, expect } = require('@playwright/test');

test('homepage has Playwright in title and get started link linking to the intro page', async ({ page }) => {
  await page.goto('https://colnect.com/en/stamps/list/country/38-Canada');

  await expect(page.locator('div#pageContent h1')).toContainText('Stamp catalog › Canada › Stamps')


  const books = await page.$$eval('div.i_d', all_items => {
    const data =[];
    all_items.forEach(book => {
      const label = book.querySelector('dt')?.innerText;
      const name = book.querySelector('dd')?.innerText;
      data.push({ label, name});
    })
    return data;
  });  
  console.log(books);
});

Can someone please tell me how to access every dt/dd pair, not just the first one in each group?

The DOM is a bit complicated, so you need nested loops to get the format you are looking for.
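
The reason a single loop is not enough: `querySelector` returns only the first matching element, so calling it once per `div.i_d` can never yield more than one pair per group. Here is a minimal sketch of the difference, with plain arrays standing in for the DOM (the `groups` data and the helper names `firstPairOnly` and `allPairs` are made up for illustration and are not part of Playwright):

```javascript
// Hypothetical data shaped like the HTML in the question: one inner array per
// div.i_d, holding its [dt text, dd text] pairs. No browser is needed to run this.
const groups = [
  [['Series:', 'Province A'], ['Catalog Codes:', 'Mi CA 1x, ca 17'], ['Variants:', 'Click to see variants']],
  [['Series:', 'Province B'], ['Catalog Codes:', 'Fu DE 2x, pa 21'], ['Variants:', 'Click to see variants']],
];

// Single loop: keeps only the first pair of each group, which mirrors
// calling querySelector('dt') / querySelector('dd') once per group.
function firstPairOnly(groups) {
  return groups.map(([first]) => ({ label: first[0], name: first[1] }));
}

// Nested loops: the outer loop visits each group, the inner loop visits every
// pair in that group, which mirrors querySelectorAll('dt').forEach(...).
function allPairs(groups) {
  const result = [];
  for (const pairs of groups) {
    const record = {};
    for (const [label, value] of pairs) {
      record[label] = value;
    }
    result.push(record);
  }
  return result;
}

console.log(firstPairOnly(groups));
console.log(allPairs(groups));
```

The test that follows applies the same inner loop inside `page.$$eval`, so it runs in the browser against the real elements.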

const { test, expect } = require('@playwright/test');

test.describe('Scrape', () => {
  test('Stamps', async ({ page }) => {
    await page.goto('https://colnect.com/en/stamps/list/country/38-Canada');
    await page.waitForLoadState('networkidle');
    await expect(page.locator('div#pageContent h1')).toContainText('Stamp catalog › Canada › Stamps');

    // Outer loop: build one object per div.i_d (one stamp).
    const scrapedStampData = await page.$$eval('div.i_d', (stamps) => {
      const stampsArray = [];
      stamps.forEach((stamp) => {
        const stampObject = {};
        // Inner loop: every dt in this stamp, paired with the element that follows it (its dd).
        stamp.querySelectorAll('dt').forEach((row) => {
          const rowLabel = row.innerText;
          const rowValue = row.nextElementSibling?.innerText ?? '';
          stampObject[rowLabel] = rowValue;
        });
        stampsArray.push(stampObject);
      });
      return stampsArray;
    });

    // Print each stamp's label/value pairs under a numbered banner.
    scrapedStampData.forEach((stampData, ind) => {
      console.log(`\n**************Stamp: ${ind + 1}*****************\n`);
      for (const key in stampData) {
        console.log(key + ' ' + stampData[key]);
      }
    });
  });
});

Output:

**************Stamp: 1*****************

Series: Province of Canada Pence Issue (imperforate)
Catalog codes: Mi:CA 1x, Sn:CA 8, Yt:CA 4, Sg:CA 17
Variants: Click to see variants
Themes: Crowns and Coronets | Famous People | Heads of State | Queens | Royalty | Women
Issued on: 1857-08-01
Colors: Rose
Printers: Rawdon, Wright, Hatch & Edson
Format: Stamp
Emission: Definitive
Perforation: Imperforate
Printing: Recess
Paper: machine-made medium to thick wove
Face value: ½ d - Canadian penny
Print run: 2,600,000
Score: 95% Accuracy: High
Buy Now: Find similar items on eBay
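
If you want the exact layout from the question (one "label value" line per pair, a blank line between groups, and multi-line dd text collapsed onto one line), the pairing and formatting steps could be pulled out into small helpers. This is only a sketch: `clean`, `pairTerms`, and `formatGroups` are illustrative names, the `firstDl` data is a hand-written stand-in for the children of the first dl, and it reads `textContent` where the test above reads `innerText`. In a browser you would pass `Array.from(dl.children)`, assuming an HTML document where `tagName` is reported in upper case.

```javascript
// Collapse runs of whitespace so multi-line <dd> text prints on one line.
const clean = (text) => (text ?? '').replace(/\s+/g, ' ').trim();

// Walk the children of one <dl> and pair each <dt> with the <dd> right after it.
// `children` is any array of objects exposing `tagName` and `textContent`,
// so plain objects work in Node and real elements work in a browser.
function pairTerms(children) {
  const pairs = [];
  children.forEach((node, i) => {
    if (node.tagName !== 'DT') return;
    const next = children[i + 1];
    // A <dt> without a following <dd> gets an empty value instead of throwing.
    const value = next && next.tagName === 'DD' ? clean(next.textContent) : '';
    pairs.push([clean(node.textContent), value]);
  });
  return pairs;
}

// Render groups as "label value" lines, with a blank line between groups.
function formatGroups(groups) {
  return groups
    .map((pairs) => pairs.map(([label, value]) => `${label} ${value}`.trim()).join('\n'))
    .join('\n\n');
}

// Hypothetical stand-in for the children of the first <dl> in the question.
const firstDl = [
  { tagName: 'DT', textContent: 'Series:' },
  { tagName: 'DD', textContent: '\n   Province A\n' },
  { tagName: 'DT', textContent: 'Catalog Codes:' },
  { tagName: 'DD', textContent: '\n  Mi\n  CA 1x,\n  ca 17\n' },
  { tagName: 'DT', textContent: 'Variants:' },
  { tagName: 'DD', textContent: 'Click to see variants' },
];

console.log(formatGroups([pairTerms(firstDl)]));
```

Checking that the next sibling really is a dd avoids mislabeling values when a dt has no matching dd.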
