How to loop through multiple elements based on class in Playwright JS test?
I am trying to write a Playwright JS test to scrape some values from a website.
This is the HTML of the page I want to scrape:
<div class="pl-it">
<div class="i_d">
<dl>
<dt class>Series:
<dd>
<a href=".....">Province A</a>
</dd>
</dt>
<dt class>Catalog Codes:
<dd>
<strong>Mi</strong>
<strong>CA 1x,</strong>
"ca 17"
</dd>
</dt>
<dt class>Variants:
<dd><strong><a>Click to see variants</a></strong></dd>
</dt>
</dl>
</div>
<div class="i_d">
<dl>
<dt class>Series:
<dd>
<a href=".....">Province B</a>
</dd>
</dt>
<dt class>Catalog Codes:
<dd>
<strong>Fu</strong>
<strong>DE 2x,</strong>
"pa 21"
</dd>
</dt>
<dt class>Variants:
<dd><strong><a>Click to see variants</a></strong></dd>
</dt>
</dl>
</div>
</div>
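As you can see, there are multiple divs with the class i_d, with dl tags inside them. Inside each dl tag there are pairs of dt and dd tags.
Basically, what I want to do is log every dt value and its corresponding dd value to the console.
The end result should look like this in the log: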
Series: Province A
Catalog Codes: Mi CA 1x, ca17
Variants: Click to see variants
Series: Province B
Catalog Codes: Fu DE 2x, pa21
Variants: Click to see variants
Here is my current output:
[
{
label: 'Series:',
name: 'Province A'
},
{
label: 'Series:',
name: 'Province B'
},
]
As you can see, it only prints the first dt and dd values of each group and not the rest (i.e. Catalog Codes, etc.).
This is my current Playwright JS code:
const { test, expect } = require('@playwright/test');

test('homepage has Playwright in title and get started link linking to the intro page', async ({ page }) => {
  await page.goto('https://colnect.com/en/stamps/list/country/38-Canada');
  await expect(page.locator('div#pageContent h1')).toContainText('Stamp catalog › Canada › Stamps');
  const books = await page.$$eval('div.i_d', all_items => {
    const data = [];
    all_items.forEach(book => {
      const label = book.querySelector('dt')?.innerText;
      const name = book.querySelector('dd')?.innerText;
      data.push({ label, name });
    });
    return data;
  });
  console.log(books);
});
Can someone tell me how to access every dt and dd, rather than just the first one in each group?
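The likely cause is that `querySelector` returns only the first matching element, while `querySelectorAll` returns every match. As a minimal sketch of the difference, the helper below walks all `dt` elements in one container and pairs each with the `dd` that follows it. The name `collectPairs` is hypothetical and not part of Playwright or the original post, and the sketch assumes each `dd` is the next sibling of its `dt`, which is how a browser typically parses the markup shown in the question.

```javascript
// Hypothetical helper (not from the original post): collects every dt/dd
// pair under one container element instead of only the first pair.
function collectPairs(container) {
  const pairs = [];
  // querySelectorAll yields all dt elements; querySelector would yield one.
  for (const dt of container.querySelectorAll('dt')) {
    const dd = dt.nextElementSibling;
    // Skip a dt that is not immediately followed by a dd.
    if (!dd || dd.tagName !== 'DD') continue;
    pairs.push({ label: dt.innerText.trim(), name: dd.innerText.trim() });
  }
  return pairs;
}
```

Because Playwright serializes the callback passed to `page.$$eval` and runs it in the page, a helper like this would have to be defined inside that callback rather than referenced from the test file.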
The DOM is a bit complex, so nested loops are needed to get the format you are looking for.
const { test, expect } = require('@playwright/test');

test.describe('Scrap', () => {
  test('Stamps', async ({ page }) => {
    await page.goto('https://colnect.com/en/stamps/list/country/38-Canada');
    await page.waitForLoadState('networkidle');
    await expect(page.locator('div#pageContent h1')).toContainText('Stamp catalog › Canada › Stamps');

    // Outer loop: build one object per stamp container (div.i_d).
    const scrappedStampData = await page.$$eval('div.i_d', (stamps) => {
      const stampsArray = [];
      stamps.forEach((stamp) => {
        const stampObject = {};
        // Inner loop: each dt is a label; the element that follows it holds the value.
        stamp.querySelectorAll('dt').forEach((row) => {
          const rowLabel = row.innerText;
          const rowValue = row.nextElementSibling?.innerText ?? '';
          stampObject[rowLabel] = rowValue;
        });
        stampsArray.push(stampObject);
      });
      return stampsArray;
    });

    // Print every label/value pair, grouped by stamp.
    scrappedStampData.forEach((stampData, ind) => {
      console.log(`\n**************Stamp: ${ind + 1}*****************\n`);
      for (const key in stampData) {
        console.log(key + ' ' + stampData[key]);
      }
    });
  });
});
Output:
**************Stamp: 1*****************
Series: Province of Canada Pence Issue (imperforate)
Catalog codes: Mi:CA 1x, Sn:CA 8, Yt:CA 4, Sg:CA 17
Variants: Click to see variants
Themes: Crowns and Coronets | Famous People | Heads of State | Queens | Royalty | Women
Issued on: 1857-08-01
Colors: Rose
Printers: Rawdon, Wright, Hatch & Edson
Format: Stamp
Emission: Definitive
Perforation: Imperforate
Printing: Recess
Paper: machine-made medium to thick wove
Face value: ½ d - Canadian penny
Print run: 2,600,000
Score: 95% Accuracy: High
Buy Now: Find similar items on eBay
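If the flat "label value" lines from the question are wanted without the banner, each scraped object can be turned into an array of strings before logging. The following is only a sketch: `toLogLines` is a hypothetical helper name, and the sample object is made-up data shaped like the objects the `$$eval` call returns (labels as keys, text as values).

```javascript
// Hypothetical helper: turns one scraped stamp object into log lines.
function toLogLines(stampData) {
  return Object.entries(stampData).map(([label, value]) => {
    // Collapse newlines and repeated spaces that innerText may contain.
    const cleanValue = String(value).replace(/\s+/g, ' ').trim();
    return `${label.trim()} ${cleanValue}`;
  });
}

// Made-up sample shaped like the scraped data (assumption for illustration).
const sample = [
  { 'Series:': 'Province A', 'Catalog Codes:': 'Mi\nCA 1x,  ca 17' },
];

for (const stamp of sample) {
  toLogLines(stamp).forEach((line) => console.log(line));
}
// Logs:
// Series: Province A
// Catalog Codes: Mi CA 1x, ca 17
```

Keeping the formatting in a plain function outside the page callback means it runs in Node and can be checked without opening a browser.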