I wish to iterate over a table (a calendar table) in Puppeteer and click specific cells (dates) to toggle their status (to "AWAY").
I've included a snippet of the table below. Each td cell contains two child divs: one with the day number (<div class="day_num">) and another that shows whether the day has been marked as "AWAY" (<div class="day_content">).
So far I've been able to scrape the table, but that doesn't let me click the actual cells, since scraping only copies the table contents into an array.
How can I iterate over all the cells and click specific ones depending on the day number in the child "day_num" div? For example, I wish to click the td for day 8 in the example below to toggle its status.
<table class="calendar">
<tr class="days">
<td class="day">
<div class="day_num">7</div>
<div class="day_content"></div>
</td>
<td class="day">
<div class="day_num">8</div>
<div class="day_content"></div>
</td>
<td class="day">
<div class="day_num">9</div>
<div class="day_content">AWAY</div>
</td>
The scraping code I currently have is:
const result = await page.evaluate(() => {
  const rows = document.querySelectorAll('.calendar tr td div');
  return Array.from(rows, (row) => {
    const columns = row.querySelectorAll('div');
    return Array.from(columns, (column) => column.innerHTML);
  });
});
console.log(result);
result is:
[
[], [ '1', '' ], [ '2', 'AWAY' ],
[ '3', '' ], [ '4', '' ], [ '5', '' ],
[ '6', '' ], [ '7', '' ], [ '8', '' ],
[ '9', 'AWAY' ], [ '10', '' ], [ '11', '' ],
[ '12', '' ], [ '13', '' ], [ '14', '' ],
[ '15', '' ], [ '16', '' ], [ '17', '' ],
[ '18', '' ], [ '19', '' ], [ '20', '' ],
[ '21', '' ], [ '22', '' ], [ '23', '' ],
[ '24', '' ], [ '25', '' ], [ '26', '' ],
[ '27', '' ], [ '28', '' ], [ '29', '' ],
[ '30', '' ], [], [],
[], []
]
Since you haven't provided the live page, I can't verify that arbitrary JS, visibility or timing won't make this fail, but I'll take a stab at it. See if the following works, assuming your HTML is pretty much static:
const puppeteer = require("puppeteer"); // ^13.0.1

let browser;
(async () => {
  // Static copy of the calendar markup, plus a test listener on day 8's content div
  const html = `
    <body>
    <table class="calendar">
      <tr class="days">
        <td class="day">
          <div class="day_num">7</div>
          <div class="day_content"></div>
        </td>
        <td class="day">
          <div class="day_num">8</div>
          <div class="day_content"></div>
        </td>
        <td class="day">
          <div class="day_num">9</div>
          <div class="day_content">AWAY</div>
        </td>
      </tr>
    </table>
    <script>
      [...document.querySelectorAll(".day_content")][1]
        .addEventListener("click", e => {
          e.target.textContent = "CLICKED";
        })
      ;
    </script>
    </body>
  `;
  browser = await puppeteer.launch({headless: true});
  const [page] = await browser.pages();
  await page.setContent(html);

  // Find the .day_num div whose text is exactly "8"
  const [dayEl] = await page.$x('//div[contains(@class, "day_num") and text()="8"]');

  // In the browser: go up to the .day cell, down to its .day_content, and click it
  const dayContent = await dayEl.evaluate(el => {
    const dayContent = el.closest(".day").querySelector(".day_content");
    dayContent.click();
    return dayContent.textContent;
  });
  console.log(dayContent); // => CLICKED
})()
  .catch(err => console.error(err))
  .finally(() => browser?.close())
;
The approach is to find the .day_num element you're interested in using an XPath on the class and text, then pop up the tree to the .day element and down again to the associated .day_content element to click it. I added a listener to change the text upon click to verify that it was indeed clicked.
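For the "iterate over all the cells" part of the question, the same idea can be run entirely in the browser context. The following is only a sketch: clickDays is a hypothetical helper, and it assumes the toggle handler listens on the td (or receives the event as it bubbles) and that every .day cell has a .day_num child, as in the snippet.

```javascript
// Browser-context callback, intended for page.evaluate(clickDays, [8, 15]).
// It must not reference outer variables, because Puppeteer serializes it.
// Assumption: clicking the td.day cell is what toggles the status.
function clickDays(days) {
  const wanted = new Set(days.map(String));
  const clicked = [];
  for (const cell of document.querySelectorAll(".calendar td.day")) {
    // Trim so that whitespace around the day number doesn't matter
    const num = cell.querySelector(".day_num")?.textContent.trim();
    if (wanted.has(num)) {
      cell.click();
      clicked.push(Number(num));
    }
  }
  return clicked; // day numbers that were actually found and clicked
}
```

It could be called as await page.evaluate(clickDays, [8, 15]). Note that the DOM click() used here dispatches a click event without checking visibility, so if the real page needs a trusted mouse click, the element handle's click() may be a better fit.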
You could also use nextElementSibling on the .day_num rather than the closest / querySelector combo, but this assumes more about the relationship between the .day_num and .day_content elements and would probably be more brittle.
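For comparison, here is roughly what the sibling variant might look like. clickSiblingContent is a hypothetical name, and the class check is an extra guard I've added so that a markup change returns null instead of silently clicking the wrong element.

```javascript
// Browser-context callback for an ElementHandle pointing at a .day_num div,
// e.g. const text = await dayEl.evaluate(clickSiblingContent);
// Assumption: .day_content immediately follows .day_num, as in the snippet.
function clickSiblingContent(dayNumEl) {
  const sibling = dayNumEl.nextElementSibling;
  // Bail out if the next element isn't the content div we expect
  if (!sibling || !sibling.classList.contains("day_content")) {
    return null;
  }
  sibling.click();
  return sibling.textContent;
}
```

If another element is ever inserted between the two divs, this returns null, whereas the closest / querySelector version would keep working.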
Also, if the text content "8" might have whitespace, you can loosen it up a bit with substring contains in your XPath: '//div[contains(@class, "day_num") and contains(text(), "8")]', at the risk of false positives.
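One way to tolerate whitespace without those false positives (a substring match on "8" would also hit 18 and 28) is XPath's normalize-space(), which trims the text before an exact comparison. This is a sketch; dayNumXPath is a hypothetical helper that only builds the expression string.

```javascript
// Builds an XPath that matches a .day_num div whose trimmed text equals the day.
// Validating the input keeps arbitrary strings out of the expression.
function dayNumXPath(day) {
  const n = Number(day);
  if (!Number.isInteger(n) || n < 1 || n > 31) {
    throw new RangeError(`not a day of the month: ${day}`);
  }
  return `//div[contains(@class, "day_num") and normalize-space(.)="${n}"]`;
}
```

The result can be passed to page.$x in place of the hardcoded string above, e.g. page.$x(dayNumXPath(8)), assuming the same Puppeteer version as in the example.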