
Puppeteer iterate table cells and click specific cells

I wish to iterate over a table (a calendar table) in Puppeteer and click specific cells (dates) to toggle their status (to "AWAY").

I've included a snippet of the table below. Each td cell contains two child divs: one with the day number (<div class="day_num">) and another showing whether the day has been marked as "AWAY" (<div class="day_content">).

So far I've been able to scrape the table but that won't allow me to click the actual cells, as scraping just scrapes the table contents into an array.

How can I iterate over all the cells and click specific ones depending on the day number in the child "day_num" div? For example, I wish to click the td for day 8 in the example below, to toggle its status.

<table class="calendar">
<tr class="days">

<td class="day">
<div class="day_num">7</div>
<div class="day_content"></div>
</td>
<td class="day">
<div class="day_num">8</div>
<div class="day_content"></div>
</td>
<td class="day">
<div class="day_num">9</div>
<div class="day_content">AWAY</div>
</td>

The scraping code I currently have is:

const result = await page.evaluate(() => {
  // Select the cells themselves; each cell's child divs hold the day number and status.
  const cells = document.querySelectorAll('.calendar tr td');
  return Array.from(cells, (cell) => {
    const divs = cell.querySelectorAll('div');
    return Array.from(divs, (div) => div.innerHTML);
  });
});

console.log(result);

result is:

[
  [],           [ '1', '' ],  [ '2', 'AWAY' ],
  [ '3', '' ],  [ '4', '' ],  [ '5', '' ],
  [ '6', '' ],  [ '7', '' ],  [ '8', '' ],
  [ '9', 'AWAY' ],  [ '10', '' ], [ '11', '' ],
  [ '12', '' ], [ '13', '' ], [ '14', '' ],
  [ '15', '' ], [ '16', '' ], [ '17', '' ],
  [ '18', '' ], [ '19', '' ], [ '20', '' ],
  [ '21', '' ], [ '22', '' ], [ '23', '' ],
  [ '24', '' ], [ '25', '' ], [ '26', '' ],
  [ '27', '' ], [ '28', '' ], [ '29', '' ],
  [ '30', '' ], [],           [],
  [],           []
]

Since you haven't provided the live page, I can't rule out failures caused by other JS on the page, element visibility or timing. Assuming your HTML is pretty much static, though, I'll take a stab at it; see if the following works:

const puppeteer = require("puppeteer"); // ^13.0.1

let browser;
(async () => {
  const html = `
    <body>
    <table class="calendar">
      <tr class="days">
        <td class="day">
          <div class="day_num">7</div>
          <div class="day_content"></div>
        </td>
        <td class="day">
          <div class="day_num">8</div>
          <div class="day_content"></div>
        </td>
        <td class="day">
          <div class="day_num">9</div>
          <div class="day_content">AWAY</div>
        </td>
      </tr>
    </table>
    <script>
      [...document.querySelectorAll(".day_content")][1]
        .addEventListener("click", e => {
          e.target.textContent = "CLICKED";
        })
      ;
    </script>
    </body>
  `;
  browser = await puppeteer.launch({headless: true});
  const [page] = await browser.pages();
  await page.setContent(html);
  const [dayEl] = await page.$x('//div[contains(@class, "day_num") and text()="8"]');
  const dayContent = await dayEl.evaluate(el => {
    const dayContent = el.closest(".day").querySelector(".day_content");
    dayContent.click();
    return dayContent.textContent;
  });
  console.log(dayContent); // => CLICKED
})()
  .catch(err => console.error(err))
  .finally(() => browser?.close())
;

The approach is to find the .day_num element you're interested in using an XPath on the class and text, then pop up the tree to the .day element and down again to the associated .day_content element to click it. I added a listener to change the text upon click to verify that it was indeed clicked.
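Since the question asks about several dates, the same up-and-down traversal can be wrapped in a loop over the cells. The following is only a sketch: toggleDays is a hypothetical helper name, and it assumes (as in the example above) that clicking .day_content is what toggles the status and that every date cell is a td under .calendar.

```javascript
// Hypothetical helper: click the .day_content of every requested day.
// `page` is a Puppeteer Page; `days` is an array of day numbers.
// Resolves to the day numbers (as strings) that were actually clicked.
async function toggleDays(page, days) {
  const wanted = days.map(String);
  const clicked = [];
  // One element handle per calendar cell, including empty padding cells.
  for (const cell of await page.$$(".calendar td")) {
    // Runs in the page: read the day number, click only if it is wanted.
    const day = await cell.evaluate((td, wantedDays) => {
      const num = td.querySelector(".day_num")?.textContent.trim();
      if (!wantedDays.includes(num)) return null; // padding cell or other day
      td.querySelector(".day_content")?.click();
      return num;
    }, wanted);
    if (day !== null) clicked.push(day);
  }
  return clicked;
}
```

Called as await toggleDays(page, [8, 15, 22]) once the page has loaded, it resolves to the days it clicked. Note that el.click() inside evaluate dispatches a DOM click without moving the mouse; if the site only reacts to real mouse input, calling click() on the element handle is the alternative, though that requires the element to be visible.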

You could also use nextElementSibling on the .day_num rather than the closest / querySelector combo, but this assumes more about the relationship between the .day_num and .day_content elements and would probably be more brittle.
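To make the trade-off concrete, here is a rough sketch of both lookups written as plain functions. The names contentViaCell and contentViaSibling are made up for illustration and are not Puppeteer APIs; each takes the .day_num element and returns the matching .day_content element or null.

```javascript
// Two ways to get from a .day_num element to its .day_content partner.
// Both function names are hypothetical helpers for illustration.

// More robust: go up to the enclosing .day cell, then search downward.
// Keeps working if other elements are later added between the two divs.
const contentViaCell = (dayNum) =>
  dayNum.closest(".day")?.querySelector(".day_content") ?? null;

// More brittle: assumes .day_content is the very next element sibling.
// Returns null as soon as anything else is inserted in between.
const contentViaSibling = (dayNum) => {
  const next = dayNum.nextElementSibling;
  return next?.classList.contains("day_content") ? next : null;
};
```

Functions defined in Node are not visible inside the page, so to use either one with Puppeteer the body would need to be inlined in the callback passed to evaluate, as in the example above.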

Also, if the text content "8" might have whitespace around it, you can loosen the match with a substring test in your XPath: '//div[contains(@class, "day_num") and contains(text(), "8")]'. This comes at the risk of false positives, since it would also match days such as 18 and 28.
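One way to tolerate whitespace without those false positives is XPath's normalize-space(), which trims leading and trailing whitespace before the comparison, so the match stays exact. A small sketch, where dayNumXPath is a hypothetical helper name of my own:

```javascript
// Hypothetical helper: build an XPath matching a .day_num div whose text
// equals `day` once surrounding whitespace has been trimmed.
function dayNumXPath(day) {
  const n = Number(day);
  // Validate the input so nothing unexpected is spliced into the XPath.
  if (!Number.isInteger(n) || n < 1 || n > 31) {
    throw new RangeError(`expected a day of the month, got ${day}`);
  }
  return `//div[contains(@class, "day_num") and normalize-space(.)="${n}"]`;
}

console.log(dayNumXPath(8));
// => //div[contains(@class, "day_num") and normalize-space(.)="8"]
```

With the Puppeteer version used above this would be passed as page.$x(dayNumXPath(8)). As far as I know, page.$x has been removed in recent Puppeteer releases in favor of prefixing the expression with xpath/ in a regular selector call, so check the documentation for the version you have installed.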
