简体   繁体   中英

how to get text inside div in puppeteer

const puppeteer = require("puppeteer");

(async function main() {
    try {
        const browser = await puppeteer.launch({headless: false});
        const page = await browser.newPage();
        page.setUserAgent("Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36");

        await page.goto("https://www.qimai.cn/rank/index/brand/all/genre/6014/device/iphone/country/us/date/2019-03-19", {waitUntil: 'load', timeout: 0});
        await page.waitForSelector(".container");
        const sections = await page.$$(".container");

        const freeButton = await page.$('[href="/rank/index/brand/free/device/iphone/country/us/genre/6014/date/2019-03-19"]');
        await freeButton.click();


        // free list

        const appTable = await page.waitForSelector(".data-table");
        const lis = await page.$$(".data-table > tbody > tr > td");

        // go to app content
        const appInfo = await page.$("a.icon");
        // appInfo.click();

        for (const content of lis) {
            const name = await content.$("div.appname");
            const gameName = await page.evaluate(name => name.innerText, name);
            console.log("Game Name: ", gameName);
        }

        console.log("-- bingo --");

    } catch (e) {
        console.log("our error", e);
    }
})();

I cant seem to get the text from, and im getting this error: TypeError: Cannot read property 'innerHTML' of null. I have tried all ways, but its not working. This is the link to the website. https://www.qimai.cn/app/rank/appid/1451505313/country/us

I use "waitForSelector" method and after that try to get the text

await page.waitForSelector('your selector')
let element = await page.$('your selector')
let value = await page.evaluate(el => el.textContent, element)

The easiest way that I have managed to retrieve values from DOM selections with Puppeteer and jest is using the eval method.

Let's say I want the text value from a span.

// markup
<div class="target-holder">
    <span class="target">test</span>
</div>

// inside my e2e test file
const spanVal =  await page.$eval('.target-holder .target', el => el.innerText);

console.log(spanVal); // test

Official documentation link: https://pptr.dev/#?product=Puppeteer&version=v2.1.1&show=api-pageevalselector-pagefunction-args

using waitForSelector and evaluate this becomes pretty clean

const element = await page.waitForSelector('your selector'); // select the element
const value = await element.evaluate(el => el.textContent); // grab the textContent from the element, by evaluating this function in the browser context

If you're getting elements by XPath, just use the code above.

<span class="toggleable"> Random text.</span> 
// right click on this element -> copy -> copy XPath

const element = await page.$x('//thecopiedxpath');
const textObject = await element[0].getProperty('textContent');
const text = textObject._remoteObject.value;
console.log(text);

That will print the message "Random Text".

From the documentation :

const tweetHandle = await page.$('.tweet .retweets');
expect(await tweetHandle.evaluate(node => node.innerText)).toBe('10');

If your goal is to receive text, you can make workaround with JS in DOM page.
Change this:

const lis = await page.$$(".data-table > tbody > tr > td");

const appInfo = await page.$("a.icon");

for (const content of lis) {
  const name = await content.$("div.appname");
  const gameName = await page.evaluate(name => name.innerText, name);
  console.log("Game Name: ", gameName);
}

To this:

const appInfo = await page.$("a.icon");

const texts = await page.evaluate(() => {
  const textsToReturn = [];

  const elems = Array.from(document.querySelectorAll('.data-table > tbody > tr > td'));

  for (const el of elems) {
   textsToReturn.push(el.querySelector('div.appname').innerText)
  }

  // If I'm not mistaken, puppeteer doesn't allow to return complicated data structures, so we'll stringify
  return JSON.stringify(textsToReturn)
})

// And here is your game names
console.log('Game names', JSON.parse(texts));

NB: This code hasn't been tested on actual html page since there is no example.
But, you should get the concept of how to reimplement puppeteer logic with DOM native methods, to achieve the goal.

//get the xpath of the element
const getXpathOfRecordLabel = await page.$x('//div');

//get the property of textContent
const getTheProperty = await getXpathOfRecordLabel[0].getProperty(
  'textContent'
);

//get the value
const getRecordName = getTheProperty._remoteObject.value;
console.log(getRecordName);

Changing DOM through direct call is not desirable on front-end frameworks such as Angular, because these frameworks need the full control over DOM in order to work properly. However, manipulating DOM directly may cause unwanted errors or behaviors.

Long story short, don't use:
await element.evaluate(el => el.textContent); for Angular and such front-end frameworks/libraries. Use this instead:

await page.click("input[name=email]", {clickCount: 3})
await page.type("input[name=inputName]", "Input text")

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM