Get title from newly opened page puppeteer
I am trying to open a new tab and scrape the title of that page with Puppeteer.
This is what I have:
// use puppeteer
const puppeteer = require('puppeteer');
// set wait length in ms: 1000ms = 1sec
const short_wait_ms = 1000;
async function run() {
  const browser = await puppeteer.launch({headless: false, timeout: 0});
  const page = await browser.newPage();
  await page.goto('https://biologyforfun.wordpress.com/2017/04/03/interpreting-random-effects-in-linear-mixed-effect-models/');
  // second page DOM elements
  const CLICKHERE_SELECTOR = '#post-2068 > div > div.entry-content > p:nth-child(2) > a:nth-child(1)';
  // main page
  await page.waitFor(short_wait_ms);
  await page.click(CLICKHERE_SELECTOR);
  // new tab opens - move to new tab
  let pages = await browser.pages();
  // go to the newly opened page
  // console.log title -- Generalized Linear Mixed Models in Ecology and in R
}
run();
I can't figure out how to use browser.pages() to start working on the new page.
According to the Puppeteer documentation:
page.title()
Shortcut for page.mainFrame().title().
Therefore, you should use page.title() to get the title of the newly opened page.
Alternatively, you can gain a slight performance boost by using the following (note that the underscore-prefixed properties are private Puppeteer internals and may change between versions):
page._frameManager._mainFrame.evaluate(() => document.title)
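For comparison, here is a minimal sketch of the same read using only documented calls, page.mainFrame() and the frame's evaluate(). The helper name titleViaMainFrame is hypothetical, and whether this is measurably faster or slower than page.title() is not something this sketch claims.

```javascript
// Hypothetical helper: read document.title through the public main-frame API.
// `page` is assumed to be a Puppeteer Page object.
async function titleViaMainFrame(page) {
  // evaluate() runs the callback inside the page and resolves with its return value.
  return page.mainFrame().evaluate(() => document.title);
}
```

Inside an async function it could be called as: const title = await titleViaMainFrame(page);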
Note: Make sure to use the await operator when calling page.title(): it returns a Promise, and the title tag must be downloaded before Puppeteer can access its content.
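To connect this to the question, the sketch below shows one way to get hold of the Page object for the tab that a click opens and then read its title. It assumes the click really does open a new tab and that the next 'targetcreated' event on the browser belongs to that tab. The helper name titleOfNewTab is hypothetical, and waiting for a title element is just one possible readiness check.

```javascript
// Hypothetical helper: click a link that opens a new tab and return that tab's title.
// `browser` and `page` are assumed to be Puppeteer Browser and Page objects.
async function titleOfNewTab(browser, page, selector) {
  // Start listening before clicking so the 'targetcreated' event is not missed.
  const newTarget = new Promise(resolve => browser.once('targetcreated', resolve));
  await page.click(selector);

  const target = await newTarget;
  // target.page() resolves to null when the target is not a page (assumption: it is one here).
  const newPage = await target.page();
  if (!newPage) {
    throw new Error('The newly created target is not a page');
  }

  // Wait until a <title> element exists so the title is not read too early.
  await newPage.waitForSelector('title');
  return newPage.title();
}
```

Inside run() from the question, this could be used as: console.log(await titleOfNewTab(browser, page, CLICKHERE_SELECTOR));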
You shouldn't need to move to the new tab.
To get the title of any page you can use:
const pageTitle = await page.title();
Also, after you click something and are waiting for the new page to load, you should wait for the load event or for the network to be idle:
// Wait for redirection
// (newer Puppeteer versions use 'networkidle0' or 'networkidle2' in place of 'networkidle' and networkIdleTimeout)
await page.waitForNavigation({waitUntil: 'networkidle0'});
Check the docs: https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagewaitfornavigationoptions
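When the click navigates within the same tab, the wait and the click are usually started together so that the navigation cannot finish before the wait begins. A minimal sketch of that pattern follows. The helper name titleAfterClick is hypothetical, and 'networkidle0' assumes a Puppeteer version that supports that value.

```javascript
// Hypothetical helper: click a link that navigates the current tab, then return the new title.
// `page` is assumed to be a Puppeteer Page object.
async function titleAfterClick(page, selector) {
  // Register the navigation wait before the click resolves to avoid a race.
  await Promise.all([
    page.waitForNavigation({ waitUntil: 'networkidle0' }),
    page.click(selector),
  ]);
  return page.title();
}
```

This only covers same-tab navigation. If the link opens a separate tab, the title has to be read from that tab's own Page object.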