How to access the iframe #document using puppeteer?
I'm trying to scrape the anime video page [jkanime], but I'm having trouble with the mp4 videos because they live inside an iframe #document.
In Chrome DevTools I ran the following: $('#jkvideo_html5_api source').src
and it shows me the src of the mp4. But I do not know how to run the query $('#jkvideo_html5_api source').src with puppeteer.
What I want to achieve now is to get the value of _navigationURL, then make a request to it and read the mp4 video source.
Any help will be appreciated!
[Image: DevTools source code section]
const getAnimeVideo = async (id: string, chapter: number) => {
  const BASE_URL = `${url}${id}/${chapter}/` // => https://jkanime.net/tokyo-ghoul/1/
  const browser = await puppeteer.launch()
  const page = await browser.newPage()
  await page.goto(BASE_URL);
  const elementHandle = await page.$('.player_conte')
  const frame = await elementHandle.contentFrame();
  const $ = cheerio.load(`${frame}`);
  console.log(frame)
}
Part of the output obtained:
....
OMWorld {
_frameManager:
FrameManager {
_events: [Object],
_eventsCount: 3,
_maxListeners: undefined,
_client: [CDPSession],
_page: [Page],
_networkManager: [NetworkManager],
_timeoutSettings: [TimeoutSettings],
_frames: [Map],
_contextIdToContext: [Map],
_isolatedWorlds: [Set],
_mainFrame: [Frame] },
_frame: [Circular],
_timeoutSettings:
TimeoutSettings { _defaultTimeout: null, _defaultNavigationTimeout: null }, _documentPromise: null,
_contextResolveCallback: null,
_contextPromise: Promise { [ExecutionContext] },
_waitTasks: Set {},
_detached: false },
_childFrames: Set {},
_name: '',
_navigationURL:
'https://jkanime.net/um.php?e=Q0VxeUQ2MmZRRlNWeUdHKzdoWlJQOGFLNjFRUnljVkFTaEtFMElZUjFmTlRPQnhnUUtqbnRodjhEVHlGYnVleWJsdnNnRy9wNzVLd0MrMURuRVBKV0tQZjVuT0tIblc3cUNmZDNzdFVFaEE9OjrIf8cc_60GOGTTN7Th9Q_a' }
Output that I want to obtain:
{
"src": [
"https://storage.googleapis.com/markesito.appspot.com/tokgho/01.mp4"
]
}
Problem solved: 11:34am
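import puppeteer from 'puppeteer'

// `url` is the site base URL defined elsewhere (see the example in the comment below)
const getAnimeVideo = async (id: string, chapter: number) => {
  const BASE_URL = `${url}${id}/${chapter}/` // => https://jkanime.net/tokyo-ghoul/1/
  const browser = await puppeteer.launch()
  try {
    const page = await browser.newPage()
    await page.goto(BASE_URL)
    // Handle to the player <iframe> element in the main page
    const elementHandle = await page.$('.player_conte')
    if (!elementHandle) throw new Error('iframe .player_conte not found')
    // contentFrame() returns the Frame that owns the iframe's #document
    const frame = await elementHandle.contentFrame()
    // Runs inside the frame: collect the src of every <source> under the video element
    const video = await frame.$eval('#jkvideo_html5_api', el =>
      Array.from(el.getElementsByTagName('source')).map(e => e.getAttribute('src')))
    return video
  } finally {
    await browser.close() // release the browser even if something above throws
  }
}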
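For anyone who also wants the iframe URL (the value that shows up as _navigationURL in the log above) together with the sources in the { "src": [...] } shape from the question, here is a minimal sketch. It assumes the frame comes from elementHandle.contentFrame() as in the solution, and it reads the URL through the public frame.url() method rather than the private field. FrameLike, SourceLike, VideoResult, cleanSources and readVideoSources are names I made up for illustration; the two interfaces only mirror the small part of Puppeteer's Frame that is used here, so the function can be tried without launching a browser.

```typescript
// Hypothetical minimal shapes mirroring the parts of Puppeteer's Frame used below.
interface SourceLike {
  getAttribute(name: string): string | null;
}

interface FrameLike {
  url(): string;
  $$eval<T>(selector: string, fn: (els: SourceLike[]) => T): Promise<T>;
}

interface VideoResult {
  frameUrl: string; // URL the iframe navigated to
  src: string[];    // src of every <source> inside the player
}

// Drop <source> elements that have no src attribute or an empty one.
function cleanSources(raw: Array<string | null>): string[] {
  return raw.filter((s): s is string => typeof s === 'string' && s.length > 0);
}

// Query the frame's own document, which is not reachable from page.$ on the parent page.
async function readVideoSources(frame: FrameLike): Promise<VideoResult> {
  const raw = await frame.$$eval('#jkvideo_html5_api source', els =>
    els.map(el => el.getAttribute('src')));
  return { frameUrl: frame.url(), src: cleanSources(raw) };
}
```

In the solution above this would be called as readVideoSources(frame) right after contentFrame(). If the player is injected late, waiting with frame.waitForSelector('#jkvideo_html5_api source') before the call may help, though I have not checked whether this page needs it.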