<html>
<head>...</head>
<body>
<iframe id="hiddenFrame" name="hiddenFrame">
#document
<html>
<head>...</head>
<body>...</body>
</html>
</iframe>
</body>
</html>
This is the structure of the website that I want to crawl. I tried to get the HTML inside the #document node (with both urllib.request and requests), but it is missing from the response.
Result of the request:
<html>
<head>...</head>
<body>
<iframe></iframe>
</body>
</html>
There is nothing in the iframe tag. How can I get the HTML inside the #document node?
I usually use Selenium to handle these situations. Basically, you have to switch into the iframe to get its content.
See this question.
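A minimal sketch of the Selenium approach, assuming the frame can be addressed by the id/name hiddenFrame from the question. The helper names fetch_iframe_html and demo, the example URL, and the choice of Chrome are placeholders for illustration, not something from the original post.

```python
def fetch_iframe_html(driver, url, frame_ref):
    """Load url, switch into the given iframe, and return the iframe's HTML.

    driver is any Selenium WebDriver; frame_ref may be the iframe's
    id or name (a string), its index, or a WebElement.
    """
    driver.get(url)
    # After this call, lookups and page_source refer to the iframe's document.
    driver.switch_to.frame(frame_ref)
    try:
        return driver.page_source
    finally:
        # Always return to the top-level document.
        driver.switch_to.default_content()


def demo():
    # Hypothetical usage: needs the selenium package and a local Chrome install.
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        # Placeholder URL; replace with the page you want to crawl.
        html = fetch_iframe_html(driver, "https://example.com/page", "hiddenFrame")
        print(html)
    finally:
        driver.quit()
```

Calling demo() opens a real browser, so the frame is loaded the same way it would be for a visitor, including any content that JavaScript puts there. If the frame is filled in late, an explicit wait before reading page_source may be needed.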
Doesn't the iframe have a src attribute?
If it does, why not do this:
First, get the page using requests, then read the iframe's src attribute using beautifulsoup4.
Once you have the src attribute, send a second request to that URL.
Voila, you will get the page inside the iframe.
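The two steps above can be sketched as follows. To keep the example free of extra installs, it uses the standard library (html.parser and urllib) in place of requests and beautifulsoup4; the technique is the same. The names IframeSrcParser, iframe_urls, fetch and fetch_iframe_pages are made up for this sketch.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class IframeSrcParser(HTMLParser):
    """Collect the src attribute of every <iframe> in a document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            src = dict(attrs).get("src")
            if src:  # skip iframes with a missing or empty src
                self.sources.append(src)


def iframe_urls(outer_html, page_url):
    """Return absolute URLs for all iframes that have a src attribute."""
    parser = IframeSrcParser()
    parser.feed(outer_html)
    # src is often relative, so resolve it against the outer page's URL.
    return [urljoin(page_url, src) for src in parser.sources]


def fetch(url):
    """Download a URL and decode it with the charset the server reports."""
    with urlopen(url) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def fetch_iframe_pages(page_url):
    """Step 1: fetch the outer page. Step 2: fetch each iframe's src."""
    outer_html = fetch(page_url)
    return {url: fetch(url) for url in iframe_urls(outer_html, page_url)}
```

If iframe_urls returns an empty list, as it would for the bare iframe tag in the question's request result, the frame has no src in the served HTML and is probably filled in by JavaScript. In that case this approach has nothing to fetch, and a browser-driven tool is the likely fallback.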