Find all elements on a web page using Selenium and Python

I am trying to go through a webpage with Selenium and create a set of all elements with certain class names, so I have been using:

elements = set(driver.find_elements_by_class_name('class name'))

However, in some cases there are thousands of elements on the page (if I scroll down), and I've noticed that this code only finds the first 18-20 elements on the page (only about 14-16 are visible to me at once). Do I need to scroll, or am I doing something else wrong? Is there any way to instantaneously get all of the elements I want in the HTML into a list without having to visually see them on the screen?

It depends on the webpage. Look at the HTML source code (or the network log) before you scroll down. If only those 18-20 elements are there, the page lazy-loads the next items (e.g. Twitter or Instagram), meaning the server only renders the next items once you reach a certain point on the page. Otherwise, all of the thousands of items would be loaded at once, which would increase the page size, loading time, and server load.

In that case, you have to scroll down to the end and then grab the page source to parse all of the items.
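A minimal sketch of that, assuming a Chrome driver and a hypothetical URL:

from selenium import webdriver

driver = webdriver.Chrome()
driver.get("https://example.com/feed")  # hypothetical URL

# Scroll to the bottom so the page requests its lazy-loaded items.
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

# The rendered HTML, including any newly loaded items, is now in the source.
html = driver.page_source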

You could probably use more advanced methods, like treating each loaded chunk as a page in a pagination scheme (i.e. instead of "go to next page", "scroll down"). But since you seem to be a beginner, I would start with simply scrolling to the end (scroll, wait, scroll, ... until no new elements appear), then fetching the HTML and parsing it.
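Here is a sketch of that scroll-wait loop, again with a hypothetical URL and a placeholder class name, written against Selenium 4 (where the find_elements_by_class_name helper was replaced by find_elements(By.CLASS_NAME, ...)):

import time

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com/feed")  # hypothetical URL

elements = set()
while True:
    before = len(elements)
    # Collect whatever is currently rendered; WebElements are hashable,
    # so the set deduplicates elements seen on earlier passes.
    elements.update(driver.find_elements(By.CLASS_NAME, "class-name"))

    # Scroll to the bottom and give the page time to load the next chunk.
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(2)  # crude fixed wait; tune for the target site

    # Stop once a full scroll pass produced no new elements.
    if len(elements) == before:
        break

print(len(elements), "elements collected")

One caveat: pages that recycle their DOM nodes (virtualized lists) can leave the collected WebElements stale, so on such sites it is safer to parse driver.page_source after each pass instead of holding on to the element objects.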
