
Extract Text from div using bs4

I'm stuck on a (probably really simple) problem.

I'm scraping a website using Python, the Chrome WebDriver, and Selenium.

I can find the divs that hold the information, but I can't extract the text inside them.

The following is the code I am using:

from bs4 import BeautifulSoup as bs4  # assumed alias; the original import was not shown

html = driver.page_source
print(html)

soup = bs4(html, "lxml")

#find infos
div = soup.find_all('div', class_="order-line-prod-material ng-binding")
div

The output is:

[<div class="order-line-prod-material ng-binding">AQ4174-010</div>,
 <div class="order-line-prod-material ng-binding">AQ4176-010</div>,
 <div class="order-line-prod-material ng-binding">AT7899-010</div>,
 <div class="order-line-prod-material ng-binding">AT7900-010</div>,
 <div class="order-line-prod-material ng-binding">AT7975-010</div>,
 <div class="order-line-prod-material ng-binding">AT8120-010</div>,
 <div class="order-line-prod-material ng-binding">AT8153-010</div>]

When I tried to use:

div.text

I got the following error message:

ResultSet object has no attribute 'text'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?

So I know I have to use a for loop, but the examples I found online usually look something like this:

for a in div.find_all('a'):
    print(a.text)

What I don't understand is that there is no a tag inside my divs, so what do I have to iterate over to get the text I want?

Really appreciate your help.

Have a great day

find_all() returns a ResultSet, which behaves like a list of tags, so it has no .text attribute of its own. Iterate over the result and read .text from each element. This should print the text from each div:

#find infos
divs = soup.find_all('div', class_="order-line-prod-material ng-binding")
for div in divs:
    print(div.text)
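
If you want the values collected in a list rather than printed, something like the following should work. This is only a sketch: the sample HTML is copied from the output in the question, the names `materials` and `selected` are made up for illustration, and it uses the built-in "html.parser" instead of "lxml" so it runs without extra dependencies.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the output shown in the question
# (assumption: the real page source contains divs shaped like these).
html = """
<div class="order-line-prod-material ng-binding">AQ4174-010</div>
<div class="order-line-prod-material ng-binding">AQ4176-010</div>
<div class="order-line-prod-material ng-binding">AT7899-010</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() returns a list-like ResultSet, so iterate over it and
# pull the text out of each individual tag.
divs = soup.find_all("div", class_="order-line-prod-material ng-binding")
materials = [div.get_text(strip=True) for div in divs]
print(materials)

# Alternative: a CSS selector that matches both classes in any order.
selected = [
    div.get_text(strip=True)
    for div in soup.select("div.order-line-prod-material.ng-binding")
]
print(selected)
```

Passing a multi-class string to class_ matches the class attribute exactly as written, so the CSS selector version may be a bit more robust if the page ever changes the order of the classes.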

