I'm new to web scraping and am currently using robobrowser to scrape a webpage. I'm trying to read the value of the 'aria-label' attribute of an element with a certain class, but I don't know how to do it.
Here is my code.
from robobrowser import RoboBrowser
browser = RoboBrowser(history=True, parser='html.parser')
browser.open('https://www.scrapingwebsite.com')
links = browser.find_all(class_='searchResult__373c0__1yggB')
for link in links:
    print(link.find(class_='big_braket_class').text)
    problem_part = link.find(class_='subsidiary_class')
    print(problem_part.get('aria-label'))
It simply doesn't work. Is there any way to make it work? Thanks.
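RoboBrowser parses pages with bs4, so browser.parsed already gives you a BeautifulSoup object to work with. Then, with bs4 4.7.1 or later, use the :has and :contains CSS pseudo-classes to target the required items (newer soupsieve releases spell the latter :-soup-contains).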
from bs4 import BeautifulSoup
#...your code
soup = browser.parsed  # RoboBrowser's parsed page is already a BeautifulSoup object
# Result <li> items that follow the "All Results" heading
items = soup.select('li:has(h3:contains("All Results")) ~ li:has([class*=businessName])')
data = [(item.select_one('[class*=businessName]').text.replace('\xa0', ''),
         item.select_one('[class*="i-stars"]')['aria-label'])
        for item in items]
print(data)
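To try the selectors without hitting the live site, here is a minimal sketch that runs against inline HTML. The markup, the class suffixes (`__x1`, `__x2`) and the helper name `extract_ratings` are made up for illustration and only imitate the structure the selectors above assume. It also assumes a soupsieve version that supports `:-soup-contains` (2.1 or later); on older installs `:contains` would be the equivalent. Unlike the one-liner, it checks for a missing stars element before reading `aria-label`, since `select_one` returns None when nothing matches.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a heading item followed by sibling result items.
SAMPLE_HTML = """
<ul>
  <li><h3>Sponsored Results</h3></li>
  <li><a class="businessName__x1">Ad Place</a>
      <div class="i-stars__x2" aria-label="3 star rating"></div></li>
  <li><h3>All Results</h3></li>
  <li><a class="businessName__x1">First Cafe&nbsp;</a>
      <div class="i-stars__x2" aria-label="4.5 star rating"></div></li>
  <li><a class="businessName__x1">No Rating Diner</a></li>
</ul>
"""


def extract_ratings(soup):
    """Return (name, aria-label or None) for items after the 'All Results' heading."""
    rows = soup.select(
        'li:has(h3:-soup-contains("All Results")) ~ li:has([class*=businessName])'
    )
    results = []
    for item in rows:
        # [class*=...] matches on a substring, so generated suffixes do not matter.
        name = item.select_one('[class*=businessName]').text.replace('\xa0', '').strip()
        stars = item.select_one('[class*="i-stars"]')
        # Guard against items without a rating instead of indexing into None.
        label = stars.get('aria-label') if stars is not None else None
        results.append((name, label))
    return results


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
ratings = extract_ratings(soup)
print(ratings)
```

With real pages, `soup` would come from `browser.parsed` instead of the inline string. If the result list is empty, a likely cause is that the class names or the heading text on the page differ from what the selectors expect.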