
Beautiful Soup find() returns None?

I am trying to parse the HTML on this website.

I would like to get the text from all the span elements with class="post-subject".

Examples:

<span class="post-subject">Set of 20 moving boxes (20009 or 20011)</span>

<span class="post-subject">Firestick/Old xbox games</span>

When I run my code below, soup.find() returns None. I'm not sure what's going on.

import requests
from bs4 import BeautifulSoup


page = requests.get('https://trashnothing.com/washington-dc-freecycle?page=1')
soup = BeautifulSoup(page.text, 'html.parser')

soup.find('span', {'class': 'post-subject'})

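To help you get started, the code below should load the page. You will need to install the correct geckodriver for your Firefox version, and then you can implement this with Selenium. I do not see a post-subject class on the page you linked, but you can automate button clicks for the login like this:

from selenium.webdriver.common.by import By

# Selenium 4 removed find_element_by_id; find_element(By.ID, ...) is the current form
availbutton = driver.find_element(By.ID, 'buttonAvailability_1')
availbutton.click()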


from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Firefox()
driver.get('https://trashnothing.com/washington-dc-freecycle?page=1')

html = driver.page_source
soup = BeautifulSoup(html,'lxml')
print(soup.find('span', {'class': 'post-subject'}))

I had the same issue. I just changed the parser from html.parser to html5lib and it started working. It's also good practice to use soup.find_all() instead of soup.find() here: find() returns only the first matching element (or None if there is no match), while find_all() returns a list of every match.
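To make the find() versus find_all() difference concrete, here is a minimal sketch. It parses a small HTML string built from the two example spans in the question instead of fetching the live page, so it says nothing about what the site actually returns. The names SAMPLE_HTML and extract_subjects are made up for illustration. The parser argument defaults to html.parser; passing "html5lib" is an assumption that only works if that package is installed.

```python
from bs4 import BeautifulSoup

# Sample markup built from the question's two example spans.
# A real page would come from requests or from Selenium's page_source.
SAMPLE_HTML = """
<div>
  <span class="post-subject">Set of 20 moving boxes (20009 or 20011)</span>
  <span class="post-subject">Firestick/Old xbox games</span>
</div>
"""


def extract_subjects(html, parser="html.parser"):
    """Return the text of every <span class="post-subject"> in html.

    Hypothetical helper: find_all() returns a (possibly empty) list,
    so there is no None to guard against, unlike with find().
    """
    soup = BeautifulSoup(html, parser)
    spans = soup.find_all("span", class_="post-subject")
    return [span.get_text(strip=True) for span in spans]


subjects = extract_subjects(SAMPLE_HTML)
for subject in subjects:
    print(subject)
```

If extract_subjects() returns an empty list for the real page, the spans are probably not present in the HTML that was downloaded (for example because they are added by JavaScript or sit behind a login), and changing the parser alone will not help.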
