I am trying to parse the HTML on this website.
I would like to get the text from all of the span elements with class="post-subject".
Examples:
<span class="post-subject">Set of 20 moving boxes (20009 or 20011)</span>
<span class="post-subject">Firestick/Old xbox games</span>
When I run my code below, soup.find() returns None. I'm not sure what's going on.
import requests
from bs4 import BeautifulSoup
page = requests.get('https://trashnothing.com/washington-dc-freecycle?page=1')
soup = BeautifulSoup(page.text, 'html.parser')
soup.find('span', {'class': 'post-subject'})
To help you get started, the following should load the page. You will need to install the correct geckodriver for your Firefox version, and then you can implement this with Selenium. I do not see a post-subject class on the page you linked, but you can automate the button clicks for the login like this:
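from selenium.webdriver.common.by import By
# find_element_by_id was removed in Selenium 4.3; use find_element with By.ID
availbutton = driver.find_element(By.ID, 'buttonAvailability_1')
availbutton.click()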
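from bs4 import BeautifulSoup
from selenium import webdriver
# Start Firefox (requires geckodriver) and load the page in a real browser
driver = webdriver.Firefox()
driver.get('https://trashnothing.com/washington-dc-freecycle?page=1')
# page_source holds the HTML as the browser currently sees it
html = driver.page_source
soup = BeautifulSoup(html, 'lxml')  # the 'lxml' parser needs the lxml package
print(soup.find('span', {'class': 'post-subject'}))
driver.quit()  # close the browser when done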
I had the same issue. I just changed the parser from html.parser to html5lib, and then it worked. Also, it's good practice to use soup.find_all() instead of soup.find() when you expect more than one match: find() returns only the first matching element (or None), while find_all() returns every match.
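As a minimal sketch of the difference between find() and find_all(), the snippet below parses the two example spans from the question as an inline string, so it runs without any network access. The variable names (html, first, subjects, missing) and the class name no-such-class are only illustrative; on the real site the markup would come from requests or driver.page_source instead.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question (assumption: the real page
# contains spans shaped like these once it is fully loaded).
html = """
<span class="post-subject">Set of 20 moving boxes (20009 or 20011)</span>
<span class="post-subject">Firestick/Old xbox games</span>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns only the first matching tag, or None if nothing matches.
first = soup.find("span", {"class": "post-subject"})

# find_all() returns every matching tag as a list-like ResultSet.
all_spans = soup.find_all("span", {"class": "post-subject"})
subjects = [span.get_text(strip=True) for span in all_spans]

# A class that is not in the markup shows the None result from the question.
missing = soup.find("span", {"class": "no-such-class"})

print(first.get_text())
print(subjects)
print(missing)
```

If find() returns None on the downloaded page, as in the question, it means no matching tag exists in the HTML that was actually parsed, so printing part of page.text is a quick way to check what was received.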