Trying to create a web scraper using selenium and beautiful soup
I'm trying to build a web scraper for a friend that parses Indeed.com to help with recruiting. Basically, he wants to scrape the pages and create a csv file of all the job listings, including the job title, company, and job description. I'm currently stuck; this is what I have:
from selenium import webdriver
from bs4 import BeautifulSoup
import requests
import time

driver = webdriver.Chrome()
driver.get('https://indeed.com')
searchBox = driver.find_element_by_xpath('//*[@id="text-input-what"]')
searchBox.send_keys('Social Media Marketing')
searchButton = driver.find_element_by_xpath('//*[@id="whatWhereFormId"]/div[3]/button')
searchButton.click()
time.sleep(5)
popUpButton = driver.find_element_by_xpath('//*[@id="popover-x"]/button/svg')
popUpButton.click()
print(driver.current_url)
r = requests.get(driver.current_url)
print(r.text[0:10500])
soup = BeautifulSoup(r.text, 'html.parser')
results = soup.find_all('span', attrs={'class': 'company'})
Any suggestions on how I should approach this?
To get the job title, company, and description, you can use this example:
import requests
from bs4 import BeautifulSoup

url = 'https://www.indeed.com/jobs'
params = {
    'q': 'social media marketing',
    'l': ''
}

soup = BeautifulSoup(requests.get(url, params=params).content, 'html.parser')

for title in soup.select('h2.title a'):
    print(title.get_text(strip=True, separator=' '))
    print(title.find_next('span', class_='company').get_text(strip=True))
    print(title.find_next('div', class_='summary').get_text(strip=True, separator=' '))
    print('-' * 80)
Prints:
Social Media Assistant
iHeartMedia, Inc.
Track social media influence measurements. Minimum of one-year experience with social media or digital marketing ; experience in entertainment or music space…
--------------------------------------------------------------------------------
Social Media Coordinator (Entertainment)
Valnet Freelance
Good understanding and experience of social media management. As a Social Media Coordinator, you'll be responsible for sparking and maintaining discussions with…
--------------------------------------------------------------------------------
Social Media / Marketing Coordinator
Five Star Ford Lewisville
Passion for social media and marketing. Developing engagement through all social media avenues. Stay up-to-date with current technologies and trends in social…
--------------------------------------------------------------------------------
...and so on.
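Since the end goal was a csv file, the printed fields can be collected into rows and written out with Python's built-in csv module. Here is a minimal sketch using the same h2.title / span.company / div.summary selectors as above; the inline html string is a hypothetical stand-in for the real `requests.get(url, params=params).content` response, since Indeed's markup may differ or change:

```python
import csv
from bs4 import BeautifulSoup

# Stand-in for the real page content; the selectors below assume
# the same structure as the answer above.
html = """
<div class="jobsearch-SerpJobCard">
  <h2 class="title"><a href="/job1">Social Media Assistant</a></h2>
  <span class="company">iHeartMedia, Inc.</span>
  <div class="summary">Track social media influence measurements.</div>
</div>
<div class="jobsearch-SerpJobCard">
  <h2 class="title"><a href="/job2">Social Media Coordinator</a></h2>
  <span class="company">Valnet Freelance</span>
  <div class="summary">Good understanding of social media management.</div>
</div>
"""

soup = BeautifulSoup(html, 'html.parser')

# Collect one [title, company, summary] row per listing.
rows = []
for title in soup.select('h2.title a'):
    rows.append([
        title.get_text(strip=True, separator=' '),
        title.find_next('span', class_='company').get_text(strip=True),
        title.find_next('div', class_='summary').get_text(strip=True, separator=' '),
    ])

# Write the rows with a header line.
with open('jobs.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['title', 'company', 'summary'])
    writer.writerows(rows)
```

The same loop structure as the answer applies; the only change is appending to a list instead of printing, then handing the list to `csv.writer`.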