
Trying to create a web scraper using Selenium and Beautiful Soup

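I'm trying to build a web scraper for a friend that parses Indeed.com to help with recruiting. Basically, he wants to scrape the page and create a CSV file containing all the job listings, including the job title, company, and job description. I'm currently stuck; this is what I have so far: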

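import time

import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://indeed.com')

# Type the query into the "what" search box and submit the form.
# (find_element_by_xpath was removed in Selenium 4; use find_element with By.XPATH.)
searchBox = driver.find_element(By.XPATH, '//*[@id="text-input-what"]')
searchBox.send_keys('Social Media Marketing')

searchButton = driver.find_element(By.XPATH, '//*[@id="whatWhereFormId"]/div[3]/button')
searchButton.click()

# Crude fixed wait for the results page and its pop-up to load.
time.sleep(5)

# Close the pop-up that appears over the results.
popUpButton = driver.find_element(By.XPATH, '//*[@id="popover-x"]/button/svg')
popUpButton.click()

print(driver.current_url)

# Fetch the results URL again with requests and parse the response.
r = requests.get(driver.current_url)
print(r.text[0:10500])

soup = BeautifulSoup(r.text, 'html.parser')

results = soup.find_all('span', attrs={'class': 'company'})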

Any suggestions on how I should approach this?

To get the job title, company, and description, you can use this example:

import requests
from bs4 import BeautifulSoup


# The search is a plain GET request: 'q' is the query, 'l' is the location.
url = 'https://www.indeed.com/jobs'
params = {
  'q': 'social media marketing',
  'l': ''
}

soup = BeautifulSoup(requests.get(url, params=params).content, 'html.parser')

# Each title link sits inside <h2 class="title">; the company and summary
# are the next matching elements that follow it in the document.
for title in soup.select('h2.title a'):
    print(title.get_text(strip=True, separator=' '))
    print(title.find_next('span', class_='company').get_text(strip=True))
    print(title.find_next('div', class_='summary').get_text(strip=True, separator=' '))
    print('-' * 80)

Prints:

Social Media Assistant
iHeartMedia, Inc.
Track social media influence measurements. Minimum of one-year experience with social media or digital marketing ; experience in entertainment or music space…
--------------------------------------------------------------------------------
Social Media Coordinator (Entertainment)
Valnet Freelance
Good understanding and experience of social media management. As a Social Media Coordinator, you'll be responsible for sparking and maintaining discussions with…
--------------------------------------------------------------------------------
Social Media / Marketing Coordinator
Five Star Ford Lewisville
Passion for social media and marketing. Developing engagement through all social media avenues. Stay up-to-date with current technologies and trends in social…
--------------------------------------------------------------------------------

...and so on.
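
The question asks for a CSV file, while the loop above only prints. A minimal sketch of the missing step, using only the standard library: collect each listing as a (title, company, description) tuple and hand the rows to csv.writer. The helper name write_jobs_csv, the column names, and the sample rows are assumptions for illustration; the sample values are taken from the printed output above.

```python
import csv
import io

# Column names are an assumption; rename them to suit the spreadsheet.
FIELDS = ["title", "company", "description"]


def write_jobs_csv(jobs, fileobj):
    """Write (title, company, description) tuples to fileobj as CSV.

    Returns the number of data rows written (the header is not counted).
    """
    writer = csv.writer(fileobj)
    writer.writerow(FIELDS)
    count = 0
    for title, company, description in jobs:
        writer.writerow([title, company, description])
        count += 1
    return count


# Sample rows standing in for what the scraping loop would collect.
sample_jobs = [
    ("Social Media Assistant", "iHeartMedia, Inc.",
     "Track social media influence measurements."),
    ("Social Media Coordinator (Entertainment)", "Valnet Freelance",
     "Good understanding and experience of social media management."),
]

# An in-memory buffer keeps the example self-contained; a real file works the same way.
buffer = io.StringIO()
written = write_jobs_csv(sample_jobs, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To use it with the scraper, append a tuple to a list inside the for loop instead of printing, then open the output file with open('jobs.csv', 'w', newline='', encoding='utf-8') and pass the file object to write_jobs_csv (the file name jobs.csv is hypothetical). Passing newline='' is what the csv module documentation recommends so that row endings are not doubled on Windows, and csv.writer quotes fields such as "iHeartMedia, Inc." that contain commas.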
