I would like to write some code that scrapes data from multiple pages of a job listing site. However, when I run my code I only get the last page instead of the listings from all the pages I scraped.
This is my code:
url = 'https://ng.indeed.com/jobs?q=Business+Intelligence+Analyst&l=Nigeria&start='
for i in range(0,80,10):
    page = requests.get(url+str(i))
    soup = BeautifulSoup(page.text, 'html.parser')
    jobs = []
    for div in soup.find_all(name='div',attrs={'class':'row'}):
        for a in div.find_all(name='a', attrs={'data-tn-element':'jobTitle'}):
            jobs.append(a['title'])
    summaries = []
    divs = soup.findAll('div', attrs={'class':'summary'})
    for d in divs:
        summaries.append(d.text.strip())
jobs = pd.DataFrame(
    {'title': extract_title(soup),
     'summary': extract_summary(soup)
    })
jobs
I use the first for loop to iterate through the pages (page 2 is start=10, page 3 is start=20, etc.). The ideal output is a DataFrame listing the title and summary of every job. However, I only get a DataFrame with the jobs from the last page.
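The page-to-offset mapping described here (page 1 is start=0, page 2 is start=10, and so on) can be sketched on its own, without any network access. This is only an illustration: `page_url`, `BASE_URL` and the `per_page=10` default are hypothetical names and assumptions, and the query values are copied from the URL in the question.

```python
from urllib.parse import urlencode

# Host and path taken from the URL in the question.
BASE_URL = "https://ng.indeed.com/jobs"


def page_url(page_number, per_page=10):
    """Hypothetical helper: build the listing URL for a 1-based page number."""
    params = {
        "q": "Business Intelligence Analyst",
        "l": "Nigeria",
        # Page 1 starts at offset 0, page 2 at 10, page 3 at 20, ...
        "start": (page_number - 1) * per_page,
    }
    # urlencode escapes spaces as '+', matching the query string in the question.
    return f"{BASE_URL}?{urlencode(params)}"


# Eight pages, i.e. the same offsets as range(0, 80, 10).
urls = [page_url(n) for n in range(1, 9)]
```

Each entry in `urls` could then be passed to `requests.get` in place of `url + str(i)`.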
import pandas as pd
import requests
from bs4 import BeautifulSoup

# Create the lists once, before the loop, so that each page appends to them
# instead of replacing them.
summaries = []  # <-- outside of the loop
jobs = []       # <-- outside of the loop
url = 'https://ng.indeed.com/jobs?q=Business+Intelligence+Analyst&l=Nigeria&start='
for i in range(0, 80, 10):
    page = requests.get(url + str(i))
    soup = BeautifulSoup(page.text, 'html.parser')
    for div in soup.find_all(name='div', attrs={'class': 'row'}):
        for a in div.find_all(name='a', attrs={'data-tn-element': 'jobTitle'}):
            jobs.append(a['title'])
    divs = soup.find_all('div', attrs={'class': 'summary'})
    for d in divs:
        summaries.append(d.text.strip())
# Build the DataFrame once, after every page has been scraped.
df = pd.DataFrame({'title': jobs,          # <--- put only jobs here
                   'summary': summaries})  # <--- put only summaries here
print(df)
Prints:
                                              title                                            summary
0      Analyst, Customer Intelligence (Supervisory)  Provide Intelligence To Support Business Plann...
1                     Business Intelligence Analyst  Demonstrable work experience in business intel...
2                    Manager, Business Intelligence  Provide Business Intelligence Services For CEO...
3           MTNN Need Digital Communication Analyst  Work With Individual Units (Corporate Communic...
4   MARKET RESEARCH & BUSINESS INTELLIGENCE OFFICER  Implement the overall analytics and business i...
..                                              ...                                                ...
80                  Research Analyst and Associates  Experience of Business Intelligence tools.\nIn...
81                                Financial Analyst  Perform market research, data mining, business...
82       Oracle E-business Suite Developer (Fusion)  Work Directly with Business user as an oracle ...
83                          Junior Oracle Developer  Work Directly with Business user as an oracle ...
84                 Credit Analyst at CARS45 Limited  High business research skills acumen.\nUnderst...

[85 rows x 2 columns]
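Why moving the list creation out of the loop matters can be shown without any scraping. The sketch below uses made-up data (`fake_pages`, with each inner list standing in for the titles found on one page) and two hypothetical functions, so it illustrates the pattern only and is not the code from the question.

```python
# Hypothetical stand-in for scraped pages: each inner list is one page of titles.
fake_pages = [["job A", "job B"], ["job C"], ["job D", "job E"]]


def collect_resetting(pages):
    """List is re-created on every iteration, so earlier pages are discarded."""
    titles = []
    for page in pages:
        titles = []  # reset: everything appended for previous pages is lost
        titles.extend(page)
    return titles


def collect_accumulating(pages):
    """List is created once before the loop, so every page is kept."""
    titles = []
    for page in pages:
        titles.extend(page)
    return titles


last_only = collect_resetting(fake_pages)      # only the last page survives
all_titles = collect_accumulating(fake_pages)  # titles from all three pages
```

The same reasoning applies to `jobs` and `summaries`: they have to exist before the page loop starts, and the DataFrame should be built only after the loop has finished.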