Scraped data to CSV file with BeautifulSoup
As the title says, I'm scraping data from a website with a BeautifulSoup scraper and it works fine, but when I try to write the data to a CSV file it only saves 1 row instead of the roughly 500 results the scraper finds. Here is my code:
    #from selenium.webdriver.common.keys import Keys
    from selenium import webdriver
    from bs4 import BeautifulSoup
    import csv

    #launch url
    url = "https://www.canlii.org/en/#search/type=decision&jId=bc,ab,sk,mb,on,qc,nb,ns,pe,nl,yk,nt,nu&startDate=1990-01-01&endDate=1992-01-14&text=non-pecuniary%20award%20&resultIndex=1"

    # create a new Chrome session
    driver = webdriver.Chrome(r'C:\Program Files (x86)\Microsoft Visual Studio\Shared\Anaconda3_64\lib\site-packages\selenium\webdriver\common\chromedriver.exe')
    driver.implicitly_wait(30)
    driver.get(url)

    #Selenium hands the page source to Beautiful Soup
    soup = BeautifulSoup(driver.page_source, 'lxml')

    csv_file = open('test.csv', 'w')
    csv_writer = csv.writer(csv_file, quoting=csv.QUOTE_ALL)
    csv_writer.writerow(['Reference', 'case', 'link', 'province', 'keywords', 'snippets'])
    #Scrape all
    for scrape in soup.find_all('li', class_='result '):
        print(scrape.text)
        #Reference Index
        Reference = scrape.find('span', class_='reference')
        print(Reference.text)
        #Case Name Index
        case = scrape.find('span', class_='name')
        print(case.text)
        #Canlii Keywords Index
        keywords = scrape.find('div', class_='keywords')
        print(keywords.text)
        #Province Index
        province = scrape.find('div', class_='context')
        print(province.text)
        #snippet Index
        snippet = scrape.find('div', class_='snippet')
        print(snippet.text)
        # Extracting URLs from the attribute href in the <a> tags.
        link = scrape.find('a', href=True)
        print(link)

    csv_writer.writerow([Reference.text, case.text, link.href, province.text, keywords.text, snippet.text])
    csv_file.close()
Your csv_writer.writerow() call is outside your for loop, so it runs only once, with the values from the last result. Try indenting it into the loop and see if that works.
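A minimal sketch of the fix, with hypothetical placeholder rows standing in for the scraped results (only the indentation of writerow matters here):

```python
import csv
import io

# Placeholder data: in the real script each row comes from one <li class="result ">.
rows = [
    ['1990 CanLII 123', 'Smith v. Jones', 'https://example.org/a', 'BC', 'damages', '...'],
    ['1991 CanLII 456', 'Doe v. Roe', 'https://example.org/b', 'ON', 'award', '...'],
]

buf = io.StringIO()  # stands in for open('test.csv', 'w')
csv_writer = csv.writer(buf, quoting=csv.QUOTE_ALL)
csv_writer.writerow(['Reference', 'case', 'link', 'province', 'keywords', 'snippets'])

for row in rows:
    # Indented inside the loop: one CSV row per result,
    # instead of a single row written after the loop ends.
    csv_writer.writerow(row)

print(len(buf.getvalue().splitlines()))  # header + one line per result → 3
```

Separately, note that `link.href` will not give you the URL on a BeautifulSoup Tag; use `link['href']` to read the attribute.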