Writing scraped data to csv
I have been trying to write my scraped data to a CSV file. Here is my code:

import requests, bs4, csv, sys
reload(sys)
sys.setdefaultencoding('utf-8')

url = 'http://www.constructeursdefrance.com/resultat/?dpt=01'
res = requests.get(url)
res.raise_for_status()
soup = bs4.BeautifulSoup(res.text, 'html.parser')
links = []
for div in soup.select('.link'):
    link = div.a.get('href')
    links.append(link)

for i in links:
    url2 = i
    res2 = requests.get(url2)
    soup2 = bs4.BeautifulSoup(res2.text, 'html.parser')
    for each in soup2.select('li > strong'):
        data = each.text, each.next_sibling
        with open('french.csv', 'wb') as file:
            writer = csv.writer(file)
            writer.writerows(data)
The output shows:

Traceback (most recent call last):
  File "test_new_project.py", line 23, in <module>
    writer.writerows(data)
csv.Error: sequence expected
But I am trying to put tuples into the CSV file, and as far as I know, csv accepts both tuples and lists. How can I fix this?
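Before looking at the answers, it helps to see what csv.writer actually expects. A minimal sketch (no scraping, using stand-in values instead of the scraped text): writerow() takes one row, a tuple or list of cells, while writerows() takes a sequence of such rows. A bare tuple of strings is the wrong shape for writerows(), which is what triggers the error above.

```python
import csv, io

# Write into an in-memory buffer instead of a file, so the
# behavior is easy to inspect.
buf = io.StringIO()
writer = csv.writer(buf)

# writerow(): ONE row, given as a tuple (or list) of cell values.
writer.writerow(("nom", "Dupont"))

# writerows(): a SEQUENCE of such rows -- here a list of tuples.
rows = [("ville", "Lyon"), ("dpt", "01")]
writer.writerows(rows)

# Three rows total, one per tuple.
print(buf.getvalue())
```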
Change this:

for each in soup2.select('li > strong'):
    data = each.text, each.next_sibling

to this:

data = []
for each in soup2.select('li > strong'):
    data.append((each.text, each.next_sibling))

Your data variable is a single tuple, not a list of tuples. The code above builds a list of tuples.
Another solution is this (note the indentation):

data = []
for i in links:
    url2 = i
    res2 = requests.get(url2)
    soup2 = bs4.BeautifulSoup(res2.text, 'html.parser')
    for each in soup2.select('li > strong'):
        data.append((each.text, each.next_sibling))

with open('french.csv', 'wb') as file:
    writer = csv.writer(file)
    writer.writerows(data)
Atirag is correct, but you have another problem: the with call that opens the output file is nested inside the for loop. So if there is more than one link, the file is overwritten on every iteration and the output will not be what you expect. I think this should generate the output you want:

for div in soup.select('.link'):
    link = div.a.get('href')
    links.append(link)

with open("french.csv", "w") as file:
    writer = csv.writer(file)
    for i in links:
        res2 = requests.get(i)
        soup2 = bs4.BeautifulSoup(res2.text, 'html.parser')
        for each in soup2.select('li > strong'):
            writer.writerow([each.text, each.next_sibling])
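One caveat worth adding: the question's code opens the file with 'wb', which is Python 2 style (as is reload(sys)/sys.setdefaultencoding). Under Python 3, csv.writer needs a text-mode file opened with newline='' so the writer controls line endings itself; 'wb' would raise a TypeError. A minimal sketch of the write step alone, with stand-in rows in place of the scraped data:

```python
import csv

# Stand-in for the (each.text, each.next_sibling) pairs scraped above.
rows = [("nom", "Dupont"), ("ville", "Lyon")]

# Python 3: text mode, newline='' so csv controls row terminators,
# and an explicit encoding instead of sys.setdefaultencoding.
with open("french.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerows(rows)
```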