
Writing scraped data to csv

I have been trying to write the data I scraped into a CSV file. Here is my code:

import requests, bs4, csv, sys
reload(sys)
sys.setdefaultencoding('utf-8')
url = 'http://www.constructeursdefrance.com/resultat/?dpt=01'

res = requests.get(url)
res.raise_for_status()
soup = bs4.BeautifulSoup(res.text,'html.parser')
links = []

for div in soup.select('.link'):
    link = div.a.get('href')
    links.append(link)
for i in links:
    url2 = i
    res2 = requests.get(url2)
    soup2 = bs4.BeautifulSoup(res2.text, 'html.parser')
    for each in soup2.select('li > strong'):
        data = each.text, each.next_sibling
    with open('french.csv', 'wb') as file:
        writer = csv.writer(file)
        writer.writerows(data)

The output shows:

Traceback (most recent call last):
  File "test_new_project.py", line 23, in <module>
    writer.writerows(data)
csv.Error: sequence expected

But I am trying to put tuples into the CSV file, and as far as I know csv accepts tuples and lists. How can I fix this problem?

Change this:

for each in soup2.select('li > strong'):
    data = each.text, each.next_sibling

to this:

data = []
for each in soup2.select('li > strong'):
    data.append((each.text, each.next_sibling))

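Your data variable is a single tuple, not a list of tuples, and it is reassigned on every pass of the loop. The code above builds a list of tuples, one per row, which is what writerows expects.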
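To make the difference concrete, here is a minimal sketch of how writerows treats a single tuple versus a list of tuples. It is written for Python 3 (the code in the question is Python 2), and the helper rows_written and the sample values are made up for illustration; they do not come from the scraped site.

```python
import csv
import io


def rows_written(data):
    """Return the lines that csv.writer.writerows() produces for `data`."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(data)
    return buffer.getvalue().splitlines()


# A single tuple: writerows treats each element as its own row, and a
# string row is split into one field per character.
single_tuple = ("Nom", "Dupont")
as_single = rows_written(single_tuple)

# A list of tuples: each tuple becomes one row with two fields.
list_of_tuples = [("Nom", "Dupont"), ("Ville", "Lyon")]
as_list = rows_written(list_of_tuples)

# An element that is not iterable at all (here None) cannot be a row,
# so the writer raises csv.Error.
try:
    rows_written(("Nom", None))
    raised_error = False
except csv.Error:
    raised_error = True
```

In Python 3 the error text differs from the "sequence expected" message in the question, but the cause is the same: writerows was handed something that is not a collection of rows.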

Another solution is this (note the indentation):

data = []
for i in links:
    url2 = i
    res2 = requests.get(url2)
    soup2 = bs4.BeautifulSoup(res2.text, 'html.parser')
    for each in soup2.select('li > strong'):
        data.append((each.text, each.next_sibling))
with open('french.csv', 'wb') as file:
    writer = csv.writer(file)
    writer.writerows(data)

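Atirag is correct, but you have another problem: the with statement that opens the output file is nested inside the for loop. So if there is more than one link, the file is overwritten on every iteration, and the output will not be what you expect. I think this should generate the output you want: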

for div in soup.select('.link'):
    link = div.a.get('href')
    links.append(link)

with open("french.csv", "w") as file:
    writer = csv.writer(file)
    for i in links:
        res2 = requests.get(i)
        soup2 = bs4.BeautifulSoup(res2.text, 'html.parser')
        for each in soup2.select('li > strong'):
            writer.writerow([each.text, each.next_sibling])
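The overwriting problem can be reproduced without any network access. The sketch below is Python 3 and uses a hypothetical pages list in place of the rows scraped from each link; the file names and the read_rows helper are likewise only for illustration. It passes newline="" to open, which the Python 3 csv documentation recommends for CSV files.

```python
import csv
import os
import tempfile

# Hypothetical stand-in for the rows scraped from each linked page.
pages = [
    [("Nom", "Dupont")],
    [("Nom", "Martin")],
]


def read_rows(path):
    """Read a CSV file back as a list of rows."""
    with open(path, newline="") as f:
        return list(csv.reader(f))


with tempfile.TemporaryDirectory() as tmp:
    # Opening inside the loop: mode "w" truncates the file on every pass,
    # so only the rows from the last page survive.
    overwritten = os.path.join(tmp, "overwritten.csv")
    for rows in pages:
        with open(overwritten, "w", newline="") as f:
            csv.writer(f).writerows(rows)
    rows_when_reopened = read_rows(overwritten)

    # Opening once, outside the loop: the rows from every page are kept.
    complete = os.path.join(tmp, "complete.csv")
    with open(complete, "w", newline="") as f:
        writer = csv.writer(f)
        for rows in pages:
            writer.writerows(rows)
    rows_when_opened_once = read_rows(complete)
```

Opening the file once and writing inside the loop, as in the answer above, keeps every row; reopening in "w" mode on each iteration keeps only the last page.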

