
Issues with outputting the scraped data to a csv file using python and beautiful soup

I am trying to output the scraped data from a website into a csv file. At first I was running into a UnicodeEncodeError, but after using this piece of code:

if __name__ == "__main__":
    reload(sys)
    sys.setdefaultencoding("utf-8")

I am able to generate the csv; below is the code:

import csv
import urllib2
import sys  
from bs4 import BeautifulSoup
if __name__ == "__main__":
    reload(sys)
    sys.setdefaultencoding("utf-8")
page = urllib2.urlopen('http://www.att.com/shop/wireless/devices/smartphones.html').read()
soup = BeautifulSoup(page)
soup.prettify()
for anchor in soup.findAll('a', {"class": "clickStreamSingleItem"}):
    print anchor['title']
    with open('Smartphones.csv', 'wb') as csvfile:
        spamwriter = csv.writer(csvfile, delimiter=',')
        spamwriter.writerow([anchor['title']])

But I am getting only one device name in the output csv. I don't have any programming background, so please pardon my ignorance. Can you please help me pinpoint the issue?

That's to be expected; you recreate the file from scratch each time you find an element. Open the file only once, before looping over the links, then write a row for each anchor you find:

with open('Smartphones.csv', 'wb') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=',')
    # The file is opened once, so each anchor appends a new row instead of
    # overwriting the previous one
    for anchor in soup.findAll('a', {"class": "clickStreamSingleItem"}):
        print anchor['title']
        spamwriter.writerow([anchor['title'].encode('utf8')])

Opening a file for writing with w clears the file first, and you were doing that for each anchor.
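
To see the effect in isolation, here is a minimal sketch (the demo.csv filename and row values are made up for illustration): each open in write mode truncates the file, so only the last row written survives.

import csv

# Each open() in 'w'/'wb' mode truncates the file, so the first row is lost
with open('demo.csv', 'wb') as f:
    csv.writer(f).writerow(['first'])
with open('demo.csv', 'wb') as f:
    csv.writer(f).writerow(['second'])
# demo.csv now contains a single row: "second"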

As for your unicode error: please avoid, at all costs, changing the default encoding. Instead, encode your rows properly; I did so in the example above, so you can remove the whole .setdefaultencoding() call (and the reload() before it).
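
As a minimal sketch of that pattern (the title string and out.csv filename are hypothetical): BeautifulSoup hands back unicode strings, and the Python 2 csv module expects byte strings, so encode each value to UTF-8 at the point of writing.

# -*- coding: utf-8 -*-
import csv

# Hypothetical unicode value, standing in for anchor['title'] from BeautifulSoup
title = u'Nokia Lumia 920 \u2013 Black'

with open('out.csv', 'wb') as f:
    writer = csv.writer(f)
    # Encode to UTF-8 bytes just before writing;
    # no reload(sys)/sys.setdefaultencoding() needed
    writer.writerow([title.encode('utf8')])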
