Need to write scraped data into a CSV file (threading)
Here is my code:
from download1 import download
import threading, lxml.html
def getInfo(initial, ending):
    for Number in range(initial, ending):
        Fields = ['country', 'area', 'population', 'iso', 'capital', 'continent', 'tld', 'currency_code',
                  'currency_name', 'phone',
                  'postal_code_format', 'postal_code_regex', 'languages', 'neighbours']
        url = 'http://example.webscraping.com/places/default/view/%d' % Number
        html = download(url)
        tree = lxml.html.fromstring(html)
        results = []
        for field in Fields:
            x = tree.cssselect('table > tr#places_%s__row >td.w2p_fw' % field)[0].text_content()
            results.append(x)  # should i start writing here?
downloadthreads = []
for i in range(1, 252, 63):  # create 4 threads
    downloadThread = threading.Thread(target=getInfo, args=(i, i + 62))
    downloadthreads.append(downloadThread)
    downloadThread.start()
for threadobj in downloadthreads:
    threadobj.join()  # end of each thread
print "Done"
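So results will hold the values for Fields. I need to write Fields as the first row (only once) and then write the values of results to the CSV file. I'm not sure whether I can open the file inside the function, because the threads would open the file several times at the same time.
Note: I know threading is not recommended when scraping, but I'm just testing.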
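One possible way to handle this, sketched here under assumptions rather than as a definitive solution: open the file once in the main thread, write the header row once before any thread starts, and let every thread share a single csv writer guarded by a threading.Lock. The sketch is written for Python 3 (the code above is Python 2). The names fetch_row and scrape_to_csv are hypothetical, FIELDS is shortened for illustration, and fetch_row is a stand-in for the download-and-parse step so that the example runs without network access.

```python
import csv
import threading

# Shortened field list, for illustration only.
FIELDS = ['country', 'area', 'population']

def fetch_row(number):
    # Hypothetical stand-in for download(url) plus the cssselect parsing.
    return ['%s-%d' % (field, number) for field in FIELDS]

def get_info(initial, ending, writer, lock):
    for number in range(initial, ending):
        row = fetch_row(number)
        # Only one thread at a time may use the shared writer.
        with lock:
            writer.writerow(row)

def scrape_to_csv(out_file, first=1, last=252, threads=4):
    writer = csv.writer(out_file)
    # Header is written once, before any worker thread exists.
    writer.writerow(FIELDS)
    lock = threading.Lock()
    # Ceiling division so the chunks cover every page from first to last.
    step = (last - first + 1 + threads - 1) // threads
    workers = []
    for start in range(first, last + 1, step):
        end = min(start + step, last + 1)  # range() excludes its end value
        t = threading.Thread(target=get_info, args=(start, end, writer, lock))
        workers.append(t)
        t.start()
    for t in workers:
        t.join()
```

It could be called with a file opened once in the main thread, for example `with open('countries.csv', 'w', newline='') as f: scrape_to_csv(f)`. Rows from different threads are interleaved, so the order of the data rows is not guaranteed; only the header is certain to come first.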