I'm able to scrape all the material I need; the problem is that I can't get the data into an Excel file.
Here is my code (everything up to this point works):
def scrape_data(card):
    try:
        title = card.h2.text.strip()
    except AttributeError:
        title = ''
    try:
        price = card.find('span', class_='a-offscreen').text
    except AttributeError:
        price = ''
    data = {'Titulo': title, 'Preco': price}
    return data
def main():
    url = 'https://www.amazon.com.br/s?k=iphone'
    html = get_html(url)
    soup = BeautifulSoup(html, 'lxml')
    cards = soup.find_all('div', {'data-asin': True, 'data-component-type': 's-search-result'})
    ads_data = []
    for card in cards:
        data = scrape_data(card)
        ads_data.append(data)
    write_xlsx(ads_data)
But I'm stuck here: I don't know how to iterate over the dictionaries to fill the Excel file...
def write_xlsx(ads):
    with xlsxwriter.Workbook('results.xlsx') as workbook:
        worksheet = workbook.add_worksheet()
        worksheet.write(0, 0, 'Titulo')
        worksheet.write(0, 1, 'Preco')
        # ads is a list of dicts, so enumerate the list directly
        for i, ad in enumerate(ads, start=1):
            worksheet.write(i, 0, ad['Titulo'])
            worksheet.write(i, 1, ad['Preco'])
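To see the row-building step in isolation: `ads` is a list of dicts, so each spreadsheet row comes from looking up one dict by column name. A minimal sketch with made-up sample data, no xlsxwriter needed:

```python
# Made-up sample data in the same shape as ads_data
ads = [{'Titulo': 'iPhone 13', 'Preco': 'R$ 4.299,00'},
       {'Titulo': 'iPhone 12', 'Preco': 'R$ 3.799,00'}]

header = ['Titulo', 'Preco']
# One row per dict: pull each value by its column key
rows = [[ad[col] for col in header] for ad in ads]
```

Each inner list in `rows` then maps one-to-one onto a worksheet row.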
If your problem is simply iterating over the entries, note that `ads` is a list of dicts, not a single dict, so you can enumerate the list directly:

    for i, ad in enumerate(ads, start=1):
        worksheet.write(i, 0, ad['Titulo'])
        worksheet.write(i, 1, ad['Preco'])

(Your original loop also wrote both values to column 0; the second `write` needs column 1.) I think you are overcomplicating things. :)
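As an alternative to writing cells by hand, pandas can do the whole conversion in two lines, assuming pandas plus an Excel engine (xlsxwriter or openpyxl) is installed; the sample data here is made up:

```python
import pandas as pd

# Made-up sample in the same shape as ads_data
ads_data = [{'Titulo': 'iPhone 13', 'Preco': 'R$ 4.299,00'},
            {'Titulo': 'iPhone 12', 'Preco': ''}]

df = pd.DataFrame(ads_data)  # dict keys become columns, each dict becomes a row
df.to_excel('results.xlsx', index=False)
```

`index=False` drops the DataFrame's row index so the sheet has only the Titulo and Preco columns.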