
How can I convert a list of crawled data into Excel columns?

import openpyxl
xl_file = openpyxl.Workbook()
xl_sheet =xl_file.active

from urllib.request import urlopen
from bs4 import BeautifulSoup

stockItem = '028300'

url = 'http://finance.naver.com/item/sise_day.nhn?code='+ stockItem
html = urlopen(url) 
source = BeautifulSoup(html.read(), "html.parser")

maxPage=source.find_all("table",align="center")
mp = maxPage[0].find_all("td",class_="pgRR")
mpNum = int(mp[0].a.get('href')[-3:])

for page in range(1, 10):
   print (str(page) )
   url = 'http://finance.naver.com/item/sise_day.nhn?code=' + stockItem +'&page='+ str(page)
   html = urlopen(url)
   source = BeautifulSoup(html.read(), "html.parser")
   srlists=source.find_all("tr")
   isCheckNone = None

   if((page % 1) == 0):
      time.sleep(0)

   for i in range(1,len(srlists)-1):
      if(srlists[i].span != isCheckNone):

          srlists[i].td.text
          data1 = srlists[i].find_all("td",align="center")
          data2 = srlists[i].find_all("td",class_="num") 
          print(srlists[i].find_all("td",align="center")[0].text, srlists[i].find_all("td",class_="num")[0].text )

          for item in data1:
             xl_sheet.append([item.get_text()])

This is my code for crawling stock data from the site. The crawling itself works, but I couldn't save the data into an Excel file: when I tried, the file contained only the date column without the price data. How can I write the results to an Excel file?

There were two things you missed: 1) a mistake in the imports (`time` is used but never imported), and 2) `data2`, which contains the prices, was never appended to the worksheet.

Here is the corrected code, which gives the desired output. Just change the path at the end to wherever you want the Excel file saved.

import time  # this import was missing in the original code
from openpyxl import Workbook
xl_file = Workbook()
xl_sheet = xl_file.active

from urllib.request import urlopen
from bs4 import BeautifulSoup
stockItem = '028300'

url = 'http://finance.naver.com/item/sise_day.nhn?code='+ stockItem
html = urlopen(url) 
source = BeautifulSoup(html.read(), "html.parser")

maxPage=source.find_all("table",align="center")
mp = maxPage[0].find_all("td",class_="pgRR")
mpNum = int(mp[0].a.get('href')[-3:])

for page in range(1, 10):  # first 9 pages; use range(1, mpNum + 1) to crawl them all
   print (str(page) )
   url = 'http://finance.naver.com/item/sise_day.nhn?code=' + stockItem +'&page='+ str(page)
   html = urlopen(url)
   source = BeautifulSoup(html.read(), "html.parser")
   srlists=source.find_all("tr")
   isCheckNone = None

   time.sleep(0)  # increase this delay if the site starts rejecting rapid requests

   for i in range(1,len(srlists)-1):
      if(srlists[i].span != isCheckNone):

          data1 = srlists[i].find_all("td",align="center")
          data2 = srlists[i].find_all("td",class_="num") 
          #print(srlists[i].find_all("td",align="center")[0].text, srlists[i].find_all("td",class_="num")[0].text )

          for item1, item2 in zip(data1, data2):
              xl_sheet.append([item1.get_text(), item2.get_text()])  # date and price in one row


print(xl_sheet)
xl_file.save(r'C:\Users\Asus\Desktop\vi.xlsx')
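One side note: `get_text()` returns plain strings, so the prices land in Excel as text (e.g. "12,345"). If you want real numbers in the spreadsheet, you can strip the thousands separators and convert before appending. A minimal sketch (the helper name `to_number` is just an illustration, not part of the original code):

```python
def to_number(text):
    """Convert scraped cell text like '12,345' to an int;
    leave non-numeric text (e.g. the date column) unchanged."""
    cleaned = text.strip().replace(',', '')
    try:
        return int(cleaned)
    except ValueError:
        return cleaned

print(to_number('12,345'))      # -> 12345
print(to_number('2019.05.10'))  # dates stay as strings
```

You would then append rows with `xl_sheet.append([item1.get_text(), to_number(item2.get_text())])`.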

Suggestion: you can use the yfinance package for Python to download stock data easily; see https://pypi.org/project/yfinance/
