Fetch Data in CSV files using pandas
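I am trying to fetch historical stock data for the Nifty 50 companies from a website and convert it to CSV. I need to update the same files daily. Is there a way to append the current day's data to the existing CSV files without downloading everything again and again? My code looks like this: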
import os
import csv
import urllib.request as urllib
import datetime as dt
import pandas as pd
import pandas_datareader.data as web
import nsepy as nse

def saveNiftySymbols():
    url = "https://www.nseindia.com/content/indices/ind_nifty50list.csv"
    # pretend to be a chrome 47 browser on a windows 10 machine
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"}
    req = urllib.Request(url, headers = headers)
    # open the url
    x = urllib.urlopen(req)
    sourceCode = x.read().decode('utf-8')
    cr = csv.DictReader(sourceCode.splitlines())
    l = [row['Symbol'] for row in cr]
    return l

def fetchDataFromNse(l):
    if not os.path.exists('stock_dfs'):
        os.makedirs('stock_dfs')
    start = dt.datetime(2000, 1, 1)
    end = dt.datetime.today()
    for symbol in l:
        if not os.path.exists('stock_dfs/{}.csv'.format(symbol)):
            df=nse.get_history(symbol,start, end)
            df.to_csv('stock_dfs/{}.csv'.format(symbol))
        else:
            print('Already have {}'.format(symbol))

fetchDataFromNse(saveNiftySymbols())
Try this:
def saveNiftySymbols():
    url = "https://www.nseindia.com/content/indices/ind_nifty50list.csv"
    # pretend to be a chrome 47 browser on a windows 10 machine
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"}
    # `urllib` is the alias from the question: import urllib.request as urllib
    req = urllib.Request(url, headers=headers)
    url_req = urllib.urlopen(req)
    ## Use pandas here. Much more reliable
    table = pd.read_csv(url_req)
    return table.Symbol

def fetchDataFromNse(symbols):
    if not os.path.exists('stock_dfs'):
        os.makedirs('stock_dfs')
    ## given that you have already stored the data, just have the end date
    end = dt.date.today()
    for symbol in symbols:
        ## you can also convert it to a list if you want.
        df = nse.get_history(symbol, end, end)
        # no header row, because the existing file already has one
        data_to_append = df.to_csv(header=False)
        # 'a' appends to the existing file instead of overwriting it
        with open('stock_dfs/{}.csv'.format(symbol), 'a') as current_csv:
            current_csv.write(data_to_append)

fetchDataFromNse(saveNiftySymbols())
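The open/write/close sequence above can also be done with pandas alone, because `DataFrame.to_csv` accepts `mode='a'`. The following is only a minimal sketch: the helper name `append_rows`, the file name `EXAMPLE.csv`, and the sample prices are made up for illustration, and real rows would come from `nse.get_history`.

```python
import os
import tempfile

import pandas as pd


def append_rows(df, path):
    """Append df to the CSV at path, writing the header only for a new file."""
    is_new_file = not os.path.exists(path)
    df.to_csv(path, mode='a', header=is_new_file)


# Illustrative data only; these are not real quotes.
history = pd.DataFrame(
    {'Close': [100.0, 101.5]},
    index=pd.Index(['2018-01-01', '2018-01-02'], name='Date'),
)
today = pd.DataFrame(
    {'Close': [102.25]},
    index=pd.Index(['2018-01-03'], name='Date'),
)

csv_path = os.path.join(tempfile.mkdtemp(), 'EXAMPLE.csv')
append_rows(history, csv_path)  # first call creates the file with a header
append_rows(today, csv_path)    # later calls only add rows

combined = pd.read_csv(csv_path, index_col='Date')
print(combined)
```

Writing the header only when the file does not exist yet lets the same helper handle both the first full download and the daily update.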
Yes, add this code before the two functions and call it when fetching the data from NSE:
import glob

def get_last_date():
    all_files = glob.glob('stock_dfs/*.csv')
    # all files are assumed to end on the same date, so the first one is enough
    with open(all_files[0], 'r') as first_csv:
        reader = csv.DictReader(first_csv)
        last_date_str = list(reader)[-1]['Date']
    fmt = '%Y-%m-%d'
    last_date_dt = dt.datetime.strptime(last_date_str, fmt).date()
    return last_date_dt

def fetchDataFromNse(symbols):
    if not os.path.exists('stock_dfs'):
        os.makedirs('stock_dfs')
    ## given that you have already stored the data, just have the end date
    ### addition here
    # resume on the day after the last stored row (assuming get_history
    # includes the start date) so that row is not written twice
    new_start = get_last_date() + dt.timedelta(days=1)
    end = dt.date.today()
    for symbol in symbols:
        ## you can also convert it to a list if you want.
        df = nse.get_history(symbol, new_start, end)
        data_to_append = df.to_csv(header=False)
        with open('stock_dfs/{}.csv'.format(symbol), 'a') as current_csv:
            current_csv.write(data_to_append)
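Keep in mind that this works reliably only if the index constituents stay the same and all files are updated together. The second condition is met, but you will have to modify the code to handle the first.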
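One possible way to handle a changing constituent list is to work out the start date per symbol instead of reading it from a single file. This is a rough sketch, not part of the original answer: the helper names `last_date_in` and `start_date_for` are hypothetical, and it assumes each CSV has a `Date` column in `%Y-%m-%d` format, as in the code above.

```python
import csv
import datetime as dt
import os


def last_date_in(path, fmt='%Y-%m-%d'):
    """Return the date in the final row of one CSV, or None if it has no rows."""
    with open(path, newline='') as f:
        rows = list(csv.DictReader(f))
    if not rows:
        return None
    return dt.datetime.strptime(rows[-1]['Date'], fmt).date()


def start_date_for(symbol, folder='stock_dfs', default=dt.date(2000, 1, 1)):
    """Pick the first date that is still missing for this symbol."""
    path = os.path.join(folder, '{}.csv'.format(symbol))
    if not os.path.exists(path):
        return default  # new constituent: fetch the full history
    last = last_date_in(path)
    if last is None:
        return default  # file exists but holds no data rows
    return last + dt.timedelta(days=1)
```

Inside the loop, `start_date_for(symbol)` would then take the place of `new_start`, and a symbol without an existing file would need its header written on the first save.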