Fetching data in CSV files using Pandas

Question

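I am trying to fetch historical stock data for the Nifty 50 companies from the website and convert it to CSV. I need to update it daily. Is there any way to append the current day's data to the existing CSV files without downloading everything again and again? My code is as follows: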

import os
import csv
import urllib.request as urllib
import datetime as dt
import pandas as pd
import pandas_datareader.data as web
import nsepy as nse

def saveNiftySymbols():
    url = "https://www.nseindia.com/content/indices/ind_nifty50list.csv"
    # pretend to be a chrome 47 browser on a windows 10 machine
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"}
    req = urllib.Request(url, headers = headers)
    # open the url
    x = urllib.urlopen(req)
    sourceCode = x.read().decode('utf-8') 

    cr = csv.DictReader(sourceCode.splitlines())
    l = [row['Symbol'] for row in cr]
    return l

def fetchDataFromNse(l):
    if not os.path.exists('stock_dfs'):
        os.makedirs('stock_dfs')

    start = dt.datetime(2000, 1, 1)
    end = dt.datetime.today()

    for symbol in l:
        if not os.path.exists('stock_dfs/{}.csv'.format(symbol)):
            df=nse.get_history(symbol,start, end)
            df.to_csv('stock_dfs/{}.csv'.format(symbol))
        else:
            print('Already have {}'.format(symbol))

fetchDataFromNse(saveNiftySymbols())

Answer 1

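Try this after market close, because NSE is notorious for adding dates that have no data.
This only works if you have already stored the data for NSE's symbols. It does not account for any change in the constituents, which means that when NSE changes the constituents you have to download everything again in one go.

Try this: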

def saveNiftySymbols():
    url = "https://www.nseindia.com/content/indices/ind_nifty50list.csv"
    # pretend to be a chrome 47 browser on a windows 10 machine
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"}
    # with the question's `import urllib.request as urllib`, Request and urlopen are attributes of `urllib` itself
    req = urllib.Request(url, headers = headers)
    url_req = urllib.urlopen(req)
    ## Use pandas here. Much more reliable
    table = pd.read_csv(url_req)
    return table.Symbol

def fetchDataFromNse(symbols):
    if not os.path.exists('stock_dfs'):
        os.makedirs('stock_dfs')

    ## given that you have already stored the data, just have the end date
    start = dt.date(2000,3,31)
    end = dt.date.today()
    for symbol in symbols:
        ## you can also convert it to a list if you want.
        df = nse.get_history(symbol, end, end)
        ## header=False: the header row is already in the stored file
        data_to_append = df.to_csv(header=False)
        with open('stock_dfs/{}.csv'.format(symbol), 'a') as current_csv:
            current_csv.write(data_to_append)



fetchDataFromNse(saveNiftySymbols())
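
One way to cope with a change in constituents is to split the symbol list into symbols that already have a stored CSV and symbols that do not, so that only the latter need a full download. A minimal sketch, assuming the `stock_dfs/<symbol>.csv` layout used above; `partition_symbols` is a hypothetical helper name, not part of nsepy or pandas:

```python
import os


def partition_symbols(symbols, folder="stock_dfs"):
    """Split symbols into (stored, missing) by whether <folder>/<symbol>.csv exists."""
    stored, missing = [], []
    for symbol in symbols:
        path = os.path.join(folder, "{}.csv".format(symbol))
        # A symbol counts as stored only if its CSV file is already on disk.
        (stored if os.path.exists(path) else missing).append(symbol)
    return stored, missing
```

Symbols in `missing` would get the full-history download from the question's code, and symbols in `stored` would get the one-day append.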

Answer 2

Yes, add this code before the two functions and call it when fetching the data from NSE:

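import glob

def get_last_date():
    all_files = glob.glob('stock_dfs/*.csv')
    ## read the last stored date from the first CSV; assumes all files were updated together
    with open(all_files[0], 'r') as first_csv:
        reader = csv.DictReader(first_csv)
        last_date_str = list(reader)[-1]['Date']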
    fmt = '%Y-%m-%d'
    last_date_dt = dt.datetime.strptime(last_date_str, fmt).date()
    return last_date_dt

def fetchDataFromNse(symbols):
    if not os.path.exists('stock_dfs'):
        os.makedirs('stock_dfs')

    ## given that you have already stored the data, just have the end date
    start = dt.date(2000,3,31)

    ### addition here: start the day after the last stored date so that row is not appended twice
    new_start = get_last_date() + dt.timedelta(days=1)
    end = dt.date.today()
    for symbol in symbols:
        ## you can also convert it to a list if you want.
        df = nse.get_history(symbol, new_start, end)
        ## header=False: the header row is already in the stored file
        data_to_append = df.to_csv(header=False)
        with open('stock_dfs/{}.csv'.format(symbol), 'a') as current_csv:
            current_csv.write(data_to_append)

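Keep in mind that this will work reliably only if the constituents are the same and are updated together. The second condition is met, but you will have to modify the code to handle the first.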
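
To relax the requirement that all files are updated together, the last stored date can be read per file rather than from the first file only. The following is a rough sketch, assuming each CSV has a `Date` column in `%Y-%m-%d` format as in the code above; `next_start_date` is a hypothetical helper, not part of nsepy:

```python
import csv
import datetime as dt


def next_start_date(csv_path, fmt="%Y-%m-%d"):
    """Return the day after the last 'Date' in csv_path, or None if it has no rows."""
    last_date_str = None
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            # Skip blank or partial rows that carry no date.
            if row.get("Date"):
                last_date_str = row["Date"]
    if last_date_str is None:
        return None
    last_date = dt.datetime.strptime(last_date_str, fmt).date()
    return last_date + dt.timedelta(days=1)
```

Inside the loop, `next_start_date('stock_dfs/{}.csv'.format(symbol))` could take the place of the shared `get_last_date()` value. When the result is later than `end`, the file is already up to date and the download can be skipped.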
