Reading a CSV file to multiple NumPy arrays in Python
I am trying to import a .csv file containing various stock prices into a Python script inside a getData() function, but I am running into an indexing error and can't see how to resolve it.
I am new to both CSV and NumPy, so I am unsure exactly where the problem is. When I run this code I get the following:
  File "../StockPlot.py", line 20, in getData
    date[i-1] = data[0]
IndexError: index 0 is out of bounds for axis 0 with size 0
import numpy as np
import matplotlib.pyplot as plt
import csv

def getData():
    date = np.array([])
    openPrice = np.array([])
    closePrice = np.array([])
    volume = np.array([])
    i = 1
    with open('aapl.csv', 'rb') as f:
        reader = csv.reader(open('aapl.csv'))
        data_as_list = list(reader)
        items = len(data_as_list)
        while i < items:
            data = data_as_list[i]
            date[i-1] = data[0]
            openPrice[i-1] = data[1]
            closePrice[i-1] = data[4]
            volume[i-1] = data[5]
            i += 1
    return date, openPrice, closePrice, volume

getData()
The AAPL.csv file I am trying to read has lines taking the form:
Date, Open, High, Low, Close, Volume
26-Jul-17,153.35,153.93,153.06,153.46,15415545
25-Jul-17,151.80,153.84,151.80,152.74,18853932
24-Jul-17,150.58,152.44,149.90,152.09,21493160
I would appreciate any help solving this problem. It seems that data_as_list is a list of lists, one per line of the file. After experimenting with print, I can see that data[0] etc. print correctly inside the while loop, but I am not allowed to assign the values to the arrays I have created.
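The traceback is consistent with how NumPy arrays behave: np.array([]) creates an array of size 0, and assigning to date[0] cannot grow it, so the first assignment raises IndexError. A minimal sketch of one way around this, staying with the csv module, is to collect values in Python lists and convert them to arrays once at the end. The function name get_data, its file-object parameter, and the in-memory sample standing in for aapl.csv are assumptions made for illustration; they are not part of the original script.

```python
import csv
import io

import numpy as np


def get_data(f):
    """Read the Date, Open, Close and Volume columns from an open CSV text file."""
    # skipinitialspace copes with the "Date, Open, ..." header style.
    reader = csv.reader(f, skipinitialspace=True)
    next(reader)  # skip the header row

    dates, open_prices, close_prices, volumes = [], [], [], []
    for row in reader:
        if not row:  # ignore blank lines
            continue
        dates.append(row[0])
        # csv.reader yields strings, so convert the numeric fields explicitly.
        open_prices.append(float(row[1]))
        close_prices.append(float(row[4]))
        volumes.append(int(row[5]))

    # Lists can grow; NumPy arrays have a fixed size, so convert once here.
    return (np.array(dates), np.array(open_prices),
            np.array(close_prices), np.array(volumes))


# Sample rows copied from the question, standing in for aapl.csv (assumption).
sample = io.StringIO(
    "Date, Open, High, Low, Close, Volume\n"
    "26-Jul-17,153.35,153.93,153.06,153.46,15415545\n"
    "25-Jul-17,151.80,153.84,151.80,152.74,18853932\n"
    "24-Jul-17,150.58,152.44,149.90,152.09,21493160\n"
)
date, open_price, close_price, volume = get_data(sample)
print(open_price)
```

With a real file, the same function could be called as get_data(f) inside with open('aapl.csv', newline='') as f:, which also avoids opening the file twice as the original script does.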
IMO it's much more convenient to use Pandas for that:
import pandas as pd
fn = r'/path/to/AAPL.csv'
df = pd.read_csv(fn, skipinitialspace=True, parse_dates=['Date'])
Result:
In [83]: df
Out[83]:
        Date    Open    High     Low   Close    Volume
0 2017-07-26  153.35  153.93  153.06  153.46  15415545
1 2017-07-25  151.80  153.84  151.80  152.74  18853932
2 2017-07-24  150.58  152.44  149.90  152.09  21493160
As a NumPy 2D array:
In [84]: df.values
Out[84]:
array([[Timestamp('2017-07-26 00:00:00'), 153.35, 153.93, 153.06, 153.46, 15415545],
       [Timestamp('2017-07-25 00:00:00'), 151.8, 153.84, 151.8, 152.74, 18853932],
       [Timestamp('2017-07-24 00:00:00'), 150.58, 152.44, 149.9, 152.09, 21493160]], dtype=object)
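Since the question asks for separate arrays rather than one 2D array, and df.values falls back to dtype=object when columns have mixed types, it may be more useful to pull out each column on its own so that it keeps its native dtype. The sketch below assumes the same column names as the sample file and uses an in-memory copy of the sample rows in place of /path/to/AAPL.csv; the variable names are illustrative only.

```python
import io

import numpy as np
import pandas as pd

# Sample rows copied from the question, standing in for the real file (assumption).
sample = io.StringIO(
    "Date, Open, High, Low, Close, Volume\n"
    "26-Jul-17,153.35,153.93,153.06,153.46,15415545\n"
    "25-Jul-17,151.80,153.84,151.80,152.74,18853932\n"
    "24-Jul-17,150.58,152.44,149.90,152.09,21493160\n"
)
df = pd.read_csv(sample, skipinitialspace=True, parse_dates=['Date'])

# One 1-D NumPy array per column, each with its own dtype.
date = df['Date'].to_numpy()          # datetime64 values
open_price = df['Open'].to_numpy()    # floats
close_price = df['Close'].to_numpy()  # floats
volume = df['Volume'].to_numpy()      # integers

print(open_price.dtype, volume.dtype)
```

Series.to_numpy() is available in pandas 0.24 and later; on older versions, df['Open'].values gives a comparable result.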