
Newbie Matplotlib and Pandas Plotting from CSV file

I haven't had much training with Matplotlib at all, and this really seems like a basic plotting application, but I'm getting nothing but errors.

Using Python 3, I'm simply trying to plot historical stock price data from a CSV file, using the date as the x axis and prices as the y. The data CSV looks like this:

[image: data]

(only just now noticing the big gap in times, but whatever)

import glob
import pandas as pd
import matplotlib.pyplot as plt

def plot_test():
    files = glob.glob('./data/test/*.csv')

    for file in files:
        df = pd.read_csv(file, header=1, delimiter=',', index_col=1)
        df['close'].plot()
        plt.show()

plot_test()

I'm using glob for now just to identify any CSV file in that folder, but I've also tried just designating one specific CSV filename and get the same error:

KeyError: 'close'

I've also tried just designating a specific column number to only plot one particular column instead, but I don't know what's going on.

Ideally, I would like to plot it just like real stock data, where everything is on the same graph: volume at the bottom on its own axis, open, high, low and close on the y axis, and date on the x axis for every row in the file. I know this has probably been asked before, and I've tried lots of different solutions from SO and elsewhere, but I can't get any of them to work with my data. Thanks so much for the newbie help!

The pandas documentation for read_csv shows that the header kwarg should be 0 for your CSV, because the first row contains the column names. With header=1, pandas takes the column names from the second row, so the DataFrame you are building has no column named close, which is why you get the KeyError. It should work if you remove the header kwarg or change it to header=0 (the default). The same goes for the other kwargs: delimiter=',' is already the default, and index_col is not needed here. A simple df = pd.read_csv(file) will do just fine.
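To see the effect of the header kwarg without touching your files, here is a minimal sketch. The sample rows are made up, and the column names (timestamp, open, high, low, close, volume) are an assumption about what your CSV contains.

```python
import io

import pandas as pd

# Hypothetical sample data; the column names are assumed, not taken from the real file.
csv_text = (
    "timestamp,open,high,low,close,volume\n"
    "2018-01-02 09:30,10.0,10.5,9.9,10.2,1000\n"
    "2018-01-02 09:31,10.2,10.6,10.1,10.4,1200\n"
)

# header=1 treats the second line as the header row,
# so the first data row ends up as the column names.
df_wrong = pd.read_csv(io.StringIO(csv_text), header=1)
print(list(df_wrong.columns))

# header=0 is the default: the first line supplies the column names.
df_right = pd.read_csv(io.StringIO(csv_text))
print(list(df_right.columns))

# Only the second DataFrame has a 'close' column.
print("close" in df_wrong.columns, "close" in df_right.columns)
```

With header=1, df_wrong['close'] would raise the same KeyError as in the question.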

You can prettify this according to your needs

import pandas
import matplotlib.pyplot as plt

def plot_test(file):
    # header=0 is the default, so the first row supplies the column names
    df = pandas.read_csv(file)

    # convert the timestamp strings to datetime objects
    df['timestamp'] = pandas.to_datetime(df['timestamp'], format = '%Y-%m-%d %H:%M')

    # plot prices on the top axes
    ax1 = plt.subplot(211)
    ax1.plot_date(df['timestamp'], df['open'], '-', label = 'open')
    ax1.plot_date(df['timestamp'], df['close'], '-', label = 'close')
    ax1.plot_date(df['timestamp'], df['high'], '-', label = 'high')
    ax1.plot_date(df['timestamp'], df['low'], '-', label = 'low')
    ax1.legend()

    # plot volume on the bottom axes
    ax2 = plt.subplot(212)

    # pass plain Python datetimes to bar() to work around this
    # issue: https://github.com/matplotlib/matplotlib/issues/9610
    df.set_index('timestamp', inplace = True)
    dates = df.index.to_pydatetime()

    # width is measured in days, so 1e-3 is roughly 1.4 minutes
    ax2.bar(dates, df['volume'], width = 1e-3)
    ax2.xaxis_date()

    plt.show()
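If you want the price and volume panels to share one date axis, a variation along these lines might work. This is only a sketch: the sample rows are invented, the column names are assumed to match the ones used above, and plt.subplots with sharex=True is swapped in for the two plt.subplot calls.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data standing in for one of the CSV files.
csv_text = (
    "timestamp,open,high,low,close,volume\n"
    "2018-01-02 09:30,10.0,10.5,9.9,10.2,1000\n"
    "2018-01-02 09:31,10.2,10.6,10.1,10.4,1200\n"
    "2018-01-02 09:32,10.4,10.7,10.3,10.5,900\n"
)

# Parse the timestamp column while reading and use it as the index.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["timestamp"], index_col="timestamp")
dates = df.index.to_pydatetime()

# Two stacked axes sharing the x axis; the price panel gets three times the height.
fig, (ax_price, ax_volume) = plt.subplots(
    2, 1, sharex=True, gridspec_kw={"height_ratios": [3, 1]}
)

# One line per price column on the top axes.
for column in ["open", "high", "low", "close"]:
    ax_price.plot(dates, df[column], label=column)
ax_price.legend()

# Volume bars on the bottom axes; width is in days.
ax_volume.bar(dates, df["volume"], width=1e-3)
ax_volume.set_ylabel("volume")

# Rotate the date labels so they do not overlap.
fig.autofmt_xdate()
```

Call plt.show() afterwards to display the figure, and replace io.StringIO(csv_text) with your file path to use real data.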

