
Python: Timestamp error on matplotlib line plot x-axis

I am trying to produce a line plot from a CSV file whose data is formatted like this:

Time,Temp
05 Oct 4:35 pm,68
05 Oct 4:30 pm,68
05 Oct 4:20 pm,68

The code I used is:

import matplotlib.pyplot as plt
import csv

x = []
y = []

with open('time_temp.csv', 'r') as csvfile:
    plots = csv.reader(csvfile, delimiter=',')
    for row in plots:
        x.append(int(row[0]))
        y.append(int(row[1]))

plt.plot(x, y, label='Loaded from file')

plt.xlabel('Timestamp')
plt.ylabel('Temperature')
plt.title('Temperature by Timestamp')
plt.legend()
plt.show()

However, it produces this error:

Traceback (most recent call last):
  File "visualizingdata.py", line 12, in <module>
    x.append(int(row[0]))
ValueError: invalid literal for int() with base 10: 'Time'

I believe this is due to the timestamp format but don't know how to convert it.

Please help. Thank you.

Here is one solution with two problems fixed:

with open('time_temp.csv', 'r') as csvfile:
    plots = csv.reader(csvfile, delimiter=',')
    next(plots)  # skip the header row ("Time,Temp")
    for row in plots:
        temp = row[0].split()    # e.g. ['05', 'Oct', '4:35', 'pm']
        x.append(int(temp[0]))   # keep only the day of the month
        y.append(int(row[1]))

The first problem in your program is that it tries to convert the header row to int: the first row[0] is the string 'Time', which is exactly what the ValueError reports. To avoid this, skip the header with next(plots) (in Python 2 this was written plots.next() ).

The next problem is that row[0] is a string holding a date and time, which int() cannot convert directly. To fix this you can split() the row[0] string and use its first element. The latter part (the temperature) is left as it is.
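
To make the two fixes concrete, here is a minimal sketch that runs without the CSV file. The io.StringIO object is only a stand-in I am assuming for time_temp.csv, and the names sample, reader, header and parts are made up for the illustration:

```python
import csv
import io

# Inline stand-in for time_temp.csv so the sketch runs on its own.
sample = io.StringIO(
    "Time,Temp\n"
    "05 Oct 4:35 pm,68\n"
    "05 Oct 4:30 pm,68\n"
    "05 Oct 4:20 pm,68\n"
)

x = []
y = []

reader = csv.reader(sample, delimiter=',')
header = next(reader)          # consume the "Time,Temp" row
for row in reader:
    parts = row[0].split()     # e.g. ['05', 'Oct', '4:35', 'pm']
    x.append(int(parts[0]))    # day of the month only
    y.append(int(row[1]))

print(header, x, y)
```

Note that every sample row falls on the same day, so x holds the same value three times. The error goes away, but the x values no longer distinguish the readings.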

The following modifications should solve your actual problem, which I assume is plotting the data against time with the timestamps shown on the x-axis:

labels = []
y = []
with open('time_temp.csv', 'r') as csvfile:
    plots = csv.reader(csvfile, delimiter=',')
    next(plots)  # skip the header row
    for row in plots:
        labels.append(row[0])
        y.append(int(row[1]))

# the file lists the newest reading first, so reverse both lists together
labels = labels[::-1]
y = y[::-1]
x = range(len(labels))
plt.xticks(x, labels, rotation='horizontal')

The new part here is that the timestamp text from row[0] is now appended to a list, labels, which is later used to generate the tick labels for the x-axis. The x-axis values themselves are just sequential integers produced by range, with one value per data row. The rest of your program, from plt.plot(x, y, ...) onwards, stays as it is.

Also, in your example data set the dates go from most recent to least recent. This is taken care of by reversing the labels with labels = labels[::-1] ; y has to be reversed in the same way so that each temperature stays paired with its own label. The labels are added to the plot using xticks .
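
If you would rather have real time values on the x-axis than evenly spaced labels, the standard library can parse the strings. This is only a sketch under my own assumptions: the format string FMT is my reading of the sample data, io.StringIO stands in for the file, and month abbreviations such as "Oct" are assumed to match an English locale:

```python
import csv
import io
from datetime import datetime

# Inline stand-in for time_temp.csv so the sketch runs on its own.
sample = io.StringIO(
    "Time,Temp\n"
    "05 Oct 4:35 pm,68\n"
    "05 Oct 4:30 pm,68\n"
    "05 Oct 4:20 pm,68\n"
)

# day, abbreviated month, 12-hour clock, minutes, am/pm marker
FMT = "%d %b %I:%M %p"

rows = []
reader = csv.reader(sample)
next(reader)  # skip the header row
for time_text, temp_text in reader:
    rows.append((datetime.strptime(time_text, FMT), int(temp_text)))

rows.sort()  # order by timestamp, oldest first

times = [t for t, _ in rows]
temps = [v for _, v in rows]
```

The resulting times list can be passed straight to plt.plot(times, temps), since matplotlib accepts datetime objects as x values and spaces them according to the actual time differences.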

I would suggest not reinventing the wheel and instead using existing functionality to obtain datetimes directly. One option is to use pandas.

If the data looks like this (I added some data to show the effect of dissimilar spacings and unordered data):

Time,Temp
05 Oct 10:32 am,10
05 Oct 4:35 pm,20
05 Oct 4:30 pm,30
05 Oct 4:20 pm,68

the code could then look like this:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("data/timetemp.csv")
# %d day, %b abbreviated month, %I:%M 12-hour time, %p am/pm
df["Time"] = pd.to_datetime(df["Time"], format="%d %b %I:%M %p")
# put the rows in chronological order before plotting
df.sort_values("Time", inplace=True)

plt.plot(df["Time"], df["Temp"])

plt.show()
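
One thing to be aware of: the format string contains no year directive, so the parsed values fall back to the year 1900, which can show up in the axis labels. That default is the documented behaviour of strptime, and as far as I know pandas follows it too. If the real year is known, it can be set afterwards. The sketch below uses the standard library only, and ASSUMED_YEAR is a hypothetical value, since the data does not say which year it is from:

```python
from datetime import datetime

FMT = "%d %b %I:%M %p"
ASSUMED_YEAR = 2017  # hypothetical: the CSV does not state the year

parsed = datetime.strptime("05 Oct 4:35 pm", FMT)
fixed = parsed.replace(year=ASSUMED_YEAR)  # same day and time, chosen year

print(parsed.year, fixed)
```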


You could optionally also use pandas for plotting:

# optionally use pandas for plotting:
df.plot(x="Time", y="Temp")

