
How to plot multiple timeseries data with different start date on the same x-axis in Python Matplotlib?

I am trying to plot three time series with different start dates on the same x-axis, similar to the question "How to plot timeseries with different start date on the same x axis", except that my x-axis has dates instead of days.

My data frame is structured as:

Date ColA Label
01/01/2019 1.0 Training
02/01/2019 1.0 Training
...
14/09/2020 2.0 Test1
..
06/01/2021 4.0 Test2
...

I have defined each time series as:

train = df.loc['01/01/2019':'05/08/2020', 'ColA']  
test1 = df.loc['14/09/2020':'20/12/2020', 'ColA']  
test2 = df.loc['06/01/2021':'18/03/2021', 'ColA']  

This is how the individual time series plot: [images: Data 1, Data 2, Data 3]

But when I try to plot them on the same x-axis, they are not plotted in date order: [image: all data]

I am hoping to produce something like this (from MS Excel): [image]

Any help would be great!

Thank you

Make sure that the 'Date' column in your dataframe is imported as datetime and not as a string.

If the dtype shows as "object":

df = pd.read_csv('data.csv')
df['Date']
0      2019-01-01
1      2019-01-02
2      2019-01-03
       ...
Name: Date, Length: 830, dtype: object

You need to convert the column to datetime. You can do this in two ways:

  1. df = pd.read_csv('data.csv', parse_dates=['Date'])

OR

  2. df = pd.read_csv('data.csv')
     df['Date'] = pd.to_datetime(df['Date'])

Both options will give you the same result.

df = pd.read_csv('data.csv', parse_dates=['Date'])
df['Date']
0      2019-01-01
1      2019-01-02
2      2019-01-03
       ...
Name: Date, Length: 830, dtype: datetime64[ns]
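The sample dates in the question are written as DD/MM/YYYY, so it may be safer to give the format explicitly instead of letting pandas guess it. The following is a minimal sketch, not a definitive recipe: the inline CSV text is a hypothetical stand-in for data.csv, and the format string assumes the day-first layout shown in the question.

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv, using the DD/MM/YYYY layout from the question.
csv_text = """Date,ColA,Label
01/01/2019,1.0,Training
02/01/2019,1.0,Training
14/09/2020,2.0,Test1
06/01/2021,4.0,Test2
"""

df = pd.read_csv(io.StringIO(csv_text))

# At this point the column still holds strings, not datetimes.
print(pd.api.types.is_datetime64_any_dtype(df["Date"]))

# An explicit format removes the day/month ambiguity:
# "02/01/2019" is read as 2 January 2019, not 1 February 2019.
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")
print(pd.api.types.is_datetime64_any_dtype(df["Date"]))
```

If the format is not known in advance, pd.to_datetime also accepts dayfirst=True, but an explicit format fails loudly on rows that do not match, which is usually easier to debug.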

Then, you can just plot:

plt.plot(df['Date'], df['ColA'])

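When you define the individual time series, check the format of the dates you slice with. pandas displays datetimes as YYYY-MM-DD, and strings in that form are unambiguous, whereas '01/01/2019'-style strings are not. This label-based slicing also requires 'Date' to be the index of the dataframe. So, use this instead:

train = df.loc['2019-01-01':'2020-08-05', 'ColA']  # and so on for test1 and test2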
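A minimal sketch of that slicing followed by a single combined plot, under stated assumptions: the dataframe here is hypothetical daily data with a constant ColA, built only to cover the three periods from the question, and 'Date' is used as a sorted DatetimeIndex so that string slicing with .loc works.

```python
import matplotlib
matplotlib.use("Agg")  # optional: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical daily data spanning all three periods (values are placeholders).
dates = pd.date_range("2019-01-01", "2021-03-18", freq="D")
df = pd.DataFrame({"ColA": 1.0}, index=dates)
df.index.name = "Date"

# String slicing on a DatetimeIndex includes both end points.
train = df.loc["2019-01-01":"2020-08-05", "ColA"]
test1 = df.loc["2020-09-14":"2020-12-20", "ColA"]
test2 = df.loc["2021-01-06":"2021-03-18", "ColA"]

# Plot all three series on one axes; each keeps its own position on the date axis.
fig, ax = plt.subplots(figsize=(14, 6))
for name, series in [("Training", train), ("Test1", test1), ("Test2", test2)]:
    ax.plot(series.index, series.values, label=name)
ax.legend(loc="best")
fig.autofmt_xdate()  # rotate the date tick labels so they do not overlap
```

Because the x values are real datetimes, matplotlib places each series at its own dates, and the gaps between the periods stay empty. In an interactive session, drop the matplotlib.use line and finish with plt.show().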

I am assuming that your data is stored as CSV (or Excel). If so, be aware that MS Excel may change the formatting of the Date column whenever you open the file in Excel. It is best to always check the dtype of the 'Date' column after importing the dataframe, using

df['Date'].dtype

I assume you have a dataframe consisting of at least a date, a record, and a label (training, test #1 or test #2).
Would sharex = True do the trick?

import matplotlib.pyplot as plt

fig, ax = plt.subplots(3, 1, sharex=True)

# one subplot per label; plot() takes x and y as positional arguments
for j, i in enumerate(df['label'].unique()):
    subset = df[df['label'] == i]
    ax[j].plot(subset['date'], subset['record'])

EDIT

This should do it

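import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(14, 6))
colors = ['blue', 'red', 'orange']

# draw each label's rows on the same axes; plot() takes x and y positionally
for label, color in zip(df['Label'].unique(), colors):
    mask = df['Label'] == label
    ax.plot(df.loc[mask, 'Date'], df.loc[mask, 'ColA'],
            color=color, label=label)
plt.legend(loc='best')
plt.show()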

You basically want to call plot several times on the same matplotlib axes. Just use the original dataset (which includes all the labels); there is no need to split it into separate series first.
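The same idea can be sketched with groupby, which also avoids hard-coding the number of labels. This is only an illustration: the six-row dataframe is hypothetical sample data in the shape described in the question, and the tick locator and date format are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # optional: headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample rows with the Date / ColA / Label columns from the question.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2019-01-01", "2019-01-02",
        "2020-09-14", "2020-09-15",
        "2021-01-06", "2021-01-07",
    ]),
    "ColA": [1.0, 1.0, 2.0, 2.0, 4.0, 4.0],
    "Label": ["Training", "Training", "Test1", "Test1", "Test2", "Test2"],
})

fig, ax = plt.subplots(figsize=(14, 6))

# sort=False keeps the labels in order of first appearance.
for label, group in df.groupby("Label", sort=False):
    ax.plot(group["Date"], group["ColA"], label=label)

# Show a tick every three months, formatted like "Jan 2019".
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=3))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %Y"))
ax.legend(loc="best")
```

Each group is drawn as its own line with matplotlib's default color cycle, so the legend entries match the label names without a separate color list.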

