
How to plot multiline for each ID starting at t=0 in Python

I have a panel time-series dataset. Because the start date differs for each ID, I created an additional count variable t = 0, 1, 2, 3, ...; the end dates also differ across IDs. Using this data, I want to plot:

  1. a multiline graph where the x-axis is "t" = 0, 1, 2, 3, ... and the y-axis is "growth", with one line per ID
  2. additionally, how can I make the x-axis start from t=1 instead of t=0?

Thank you!

Current table:

ID  date     growth  t
x1a 1/1/2018    1.2  0
x1a 2/1/2018    1    1
x1a 3/1/2018    3    2
x1a 4/1/2018    2    3
x1a 5/1/2018    0.9  4
z8d 3/1/2018    0.7  0
z8d 3/2/2018    1    1
z8d 3/3/2018    0.8  2
z8d 3/4/2018    0.6  3
z8d 3/5/2018    2.3  4
z8d 3/6/2018    1.7  5
z8d 3/7/2018    1    6
z8d 3/8/2018    2.1  7
j2u 1/1/2020    0.9  0
j2u 1/2/2020    0.8  1
j2u 1/3/2020    1.3  2
j2u 1/4/2020    1.4  3
j2u 1/5/2020    2    4
j2u 1/6/2020    1.4  5
..    ..         ..  ..

You don't need the "t" column; you can use the index instead. To plot a line for each ID, group by ID, then iterate over the groups and plot each one. Here is an example of how to do that:

from io import StringIO

import matplotlib.pyplot as plt
import pandas as pd

data = """ID  date     growth  t
x1a 1/1/2018    1.2  0
x1a 2/1/2018    1    1
x1a 3/1/2018    3    2
x1a 4/1/2018    2    3
x1a 5/1/2018    0.9  4
z8d 3/1/2018    0.7  0
z8d 3/2/2018    1    1
z8d 3/3/2018    0.8  2
z8d 3/4/2018    0.6  3
z8d 3/5/2018    2.3  4
z8d 3/6/2018    1.7  5
z8d 3/7/2018    1    6
z8d 3/8/2018    2.1  7
j2u 1/1/2020    0.9  0
j2u 1/2/2020    0.8  1
j2u 1/3/2020    1.3  2
j2u 1/4/2020    1.4  3
j2u 1/5/2020    2    4
j2u 1/6/2020    1.4  5"""

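# Parse the whitespace-delimited sample data
df = pd.read_csv(StringIO(data), sep=r'\s+')
df['date'] = pd.to_datetime(df['date'])

# Plot one line per ID; the reset index serves as the time counter
for id_, group in df.groupby(by='ID'):
    group = group.sort_values(by='date').reset_index(drop=True)
    # index + 1 makes the x-axis start at 1; use group.index to start at 0
    plt.plot(group.index + 1, group['growth'], label=id_)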


plt.legend()
plt.xlabel('Index')
plt.ylabel('Growth')
plt.show()
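
If you would rather keep an explicit counter column than rely on the per-group index, it can be built for the whole frame in one step. This is only a minimal sketch: the small frame below is made-up data in the same layout as the question's table, and it assumes that rows sorted by date within each ID are in the intended order.

```python
import pandas as pd

# Hypothetical long-format data with the question's column names
df = pd.DataFrame({
    "ID": ["x1a", "x1a", "x1a", "z8d", "z8d"],
    "date": pd.to_datetime(
        ["2018-01-01", "2018-02-01", "2018-03-01", "2018-03-01", "2018-03-02"]
    ),
    "growth": [1.2, 1.0, 3.0, 0.7, 1.0],
})

# Sort so that the rows of each ID are in chronological order
df = df.sort_values(["ID", "date"]).reset_index(drop=True)

# cumcount() numbers the rows within each group 0, 1, 2, ...
df["t"] = df.groupby("ID").cumcount()

print(df[["ID", "t", "growth"]])
```

Adding 1 to the result gives a counter that starts at 1 instead of 0.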

You can reshape your data into a form that makes it easy for pandas to plot:

from io import StringIO

import matplotlib.pyplot as plt
import pandas as pd

data = """ID  date     growth  t
x1a 1/1/2018    1.2  0
x1a 2/1/2018    1    1
x1a 3/1/2018    3    2
x1a 4/1/2018    2    3
x1a 5/1/2018    0.9  4
z8d 3/1/2018    0.7  0
z8d 3/2/2018    1    1
z8d 3/3/2018    0.8  2
z8d 3/4/2018    0.6  3
z8d 3/5/2018    2.3  4
z8d 3/6/2018    1.7  5
z8d 3/7/2018    1    6
z8d 3/8/2018    2.1  7
j2u 1/1/2020    0.9  0
j2u 1/2/2020    0.8  1
j2u 1/3/2020    1.3  2
j2u 1/4/2020    1.4  3
j2u 1/5/2020    2    4
j2u 1/6/2020    1.4  5"""

df = pd.read_csv(StringIO(data), sep=r'\s+')
df['date'] = pd.to_datetime(df['date'])


# The actual plotting starts here
# Reshape the data from long to wide format: one row per t, one column per ID
df_plot = pd.pivot(df, index="t", columns="ID", values="growth")
# Renumber the index so the x-axis starts at 1 instead of 0
df_plot.index += 1
# Let the pandas matplotlib wrapper do the plotting
df_plot.plot.line()

plt.show()

Sample output: [image]

Disclaimer: The sample data import is shamelessly copied from Leonardo's answer.
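
"Start the x-axis from t=1" can be read in two ways: relabel the counter so the first observation sits at 1, or keep the labels and drop the t=0 observation. The sketch below shows both on a wide table; the data is made up, and the names `shifted` and `trimmed` are just illustrative. It also shows that IDs with fewer observations get NaN in the wide table, which the line plot simply leaves unplotted.

```python
import pandas as pd

# Hypothetical long-format data with the question's column names
df = pd.DataFrame({
    "ID": ["x1a", "x1a", "x1a", "z8d", "z8d", "z8d", "z8d"],
    "t": [0, 1, 2, 0, 1, 2, 3],
    "growth": [1.2, 1.0, 3.0, 0.7, 1.0, 0.8, 0.6],
})

# Long to wide: one row per t, one column per ID
wide = df.pivot(index="t", columns="ID", values="growth")

# Reading 1: relabel the counter so the first observation is at x = 1
shifted = wide.copy()
shifted.index = shifted.index + 1

# Reading 2: keep the labels but drop the t = 0 observation
trimmed = wide.loc[wide.index >= 1]

# Shorter series are padded with NaN at the end
print(wide.isna().sum())

# Plot one of the two variants; each column becomes one line
ax = shifted.plot.line()
ax.set_xlabel("t")
ax.set_ylabel("growth")
```

In a script, call `plt.show()` from `matplotlib.pyplot` afterwards to display the figure.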

