
How to plot multiple time series using pandas and seaborn

I've got a dataframe of data that people have helpfully collated.

It looks like this (ignore the index, I'm just sampling):

    uni                             score   year
18  Arden University Limited        78.95   2020
245 The University of Manchester    71.35   2022
113 Darlington College              93.33   2020
94  City of Wolverhampton College   92      2017
345 The Royal Veterinary College    94      2018
118 Darlington College              62      2018

There is more data - https://github.com/elksie5000/uni_data/blob/main/uni_data_combined.csv - but my plan is to call set_index on year and then filter by uni as well as by larger groups, aggregated by mean/median.

The ultimate aim is to look at a group of universities and track the metric over time.

I've managed to create a simple function to plot the data, thus:

#Create a function to plot the data
def plot_uni(df, uni, query):
    print(query)
    df['query'] = df[uni].str.contains(query)
    subset = df[df['query']].set_index("year")
    subset.sort_index().plot()

I can also plot the overall mean using:

# Select 'score' before aggregating so the text column 'uni' is not averaged
df.groupby("year")['score'].mean().plot()

What I want to be able to do is plot both together.

Ideally, I'd also like to be able to plot multiple lines in one plot and specify the colour. So, for instance, say the national score is in red and one particular line is blue, while the other lines are gray.

Any ideas?

UPDATE:

The answers from @Corralien and @Johannes Schöck both worked. I just don't know how to change the legend.

You can take the Axes object returned by the first call to plot and reuse it in your function:

def plot_uni(df, uni, query, ax):  # <- HERE
    print(query)
    df['query'] = df[uni].str.contains(query)
    subset = df[df['query']].set_index("year")
    subset.sort_index().plot(ax=ax)  # <- HERE

# General plot
ax = df.groupby("year")['score'].mean().plot()

plot_uni(df, 'uni', 'College', ax)  # other plots
plot_uni(df, 'uni', 'University', ax)  # and so on
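On the follow-up about colours and the legend: a minimal sketch, assuming pandas passes `color` and `label` through to matplotlib for each line, and that a legend built afterwards with `ax.legend()` picks those labels up. The small DataFrame is made-up sample data in the question's column layout. The extra `color` and `label` parameters and the per-year averaging inside `plot_uni` are illustrative additions, not part of the answer above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless; drop for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample with the same columns as the question's data
df = pd.DataFrame({
    "uni": ["Darlington College", "Darlington College",
            "The University of Manchester", "The University of Manchester",
            "The Royal Veterinary College", "The Royal Veterinary College"],
    "score": [62.0, 93.33, 70.0, 71.35, 94.0, 90.0],
    "year": [2018, 2020, 2018, 2020, 2018, 2020],
})

def plot_uni(df, uni, query, ax, color="gray", label=None):
    """Plot the yearly mean score of rows whose `uni` column contains `query`."""
    mask = df[uni].str.contains(query, regex=False)
    # Averaging per year gives one point per year even if several rows match
    yearly = df[mask].groupby("year")["score"].mean()
    yearly.plot(ax=ax, color=color, label=label or query)
    return yearly

# Overall mean in red; the returned Axes is reused for every other line
ax = df.groupby("year")["score"].mean().plot(color="red", label="National mean")
plot_uni(df, "uni", "Manchester", ax, color="blue")  # highlighted line
plot_uni(df, "uni", "Darlington", ax)                # gray by default
plot_uni(df, "uni", "Veterinary", ax)
ax.set_ylabel("score")
ax.legend()  # builds the legend from the labels given above
```

Passing `label=` per line is one way to control the legend text; the entries can also be renamed after the fact by handing a list of names to `ax.legend()`.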

If you use the matplotlib.pyplot interface for plotting instead of pandas' built-in plotting interface, you can add more lines simply by calling plt.plot(data) repeatedly. Once you have plotted all your data, call plt.show() to generate the output.

import matplotlib.pyplot as plt

def plot_uni(df, uni, query):
    print(query)
    df['query'] = df[uni].str.contains(query)
    subset = df[df['query']].set_index("year")
    plt.plot(subset.sort_index()["score"])  # plot only the numeric score column

# Here goes some iterator that calls plot_uni
plt.show()
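The "iterator" mentioned in the comment could look something like the sketch below. It assumes the data is first reshaped with `pivot_table` so that each institution becomes one column, then loops over the columns with `plt.plot`. The sample DataFrame and the `highlight` name are made up for illustration, and the mean is just one possible aggregate (the question also mentions the median).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless; drop for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample with the same columns as the question's data
df = pd.DataFrame({
    "uni": ["Darlington College", "Darlington College",
            "The University of Manchester", "The University of Manchester",
            "The Royal Veterinary College", "The Royal Veterinary College"],
    "score": [62.0, 93.33, 70.0, 71.35, 94.0, 90.0],
    "year": [2018, 2020, 2018, 2020, 2018, 2020],
})

# One row per year, one column per institution
wide = df.pivot_table(index="year", columns="uni", values="score", aggfunc="mean")

highlight = "The University of Manchester"  # hypothetical institution to emphasise

plt.figure()
for name, series in wide.items():
    series = series.dropna()  # skip years with no data for this institution
    colour = "blue" if name == highlight else "gray"
    plt.plot(series.index, series.values, color=colour, label=name)

# National mean across all rows, drawn last so it sits on top
national = df.groupby("year")["score"].mean()
plt.plot(national.index, national.values, color="red", label="National mean")

plt.xlabel("year")
plt.ylabel("score")
plt.legend()
# In an interactive session, finish with plt.show()
```

With many institutions a legend entry for every gray line gets crowded; one option is to pass `label="_nolegend_"` for those lines so only the highlighted ones are listed.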


 