Creating multiple lines in plot grouped by selected column of larger dataframe

I have a dataframe; a sample is below:

import datetime
import pandas as pd

ids = [1, 2, 3, 1, 2, 3]
vals = [3, 5, 6, 3, 7, 8]
lats = [10, 10, 10, 30, 30, 30]
ratio = [.1, .4, .2, .3, .4, .5]

df = pd.DataFrame({'ids' : ids, 'vals' : vals, 'lats' : lats, 'ratio' : ratio})


>>>df
    ids vals    lats    ratio
0   1   3       10      0.1
1   2   5       10      0.4
2   3   6       10      0.2
3   1   3       30      0.3
4   2   7       30      0.4
5   3   8       30      0.5

I want to create a graph with one line per value of the ids column, with ratio on the y-axis and lats on the x-axis. All the questions I've found use groupby or pivot on a dataframe whose columns are all used, not on a selection of columns.

I need to make more graphs from my real dataframe, which has many more columns, so I would like to know how to plot this by selecting specific columns.

You can use the groupby function followed by a for loop, then call the plot function for each group, passing the desired columns as x and y (in that order, to match the plot you describe).

import matplotlib.pyplot as plt
...
...

x_axis = 'lats'  # specified columns
y_axis = 'ratio' # specified columns

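# Split the rows by id; the other columns of df are simply ignored
groups = df.groupby('ids')
for n, g in groups:
    # n is the id value, g is the sub-dataframe for that id
    plt.plot(g[x_axis], g[y_axis], label=f'ID-{n}')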

plt.xlabel(x_axis.capitalize())
plt.ylabel(y_axis.capitalize())
plt.legend()
plt.grid(True)
plt.show()
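Since the question mentions pivot, here is a minimal sketch of that route restricted to the three columns of interest. It is an alternative to the loop above, not part of the original answer, and it assumes every (lats, ids) pair occurs only once; otherwise `pivot` raises an error. The name `wide` is just an illustrative choice.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Sample data from the question
df = pd.DataFrame({
    'ids': [1, 2, 3, 1, 2, 3],
    'vals': [3, 5, 6, 3, 7, 8],
    'lats': [10, 10, 10, 30, 30, 30],
    'ratio': [.1, .4, .2, .3, .4, .5],
})

# Reshape using only the selected columns:
# one row per lat, one column per id, cells hold the ratio.
# Assumption: each (lats, ids) combination is unique.
wide = df.pivot(index='lats', columns='ids', values='ratio')

# DataFrame.plot draws one line per column, i.e. one line per id
ax = wide.plot(marker='o')
ax.set_xlabel('Lats')
ax.set_ylabel('Ratio')
ax.grid(True)
# plt.show()  # uncomment when running interactively
```

The remaining columns (here `vals`) never enter the reshaped frame, so the same pattern should carry over to a wider dataframe by changing the three column names.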

Another way to plot pandas dataframe columns is to pass the data argument to the plot function and give the column names as strings:

Instead of giving the data in x and y, you can provide the object in the data parameter and just give the labels for x and y.

But here, you would still have to pass the dataframe group on each iteration:

plt.plot('lats', 'ratio', data=g)
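For completeness, the one-liner above could sit inside the same groupby loop roughly as follows. This is only a sketch using the sample data from the question; the `fig`/`ax` objects and the explicit `label` are choices made for this example rather than something the answer prescribes.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Sample data from the question
df = pd.DataFrame({
    'ids': [1, 2, 3, 1, 2, 3],
    'vals': [3, 5, 6, 3, 7, 8],
    'lats': [10, 10, 10, 30, 30, 30],
    'ratio': [.1, .4, .2, .3, .4, .5],
})

fig, ax = plt.subplots()
for n, g in df.groupby('ids'):
    # Column names are given as strings and looked up in the group g
    ax.plot('lats', 'ratio', data=g, label=f'ID-{n}')

ax.set_xlabel('Lats')
ax.set_ylabel('Ratio')
ax.legend()
ax.grid(True)
# plt.show()  # uncomment when running interactively
```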

[Figure: grouped lines]
