
How can I plot separate Pandas DataFrames as subplots?

I have a few Pandas DataFrames sharing the same value scale, but having different columns and indices. When invoking df.plot(), I get separate plot images. What I really want is to have them all in the same figure as subplots, but I have not managed to work out how and would highly appreciate some help.

You can manually create the subplots with matplotlib, and then plot the dataframes on a specific subplot using the ax keyword. For example, for 4 subplots (2x2):

import matplotlib.pyplot as plt

fig, axes = plt.subplots(nrows=2, ncols=2)

df1.plot(ax=axes[0,0])
df2.plot(ax=axes[0,1])
...

Here axes is an array that holds the different subplot axes, and you can access one just by indexing axes.
If you want a shared x-axis, you can pass sharex=True to plt.subplots.
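As a minimal sketch of the approach above: the two DataFrames below (df1, df2 and their column names) are made up for illustration and are not from the question. They have different columns but end up stacked in one figure with a shared x-axis.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Illustrative data only: two frames with different columns.
rng = np.random.default_rng(0)
df1 = pd.DataFrame(rng.random((10, 2)), columns=["a", "b"])
df2 = pd.DataFrame(rng.random((10, 3)), columns=["x", "y", "z"])

# One column of two subplots; sharex=True links their x-axes.
fig, axes = plt.subplots(nrows=2, ncols=1, sharex=True)
df1.plot(ax=axes[0], title="df1")  # axes is 1-D here, so a single index is enough
df2.plot(ax=axes[1], title="df2")
fig.tight_layout()
```

With a single row or column, plt.subplots returns a 1-D array, so axes[0] is used instead of axes[0, 0].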

You can see examples in the documentation that demonstrate joris's answer. Also from the documentation, you can set subplots=True and layout=(rows, columns) within the pandas plot function:

df.plot(subplots=True, layout=(1,2))
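A small sketch of that call, assuming a made-up two-column DataFrame. Note that subplots=True splits the columns of a single DataFrame into separate subplots; it does not combine several DataFrames.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd

# Illustrative data only.
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.random((10, 2)), columns=["A", "B"])

# One subplot per column, arranged in 1 row and 2 columns.
axes = df.plot(subplots=True, layout=(1, 2), figsize=(8, 3))

# The returned array of axes follows the requested layout.
fig = axes[0, 0].get_figure()
fig.tight_layout()
```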

You could also use fig.add_subplot(), which takes subplot grid parameters such as 221, 222, 223, 224, etc., as described in the post here. Nice examples of plotting pandas data frames, including subplots, can be seen in this ipython notebook.
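A hedged sketch of the fig.add_subplot() route, using four made-up DataFrames (the frames list is an assumption for illustration). Each three-digit code reads as rows, columns, position.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Illustrative data only: four small frames.
rng = np.random.default_rng(2)
frames = [pd.DataFrame(rng.random((10, 2)), columns=["A", "B"]) for _ in range(4)]

fig = plt.figure(figsize=(8, 6))
created_axes = []
for position, frame in zip((221, 222, 223, 224), frames):
    ax = fig.add_subplot(position)  # e.g. 221 = 2 rows, 2 columns, first cell
    frame.plot(ax=ax, title=f"subplot {position}")
    created_axes.append(ax)
fig.tight_layout()
```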

You can use the familiar Matplotlib style of calling a figure and subplot, but you need to specify the current axis using plt.gca(). An example:

plt.figure(1)
plt.subplot(2,2,1)
df.A.plot() #no need to specify for first axis
plt.subplot(2,2,2)
df.B.plot(ax=plt.gca())
plt.subplot(2,2,3)
df.C.plot(ax=plt.gca())

etc...
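The same pattern can be written as a loop. This is only a sketch, assuming a made-up DataFrame df with columns A, B and C; passing ax=plt.gca() for every subplot, including the first, keeps the calls uniform.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Illustrative data only.
rng = np.random.default_rng(3)
df = pd.DataFrame(rng.random((10, 3)), columns=["A", "B", "C"])

fig = plt.figure()
for index, column in enumerate(df.columns, start=1):
    plt.subplot(2, 2, index)                     # make cell `index` of the 2x2 grid current
    df[column].plot(ax=plt.gca(), title=column)  # draw on the current axes
plt.tight_layout()
```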

You can plot multiple pandas data frames as subplots with matplotlib by using a simple trick: make a list of all the data frames, then use a for loop to plot each one on its own subplot.

Working code:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# dataframe sample data
df1 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df2 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df3 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df4 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df5 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df6 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])

#define number of rows and columns for subplots
nrow=3
ncol=2

# make a list of all dataframes 
df_list = [df1 ,df2, df3, df4, df5, df6]
fig, axes = plt.subplots(nrow, ncol)

# plot counter
count=0
for r in range(nrow):
    for c in range(ncol):
        df_list[count].plot(ax=axes[r,c])
        count+=1


Using this code you can plot subplots in any configuration. You need to define the number of rows nrow and the number of columns ncol. You also need to make a list, df_list, of the data frames you want to plot.
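One variation, offered as a sketch rather than as part of the answer above: when the number of data frames does not fill the grid exactly, the nested loop with a counter runs past the end of the list. Flattening the axes array and zipping it with the list avoids that, and the leftover axes can be hidden. The five frames below are made up for illustration.

```python
import math
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Illustrative data only: five frames, which do not fill a 3x2 grid.
rng = np.random.default_rng(4)
df_list = [pd.DataFrame(rng.random((10, 2)) * 100, columns=["A", "B"]) for _ in range(5)]

ncol = 2
nrow = math.ceil(len(df_list) / ncol)  # enough rows to hold every frame

# squeeze=False keeps axes 2-D even when nrow or ncol is 1.
fig, axes = plt.subplots(nrow, ncol, squeeze=False)
flat_axes = axes.ravel()

# zip stops at the shorter sequence, so no counter is needed.
for ax, frame in zip(flat_axes, df_list):
    frame.plot(ax=ax)

# Hide any grid cells that received no data frame.
for ax in flat_axes[len(df_list):]:
    ax.set_visible(False)
fig.tight_layout()
```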

You can use this:

fig = plt.figure()
ax = fig.add_subplot(221)
plt.plot(x,y)

ax = fig.add_subplot(222)
plt.plot(x,z)
...

plt.show()

You may not need to use Pandas at all. Here's a matplotlib plot of cat frequencies:


import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2*np.pi, 400)
y = np.sin(x**2)

f, axes = plt.subplots(2, 1)
for ax in axes:  # iterate over the subplot axes directly
    ax.plot(x, y)
    ax.set_title('cats')
plt.tight_layout()

Building on @joris's response above, if you have already established a reference to the subplot, you can use that reference as well. For example,

ax1 = plt.subplot2grid((50,100), (0, 0), colspan=20, rowspan=10)
...

df.plot.barh(ax=ax1, stacked=True)
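A self-contained sketch of the same idea, assuming a made-up DataFrame and a small 2x3 grid chosen only for illustration. plt.subplot2grid returns an axes reference that can be handed to any pandas plot call through ax.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({"x": [1, 2, 3], "y": [3, 1, 2]}, index=["a", "b", "c"])

fig = plt.figure(figsize=(8, 4))
# A 2x3 grid: ax1 spans the left two columns, ax2 takes the right column.
ax1 = plt.subplot2grid((2, 3), (0, 0), colspan=2, rowspan=2)
ax2 = plt.subplot2grid((2, 3), (0, 2), rowspan=2)

df.plot.barh(ax=ax1, stacked=True)  # stacked horizontal bars on the wide axes
df.plot(ax=ax2)                     # line plot of the same frame on the narrow axes
fig.tight_layout()
```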

How to create multiple plots from a dictionary of dataframes with long (tidy) data

  • Assumptions:

    • There is a dictionary of multiple dataframes of tidy data
      • Created by reading in from files
      • Created by separating a single dataframe into multiple dataframes
    • The categories, cat, may overlap, but not every dataframe necessarily contains all values of cat
    • hue='cat'
  • Because dataframes are being iterated through, there's no guarantee that colors will be mapped the same for each plot

    • A custom color map needs to be created from the unique 'cat' values for all the dataframes
    • Since the colors will be the same, place one legend to the side of the plots, instead of a legend in every plot
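For the case where the dictionary is created by separating a single dataframe, one possible sketch uses groupby. The column name source and the sample values are assumptions made for illustration; they do not appear in the answer.

```python
import pandas as pd

# Illustrative tidy data only; 'source' is a hypothetical column to split on.
df = pd.DataFrame({
    "source": ["s1", "s1", "s2", "s2", "s3"],
    "cat": ["A", "B", "A", "C", "B"],
    "x": [0.1, 0.2, 0.3, 0.4, 0.5],
    "y": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# Split one dataframe into a dictionary of dataframes, one per 'source' value.
df_dict = {key: group.reset_index(drop=True) for key, group in df.groupby("source")}

# Collect every category seen in any dataframe, so colors can be mapped consistently.
unique_cat = sorted({cat for frame in df_dict.values() for cat in frame["cat"].unique()})
```

Sorting the categories is a small design choice: it makes the category-to-color assignment reproducible between runs, which a plain set does not guarantee.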

Imports and synthetic data

import pandas as pd
import numpy as np  # used for random data
import random  # used for random data
import matplotlib.pyplot as plt
from matplotlib.patches import Patch  # for custom legend
import seaborn as sns
import math  # math.ceil determines the correct number of subplot rows


# synthetic data
df_dict = dict()
for i in range(1, 7):
    np.random.seed(i)
    random.seed(i)
    data_length = 100
    data = {'cat': [random.choice(['A', 'B', 'C']) for _ in range(data_length)],
            'x': np.random.rand(data_length),
            'y': np.random.rand(data_length)}
    df_dict[i] = pd.DataFrame(data)


# display(df_dict[1].head())

  cat         x         y
0   A  0.417022  0.326645
1   C  0.720324  0.527058
2   A  0.000114  0.885942
3   B  0.302333  0.357270
4   A  0.146756  0.908535

Create color mappings and plot

# create color mapping based on all unique values of cat
unique_cat = {cat for v in df_dict.values() for cat in v.cat.unique()}  # get unique cats
colors = sns.color_palette('husl', n_colors=len(unique_cat))  # get a number of colors
cmap = dict(zip(unique_cat, colors))  # zip values to colors

# iterate through dictionary and plot
col_nums = 3  # how many plots per row
row_nums = math.ceil(len(df_dict) / col_nums)  # how many rows of plots
plt.figure(figsize=(10, 5))  # change the figure size as needed
for i, (k, v) in enumerate(df_dict.items(), 1):
    plt.subplot(row_nums, col_nums, i)  # create subplots
    p = sns.scatterplot(data=v, x='x', y='y', hue='cat', palette=cmap)
    p.legend_.remove()  # remove the individual plot legends
    plt.title(f'DataFrame: {k}')

plt.tight_layout()
# create legend from cmap
patches = [Patch(color=v, label=k) for k, v in cmap.items()]
# place legend outside of plot; change the right bbox value to move the legend up or down
plt.legend(handles=patches, bbox_to_anchor=(1.06, 1.2), loc='center left', borderaxespad=0)
plt.show()


Here is a working pandas subplot example, where modes is the list of column names of the dataframe.

    dpi=200
    figure_size=(20, 10)
    fig, ax = plt.subplots(len(modes), 1, sharex="all", sharey="all", dpi=dpi)
    for i in range(len(modes)):
        ax[i] = pivot_df.loc[:, modes[i]].plot.bar(figsize=(figure_size[0], figure_size[1]*len(modes)),
                                                   ax=ax[i], title=modes[i], color=my_colors[i])
        ax[i].legend()
    fig.suptitle(name)
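The snippet above relies on names defined elsewhere (pivot_df, modes, my_colors, name). A runnable sketch with made-up stand-ins for those names, all of which are assumptions for illustration, might look like this:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the names used in the snippet above.
rng = np.random.default_rng(5)
pivot_df = pd.DataFrame(rng.integers(1, 20, size=(4, 3)),
                        index=["q1", "q2", "q3", "q4"],
                        columns=["bus", "car", "train"])
modes = list(pivot_df.columns)
my_colors = ["tab:blue", "tab:orange", "tab:green"]
name = "Trips per mode"

# One row per mode, with both axes shared across all subplots.
fig, axes = plt.subplots(len(modes), 1, sharex="all", sharey="all",
                         figsize=(6, 2 * len(modes)), squeeze=False)
for ax, mode, color in zip(axes.ravel(), modes, my_colors):
    pivot_df[mode].plot.bar(ax=ax, title=mode, color=color)
    ax.legend()
fig.suptitle(name)
fig.tight_layout()
```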

[Figure: Pandas subplot bar example]

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

fig, axes = plt.subplots(2, 2)
df = pd.DataFrame({'A': np.random.randint(1, 100, 10),
                   'B': np.random.randint(100, 1000, 10),
                   'C': np.random.randint(100, 200, 10)})
# plot the same dataframe on each of the four subplots
for ax in axes.flatten():
    df.plot(ax=ax)


