
Unable to plot dataframe using seaborn barplot

I have been able to use pandas groupby to create a new DataFrame, but I get an error when I create a barplot. The groupby command:

invYr = invoices.groupby(['FinYear']).sum()[['Amount']]

This creates a new DataFrame that looks correct to me.

New DataFrame invYr

Running:

sns.barplot(x='FinYear', y='Amount', data=invYr)

I get the error:

ValueError: Could not interperet input 'FinYear'

It appears that the issue is related to the index being FinYear, but unfortunately I have not been able to solve it, even when using reindex.

import pandas as pd
import seaborn as sns

invoices = pd.DataFrame({'FinYear': [2015, 2015, 2014], 'Amount': [10, 10, 15]})
invYr = invoices.groupby(['FinYear']).sum()[['Amount']]

>>> invYr
         Amount
FinYear        
2014         15
2015         20
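In the printout above, FinYear sits on its own header row, which is how pandas displays an index name rather than a column. A minimal check, reusing the same sample data, makes that explicit:

```python
import pandas as pd

invoices = pd.DataFrame({'FinYear': [2015, 2015, 2014], 'Amount': [10, 10, 15]})
invYr = invoices.groupby(['FinYear']).sum()[['Amount']]

# After the groupby, FinYear is the name of the index, not a column label.
print(list(invYr.columns))         # ['Amount']
print(invYr.index.name)            # FinYear
print('FinYear' in invYr.columns)  # False
```

Because seaborn looks up string arguments such as x='FinYear' among the columns of data, a name that exists only on the index is not found.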

The reason you are getting the error is that when you created invYr by grouping invoices, the FinYear column became the index and is no longer a column. There are a few solutions:

1) One solution is to specify the source data directly. You need to specify the correct data source for the chart. If you do not specify a data parameter, Seaborn does not know which dataframe/series has the columns 'FinYear' or 'Amount', as these are just text values. You must specify, for example, y=invYr.Amount to identify both the dataframe/series and the column you'd like to graph. The trick here is directly accessing the index of the dataframe.

sns.barplot(x=invYr.index, y=invYr.Amount)

2) Alternatively, you can specify the data source and then refer to its columns by name. Note that the grouped dataframe has its index reset so that FinYear becomes available as a column again.

sns.barplot(x='FinYear', y='Amount', data=invYr.reset_index())

3) A third solution is to specify as_index=False when you perform the groupby, which keeps the column available in the grouped dataframe.

invYr = invoices.groupby('FinYear', as_index=False).Amount.sum()
sns.barplot(x='FinYear', y='Amount', data=invYr)

All three solutions above produce the same plot.
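One way to check that equivalence without drawing anything is to compare the x and y values each approach hands to seaborn. This is only a sketch on the sample data; the variable names by_index, via_reset and via_as_index are illustrative and not part of the original code.

```python
import pandas as pd

invoices = pd.DataFrame({'FinYear': [2015, 2015, 2014], 'Amount': [10, 10, 15]})

# Solution 1: FinYear stays in the index and is passed as x=by_index.index
by_index = invoices.groupby(['FinYear']).sum()[['Amount']]
# Solution 2: the index is reset so FinYear is a column again
via_reset = by_index.reset_index()
# Solution 3: the groupby never moves FinYear into the index
via_as_index = invoices.groupby('FinYear', as_index=False).Amount.sum()

# The x values (years) and y values (summed amounts) match in all three cases.
print(by_index.index.tolist(), by_index.Amount.tolist())
print(via_reset.FinYear.tolist(), via_reset.Amount.tolist())
print(via_as_index.FinYear.tolist(), via_as_index.Amount.tolist())
```

Each line prints the years 2014 and 2015 alongside the totals 15 and 20, so the bars drawn from any of the three inputs have the same positions and heights.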

