
Having trouble with a Seaborn plot from a multilevel Pandas DataFrame

I'm working with a CSV file that I've read into pandas using the following command:

RawData = pd.read_csv(rawData_file_path, engine='python', header=[0,1])

This creates a DataFrame in which the first two rows of the file are treated as header rows for each column. Something like this:

-------------------------------
|    Group 1   |    Group 2   |
-------------------------------
|   A   |   B  |   A   |  B   |
-------------------------------
|  data | data |  data | data |
-------------------------------
|  data | data |  data | data |
-------------------------------
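
For readers who want to reproduce this layout, here is a minimal sketch. The CSV text and its values are made up for illustration; only the `header=[0, 1]` argument comes from the question. It shows that the columns become a two-level MultiIndex, that indexing with only the top-level label returns a DataFrame, and that the full tuple returns a single Series.

```python
import io

import pandas as pd

# Hypothetical CSV with two header rows, mirroring the layout sketched above.
csv_text = """Group 1,Group 1,Group 2,Group 2
A,B,A,B
1,2,3,4
5,6,7,8
"""

df = pd.read_csv(io.StringIO(csv_text), header=[0, 1])

# The two header rows become a two-level column MultiIndex.
print(df.columns.tolist())

# Selecting by the top level alone gives a DataFrame (columns A and B) ...
group_1 = df["Group 1"]
print(type(group_1).__name__)

# ... while the full (level 0, level 1) tuple gives a single Series.
group_1_a = df[("Group 1", "A")]
print(group_1_a.tolist())
```

Because a top-level label alone selects a whole sub-frame, code that expects one column of values can end up receiving a DataFrame instead.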

I'm trying to draw a count plot with seaborn (sns.countplot) but am running into issues because the second header row is not being treated as a header. The column I'm trying to analyze is a simple gender column (male / female). However, because of how the results are laid out, the column header looks like this:

row 1: What is your gender? 
row 2: Response 
row n: Male or Female etc.

I try to plot this using countplot:

sns.countplot(x=['What is your gender?'], data=RawData)

However, I get this error:

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

When I flattened the DataFrame, the seaborn plot worked, but instead of plotting Male and Female counts, it plotted Male, Female and 'Response' counts. This has led me to believe that the second header row is what causes the ValueError in the unflattened DataFrame.

This is the first of many plots I will have to make, and some of the later columns are more complex and will need that second row as a reference in the header. As such, I can't simply flatten the DataFrame.

Can anyone suggest a workaround here? I'd like to nip this in the bud now, with a simple count plot, before I have to start on the more complex visualizations such as heatmaps.

Seaborn functions like countplot assume that you have tidy data. Briefly: each variable should be a column, and each observation should be a row. You will want to find a way to reshape your DataFrame into this basic structure, and then you will be able to use seaborn to plot it.
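
One possible way to get there without flattening the whole DataFrame is sketched below, under stated assumptions. The sample CSV, the second question column, and the names `tidy` and `plot_gender_counts` are all hypothetical. The idea is to pull out the one column you need with its full (row 1, row 2) header tuple and give it a plain name in a small tidy frame, leaving the original two-level header untouched for later plots.

```python
import io

import pandas as pd

# Hypothetical survey export with two header rows, as described in the question.
csv_text = """What is your gender?,What is your age?
Response,Response
Male,34
Female,29
Female,41
"""

raw = pd.read_csv(io.StringIO(csv_text), header=[0, 1])

# Address the column by both header levels so a Series comes back, not a DataFrame.
gender = raw[("What is your gender?", "Response")]

# Build a small tidy frame: one plainly named column, one row per respondent.
tidy = pd.DataFrame({"gender": gender})


def plot_gender_counts(frame):
    """Draw a count plot of the 'gender' column (hypothetical helper)."""
    import seaborn as sns  # imported here so the reshaping runs without seaborn

    return sns.countplot(x="gender", data=frame)
```

Calling `plot_gender_counts(tidy)` should then draw one bar per category. `raw` keeps its MultiIndex columns, so the second header row remains available for the more complex columns.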
