Python and pandas, groupby only column in DataFrame

I would like to group some strings in the column called 'type' and plot them in a plotly bar chart. The problem is that I can't extract the x and y values from the new table created with groupby to define them in the graph:

tipol1 = df.groupby(['tipology']).nunique()

tipol1

The output gives me tipology as the index, with the grouping based on how many times the values repeat:

          number  data
tipology
one            2   113
two           33    33
three         12    88
four          44   888
five          11    66
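
A note on the table above: `nunique()` reports the number of distinct values per column within each group, not the number of rows in the group, which may be why the two columns disagree. A minimal sketch with made-up data (the frame below is an assumption for illustration, not the asker's data):

```python
import pandas as pd

# Hypothetical data, only for illustration
df = pd.DataFrame({
    "tipology": ["home", "home", "garden", "garden", "garden"],
    "number": [1, 2, 3, 4, 5],
    "data": ["10/01/18", "11/01/18", "12/01/18", "12/01/18", "13/01/18"],
})

# nunique(): number of distinct values per column within each group
distinct = df.groupby("tipology").nunique()

# size(): number of rows in each group, independent of the other columns
rows = df.groupby("tipology").size()

print(distinct)
print(rows)
```

If the goal is "how many times does each tipology appear", `size()` is probably the closer fit.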

In the number column (where I have other values) it gives me the correct grouping of the tipology column. The date column also gives me values (I think it groups the dates, but not in the correct format). I also found:

tipol=df.groupby(['tipology']).nunique()
tipol2 = tipol[['number']]
tipol2

to take only the number column, but it didn't help. I need the tipology column (not as the index) and the column with the tipology group counts, to get the x and y axes to pass to plotly!

One last attempt I made (making a big mess):

tipol=df.groupby(['tipology'],as_index=False).nunique()
tipol2 = tipol[['number']]


fig = go.Figure(data=[
    go.Bar(name='test', x=df['tipology'], y=tipol2)
])

fig.update_layout(barmode='stack')
fig.show()

Any suggestions? Thanks!

UPDATE

I would need too much code to give a full example; it would be difficult for me and would waste your time too. Basically I need a groupby plus a column that shows the grouping value, e.g.:

tipology    Date
home        10/01/18
home        11/01/18
garden      12/01/18
garden      12/01/18
garden      13/01/18
bathroom    13/01/18
bedroom     14/01/18
bedroom     15/01/18
kitchen     16/01/18
kitchen     16/01/18
kitchen     17/01/18

I would like this to happen: the Date column is dropped and a value column holding the count is inserted in the DataFrame:

tipology   value
home         2
garden       3
bathroom     1
bedroom      2
kitchen      3
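
One way to get a table like this is `size()` followed by `reset_index`, which turns the group labels back into an ordinary column. This is a sketch built on the sample rows above; the column name `value` comes from the question, and `sort=False` is an optional choice that keeps the groups in order of first appearance:

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "tipology": ["home", "home", "garden", "garden", "garden", "bathroom",
                 "bedroom", "bedroom", "kitchen", "kitchen", "kitchen"],
    "Date": ["10/01/18", "11/01/18", "12/01/18", "12/01/18", "13/01/18", "13/01/18",
             "14/01/18", "15/01/18", "16/01/18", "16/01/18", "17/01/18"],
})

# Count rows per tipology, then move the index back into a column named 'value'
counts = (
    df.groupby("tipology", sort=False)
      .size()
      .reset_index(name="value")
)

print(counts)
```

Both `counts["tipology"]` and `counts["value"]` are then plain columns of the same length.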

Then (I'm working with Jupyter Notebook), keeping the Date column and adding the corresponding values to the value column based on their grouping:

  tipology       Date     value
   home        10/01/18     1
   home        11/01/18     1
   garden      12/01/18     2
   garden      12/01/18_____.
   garden      13/01/18     1
   bathroom    13/01/18     1
   bedroom     14/01/18     1
   bedroom     15/01/18     1
   kitchen     16/01/18     2
   kitchen     16/01/18_____.
   kitchen     17/01/18     1
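
A possible sketch for this second table, assuming the value is meant to be the number of rows sharing the same tipology and Date: `transform("count")` writes the group count back onto every original row, so repeated rows each carry the shared count rather than being left blank. The alternative `per_pair` frame collapses the repeats into one row per pair instead.

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "tipology": ["home", "home", "garden", "garden", "garden", "bathroom",
                 "bedroom", "bedroom", "kitchen", "kitchen", "kitchen"],
    "Date": ["10/01/18", "11/01/18", "12/01/18", "12/01/18", "13/01/18", "13/01/18",
             "14/01/18", "15/01/18", "16/01/18", "16/01/18", "17/01/18"],
})

# Option 1: keep every row and attach the count of its (tipology, Date) pair
df["value"] = df.groupby(["tipology", "Date"])["Date"].transform("count")

# Option 2: one row per (tipology, Date) pair, with the count as a column
per_pair = (
    df.groupby(["tipology", "Date"], sort=False)
      .size()
      .reset_index(name="value")
)

print(df)
print(per_pair)
```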

I need the columns so I can assign them to the x and y axes of a graph, so none of the columns should be the index!

By default, groupby returns a DataFrame in which the fields you group on become the index. You can change this behaviour by setting as_index=False in the groupby call. Then tipology will still be a column in the DataFrame that is returned:

tipol1 = df.groupby('tipology', as_index=False).nunique()
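
To connect this to the bar chart: in the attempt in the question, x came from the original `df` and y was a whole DataFrame, so the two did not line up. Taking both axes from the same grouped frame avoids that. The sketch below uses the sample rows from the update and counts rows per group rather than calling `nunique()`; the helper `make_bar_figure` and the column name `value` are hypothetical names chosen for illustration.

```python
import pandas as pd

# Sample rows copied from the question's update
df = pd.DataFrame({
    "tipology": ["home", "home", "garden", "garden", "garden", "bathroom",
                 "bedroom", "bedroom", "kitchen", "kitchen", "kitchen"],
    "Date": ["10/01/18", "11/01/18", "12/01/18", "12/01/18", "13/01/18", "13/01/18",
             "14/01/18", "15/01/18", "16/01/18", "16/01/18", "17/01/18"],
})

# as_index=False keeps 'tipology' as a regular column; the named aggregation
# counts the rows of each group into a column called 'value'
counts = df.groupby("tipology", as_index=False, sort=False).agg(value=("Date", "count"))


def make_bar_figure(counts):
    """Build a plotly bar chart from the grouped frame (hypothetical helper)."""
    # Imported here so the data preparation above runs even without plotly installed
    import plotly.graph_objects as go

    # x and y come from the same frame, so they have the same length
    return go.Figure(data=[
        go.Bar(name="test", x=counts["tipology"], y=counts["value"])
    ])
```

In a notebook, `make_bar_figure(counts).show()` should then render one bar per tipology.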

