
Get values from pandas.core.groupby.generic.DataFrameGroupBy object

Hi, I have a dataframe like this, which has 71 unique values in the time column, 721 unique values in the latitude column, and 1440 unique values in the longitude column; all the values in the temp column are unique.

Dataframe sample:

  time        latitude  longitude       temp
1950-01-01      90.0     0.00         49654.792969
1950-01-01      90.0     0.25         49654.792969
   .              .       .                .
   .              .       .                .
73715040 rows x 4 columns

Now I want to group by the latitude and longitude columns to get all the temp values across the whole time period for every grid cell (latitude/longitude pair), which should give 1038240 rows (721 lat * 1440 lon), so I'm doing this:

df = df.groupby(['latitude', 'longitude'])

Since the result is a pandas.core.groupby.generic.DataFrameGroupBy object, I'm not able to access the values from it. I tried converting it to a dataframe with df.apply(pd.DataFrame), but this takes a lot of time and my kernel crashes. Is there another way to get the records, or am I doing something wrong here? Please suggest an alternative if possible.

A pandas.core.groupby.generic.DataFrameGroupBy object is not a dataframe but an iterable: looping over it yields tuples whose first element is the group key and whose second element is the dataframe for that group.

See the example below:

Creating the test dataframe

import pandas as pd

df = pd.DataFrame({"ColA": [1,1,1,2,2,3,3,3], "ColB": [5,5,6,7,7,8,8,9], "ColC": [1,2,3,4,5,6,7,8]})

The test dataframe

>>> df
   ColA  ColB  ColC
0     1     5     1
1     1     5     2
2     1     6     3
3     2     7     4
4     2     7     5
5     3     8     6
6     3     8     7
7     3     9     8

Grouping the dataframe

>>> groups = df.groupby(["ColA", "ColB"])

>>> type(groups)
<class 'pandas.core.groupby.generic.DataFrameGroupBy'>

Showing the results

>>> for group in groups:
...     g, value = group
...     print(f"Key = {g}")
...     print(value)
...     print(80*"-")
...
Key = (1, 5)
   ColA  ColB  ColC
0     1     5     1
1     1     5     2
--------------------------------------------------------------------------------
Key = (1, 6)
   ColA  ColB  ColC
2     1     6     3
--------------------------------------------------------------------------------
Key = (2, 7)
   ColA  ColB  ColC
3     2     7     4
4     2     7     5
--------------------------------------------------------------------------------
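Key = (3, 8)
   ColA  ColB  ColC
5     3     8     6
6     3     8     7
--------------------------------------------------------------------------------
Key = (3, 9)
   ColA  ColB  ColC
7     3     9     8
--------------------------------------------------------------------------------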
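
If only one group is of interest, looping over every group is not needed. The following is a minimal sketch that reuses the test dataframe from this answer; the variable names n_groups, one_group and colc_values are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"ColA": [1,1,1,2,2,3,3,3], "ColB": [5,5,6,7,7,8,8,9], "ColC": [1,2,3,4,5,6,7,8]})
groups = df.groupby(["ColA", "ColB"])

# Number of distinct (ColA, ColB) pairs found in the data
n_groups = groups.ngroups

# Sub-dataframe for a single key; the key is a tuple because two columns were grouped
one_group = groups.get_group((3, 8))

# The ColC values of that one group as a plain Python list
colc_values = one_group["ColC"].tolist()

print(n_groups)
print(colc_values)
```

For the dataframe in the question, the key would be a (latitude, longitude) pair such as (90.0, 0.25).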

IMPORTANT

As commented by @HenriChab, applying aggregate or a reduction such as sum returns a dataframe, not a groupby object.

>>> new_df = df.groupby(["ColA", "ColB"]).sum()
>>> new_df
           ColC
ColA ColB
1    5        3
     6        3
2    7        9
3    8       13
     9        8

Finally, you can reset the index.

>>> new_df.reset_index(inplace=True)

>>> new_df
   ColA  ColB  ColC
0     1     5     3
1     1     6     3
2     2     7     9
3     3     8    13
4     3     9     8
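
For the layout asked about in the question (one row per latitude/longitude pair, holding every temp value over time), a reshape may be enough without materialising each group. This is only a sketch on tiny made-up data, and it assumes each (latitude, longitude, time) combination occurs exactly once; the name wide is illustrative.

```python
import pandas as pd

# Tiny made-up stand-in for the real data: 2 times x 2 latitudes x 2 longitudes
df = pd.DataFrame({
    "time": ["1950-01-01"] * 4 + ["1950-02-01"] * 4,
    "latitude": [90.0, 90.0, 89.75, 89.75] * 2,
    "longitude": [0.0, 0.25, 0.0, 0.25] * 2,
    "temp": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
})

# One row per (latitude, longitude) pair, one column per time step.
# unstack raises an error if a (latitude, longitude, time) combination repeats.
wide = df.set_index(["latitude", "longitude", "time"])["temp"].unstack("time")

print(wide.shape)
print(wide.loc[(90.0, 0.25)].tolist())
```

On the full data this would give 1038240 rows and 71 columns, so memory use should be checked before running it.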

This should work for you:

df.groupby(['latitude', 'longitude']).aggregate(lambda x: ','.join(map(str, x)))
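
The line above joins the values into comma-separated strings. If the numeric values should stay numeric, a variant that collects them into lists could look like the sketch below; the sample data is made up and the name temps is illustrative.

```python
import pandas as pd

# Made-up sample: two grid cells, two time steps each
df = pd.DataFrame({
    "time": ["1950-01-01", "1950-01-01", "1950-02-01", "1950-02-01"],
    "latitude": [90.0, 90.0, 90.0, 90.0],
    "longitude": [0.0, 0.25, 0.0, 0.25],
    "temp": [1.5, 2.5, 3.5, 4.5],
})

# Select only the temp column, then gather each group's values into a list
temps = df.groupby(["latitude", "longitude"])["temp"].agg(list).reset_index()

print(temps)
```

Selecting the temp column before aggregating avoids also building lists for the time column.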
