
Pandas - max from DataFrame groupby returns NaN

I have a problem with my DataFrame. I want to get the max values from one column of a grouped DataFrame, but I only get NaNs.

My DataFrame:

  kod_ow      kod_sw  ... pr_kierunkowa           infrast_h_bloku
0     06  061/200324  ...               None        0.000000
1     06  061/200324  ...               None        0.000000
2     06  061/200324  ...               None      209.365495
3     06  061/200324  ...               None        0.000000
4     06  061/200324  ...               None        0.000000
5     06  061/200324  ...               None      209.365495

[6 rows x 8 columns]

I've tried:

df['new'] = df.groupby(by=['kod_ow', 'kod_sw', 'nr_ks', 'nr_ks_pr', 'nazwa_zabiegu_icd_9', 'nazwa_zabiegu','pr_kierunkowa'])['infrast_h_bloku'].transform('max')

My result is:

  kod_ow      kod_sw  nr_ks  ... infrast_h_bloku osobodzien new
0     06  061/200324   3193  ...        0.000000        0.0 NaN
1     06  061/200324   3193  ...        0.000000        0.0 NaN
2     06  061/200324   3193  ...      209.365495        0.0 NaN
3     06  061/200324  54809  ...        0.000000        0.0 NaN
4     06  061/200324  54809  ...        0.000000        0.0 NaN
5     06  061/200324  54809  ...      209.365495        0.0 NaN

The question is: why does max put NaN in the new column instead of the real result?

Can someone help me figure out what I've done wrong?

Here is a similar example that selects, for each id, the row with the maximum col2.

# Import pandas
import pandas as pd

# Column data as a dictionary of lists
# (named "data" so the built-in dict is not shadowed)
data = {'id': [1, 1, 2, 2],
        'col1': [21, 40, 81, 98],
        'col2': [30, 20, 80, 91],
        'col3': [90, 10, 41, 99]}

# Create a DataFrame from the dictionary
df = pd.DataFrame(data)

# Keep the rows where col2 equals the maximum col2 within its id group
df[df.groupby(by=['id'])['col2'].transform('max') == df['col2']]

So in your case you can use it like this:

df[df.groupby(by=['kod_ow', 'kod_sw', 'nr_ks', 'nr_ks_pr', 'nazwa_zabiegu_icd_9', 'nazwa_zabiegu', 'pr_kierunkowa'])['infrast_h_bloku'].transform('max') == df['infrast_h_bloku']]

You assigned to df["new"]. Since "new" is not an existing column, a new column is created.
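
A possible cause of the NaNs that is worth checking: in the printed DataFrame, pr_kierunkowa is None in every row, and groupby by default (dropna=True) leaves out rows whose group key is null, so transform has no group to fill those rows from. The sketch below rebuilds a reduced version of the question's data (only some of the columns, values copied from the printout) and compares the default behaviour with dropna=False, which is available in pandas 1.1 and later. That the None keys are the cause in the original data is an assumption based on the output shown, not something confirmed in the thread.

```python
import pandas as pd

# Reduced stand-in for the question's DataFrame (assumption: only these
# columns matter for the effect; values are taken from the printed output)
df = pd.DataFrame({
    "kod_sw": ["061/200324"] * 6,
    "nr_ks": [3193, 3193, 3193, 54809, 54809, 54809],
    "pr_kierunkowa": [None] * 6,
    "infrast_h_bloku": [0.0, 0.0, 209.365495, 0.0, 0.0, 209.365495],
})

keys = ["kod_sw", "nr_ks", "pr_kierunkowa"]

# Default dropna=True: rows with a null key belong to no group
default_max = df.groupby(keys)["infrast_h_bloku"].transform("max")

# dropna=False: null keys form their own groups
keep_na_max = df.groupby(keys, dropna=False)["infrast_h_bloku"].transform("max")

df["new"] = keep_na_max
print(default_max.isna().all())
print(df)
```

If this is what is happening, the row filter in the answer above is affected as well, because comparing a NaN group maximum with the column value is False for every row. Dropping pr_kierunkowa from the key list, or filling it with a placeholder before grouping, are alternatives when dropna=False is not available.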
