
Remove empty dataframe from pandas.core.groupby.generic.DataFrameGroupBy

How can I delete empty DataFrames from a pandas.core.groupby.generic.DataFrameGroupBy object?

My aggregation code:

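import numpy as np
import pandas as pd

cols = ["col1", "col2", "col3", "col4"]
# collectData: an iterable of DataFrames (defined elsewhere)
joined = pd.concat(df.reset_index() for df in collectData)
# Replace NaN and 0 with 1, then replace negative values with 1
joined = joined.replace({np.nan: 1, 0: 1})
joined[cols] = joined[cols].mask(joined[cols] < 0, 1)

# Group the rows into calendar-day bins on the 'sensor' timestamp index
df = joined.set_index('sensor').groupby(pd.Grouper(freq='D'))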

Data after grouping:

list(df)

[(Timestamp('2020-02-04 00:00:00+0000', tz='UTC', freq='D'),
                                 col1       col2      col3    col4  
  sensor                                                                   
  2020-02-04 00:00:00+00:00    2.586569   0.015321  0.000149    0.884470   
  2020-02-04 00:00:00+00:00    4.429571   4.049798  1.820845    2.882445   
  2020-02-04 00:00:00+00:00   12.883314   6.900607  1.002138    3.613021    
  ...                               ...        ...       ...         ...    
  2020-02-04 23:45:00+00:00    3.798017   1.605979  0.176515    2.400820   
  2020-02-04 23:45:00+00:00    5.546771   2.232437  0.233292    3.750547   
  2020-02-04 23:45:00+00:00    4.910360   3.730932  0.985459    1.238469       
  
  [48945 rows x 4 columns]),
 (Timestamp('2020-02-05 00:00:00+0000', tz='UTC', freq='D'),
  Empty DataFrame
  Columns: [col1, col2, col3, col4]
  Index: []),
 (Timestamp('2020-02-06 00:00:00+0000', tz='UTC', freq='D'),
  Empty DataFrame
  Columns: [col1, col2, col3, col4]
  Index: []),
 (Timestamp('2020-02-07 00:00:00+0000', tz='UTC', freq='D'),
                                 col1       col2      col3    col4  
  sensor                                                                   
  2020-02-07 00:00:00+00:00   17.065174   3.065422  0.171053    9.048574   
  2020-02-07 00:00:00+00:00   30.181997  20.651204  4.413567   15.200674   
  2020-02-07 00:00:00+00:00    1.864378   1.726365  0.819459    1.441588   
  ...                               ...        ...       ...         ...   
  2020-02-07 23:45:00+00:00   39.644320   0.234830  0.002289   13.642480   
  2020-02-07 23:45:00+00:00   30.778517  10.540318  0.944788   13.165241   
  2020-02-07 23:45:00+00:00   34.610439  25.342142  6.184292   22.725937      
  
  [50112 rows x 4 columns]),]

Size of each group, from df.size():

sensor
2020-02-02 00:00:00+00:00    47574
2020-02-03 00:00:00+00:00    49353
2020-02-04 00:00:00+00:00    48945
2020-02-05 00:00:00+00:00        0
2020-02-06 00:00:00+00:00        0
                             ...  
2020-09-26 00:00:00+00:00    83680
2020-09-27 00:00:00+00:00    84293
2020-09-28 00:00:00+00:00    84873
2020-09-29 00:00:00+00:00    84306
2020-09-30 00:00:00+00:00    84875
Freq: D, Length: 242, dtype: int64

I need to remove the empty DataFrames before applying std = df.apply(gstd). I don't know in advance where the empty DataFrames are. The approaches in https://stackoverflow.com/a/51052536/14338086 and https://stackoverflow.com/a/16916611/14338086 both raise errors. Using df.filter(lambda x: x.size() != 0) also fails, with TypeError: 'numpy.int64' object is not callable, and dropna() is not available on the groupby object.
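A possible reading of the TypeError above: DataFrame.size is an attribute holding an integer, not a method, so x.size() tries to call a number. The sketch below uses a small hypothetical frame (`toy`, with made-up values) to show how pd.Grouper(freq="D") creates a bin for every calendar day in the range, including days with no rows, and one way to skip those bins by checking g.empty while iterating. It is an illustration under these assumptions, not the only way to do it.

```python
import pandas as pd

# Hypothetical toy data: two readings on 2020-02-04, one on 2020-02-07,
# and nothing on the two days in between.
idx = pd.DatetimeIndex(
    ["2020-02-04 00:00", "2020-02-04 23:45", "2020-02-07 00:00"],
    tz="UTC",
    name="sensor",
)
toy = pd.DataFrame({"col1": [2.5, 3.8, 17.0]}, index=idx)

grouped = toy.groupby(pd.Grouper(freq="D"))

# One bin per calendar day between the first and last timestamp,
# so the two days without readings show up with size 0.
sizes = grouped.size()

# g.empty (or len(g) == 0) is a plain attribute check, no call on .size needed.
non_empty = {day: g for day, g in grouped if not g.empty}
```

Here `non_empty` maps each day that has data to its DataFrame, which can then be processed group by group.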

I solved the problem with the following code; maybe it helps someone.

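import numpy as np
import pandas as pd
# Assumption: gmean and gstd are the scipy.stats functions
from scipy.stats import gmean, gstd

cols = ["col1", "col2", "col3", "col4"]

joined = pd.concat(df.reset_index() for df in collectData)
joined = joined.replace({np.nan: 1, 0: 1})
joined[cols] = joined[cols].mask(joined[cols] < 0, 1)

df = joined.set_index('sensor').groupby(pd.Grouper(freq='D'))
# Concatenate the per-day groups back into one frame; empty groups contribute no rows
dff = pd.concat(map(lambda x: x[1], df))
# Group by the day of each timestamp, so only days that have data form a group
means = dff.groupby(dff.index.floor('D')).agg(gmean)
std = dff.groupby(dff.index.floor('D')).agg(gstd)

df_result = pd.merge(left=means, right=std, how='left', on='sensor')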
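The key step in this answer appears to be grouping by dff.index.floor(...) rather than by pd.Grouper: grouping on the floored timestamps builds groups only from values that actually occur, so no empty days are created. A minimal sketch of that idea on hypothetical data follows. The frame `toy` and the helper `geo_mean` are made up for illustration; `geo_mean` is a NumPy stand-in so the example has no SciPy dependency, and scipy.stats.gmean could presumably be passed to agg in its place.

```python
import numpy as np
import pandas as pd


def geo_mean(values):
    """Geometric mean of positive values (hypothetical stand-in for scipy.stats.gmean)."""
    return float(np.exp(np.log(values).mean()))


# Hypothetical toy data with a two-day gap between 2020-02-04 and 2020-02-07.
idx = pd.DatetimeIndex(
    ["2020-02-04 00:00", "2020-02-04 23:45", "2020-02-07 00:00"],
    tz="UTC",
    name="sensor",
)
toy = pd.DataFrame({"col1": [2.0, 8.0, 5.0]}, index=idx)

# Flooring each timestamp to its day and grouping on the result
# produces one group per day that has at least one row.
daily = toy.groupby(toy.index.floor("D")).agg(geo_mean)
```

With this toy input, `daily` has one row for 2020-02-04 and one for 2020-02-07, and no rows for the days in between.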
