
How to build a pandas dataframe in a recursive function?

I am trying to implement the "Bottom-Up Computation" (BUC) algorithm from data mining ( https://www.aaai.org/Papers/FLAIRS/2003/Flairs03-050.pdf ).

I need to use the 'pandas' library to create a dataframe and pass it to a recursive function, which should also return a dataframe as output. So far I can only print the values of the last column, because I can't figure out how to build the dataframe dynamically.

Here is the Python program:

import pandas as pd

def project_data(df, d):
    return df.iloc[:, d]

def select_data(df, d, val):
    col_name = df.columns[d]
    return df[df[col_name] == val]

def remove_first_dim(df):
    return df.iloc[:, 1:]

def slice_data_dim0(df, v):
    df_temp = select_data(df, 0, v)
    return remove_first_dim(df_temp)

def buc(df):
    dims = df.shape[1]
    if dims == 1:
        input_sum = sum(project_data(df, 0))
        print(input_sum)
    else:
        dim_vals = set(project_data(df, 0).values)

        for dim_val in dim_vals:
            sub_data = slice_data_dim0(df, dim_val)
            buc(sub_data)
        sub_data = remove_first_dim(df)
        buc(sub_data)


data = {'A':[1,1,1,1,2],
        'B':[1,1,2,3,1],
        'M':[10,20,30,40,50]
        }
    
df = pd.DataFrame(data, columns = ['A','B','M'])
buc(df)

I get the following output:

30
30
40
100
50
50
80
30
40

But what I need is a dataframe like this (not necessarily formatted this way, but a dataframe):

    A   B   M
0   1   1   30
1   1   2   30
2   1   3   40
3   1   ALL 100
4   2   1   50
5   2   ALL 50
6   ALL 1   80
7   ALL 2   30
8   ALL 3   40
9   ALL ALL 150

How can I achieve this?
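One way to keep the recursive structure of `buc` is to have each call return a list of result rows instead of printing, and to build the dataframe once at the end. The following is a minimal sketch under stated assumptions, not a definitive implementation: the names `buc_rows` and `buc_frame`, the `measure` and `prefix` parameters, and the `"ALL"` marker are illustrative, and the measure is assumed to be a single numeric column.

```python
import pandas as pd


def buc_rows(df, measure="M", prefix=()):
    """Recursively collect cube rows as tuples (hypothetical helper).

    `prefix` holds the dimension values fixed so far; "ALL" marks a
    dimension that has been aggregated away.
    """
    dims = [c for c in df.columns if c != measure]
    if not dims:
        # No dimensions left: emit one row with the aggregated measure.
        return [prefix + (df[measure].sum(),)]

    first = dims[0]
    rows = []
    # Fix the first dimension to each of its values and recurse.
    for val in sorted(df[first].unique().tolist()):
        sub = df[df[first] == val].drop(columns=first)
        rows += buc_rows(sub, measure, prefix + (val,))
    # Then aggregate over the first dimension entirely.
    rows += buc_rows(df.drop(columns=first), measure, prefix + ("ALL",))
    return rows


def buc_frame(df, measure="M"):
    """Wrap the collected rows in a single dataframe (hypothetical helper)."""
    dims = [c for c in df.columns if c != measure]
    return pd.DataFrame(buc_rows(df, measure), columns=dims + [measure])


data = {"A": [1, 1, 1, 1, 2],
        "B": [1, 1, 2, 3, 1],
        "M": [10, 20, 30, 40, 50]}
df = pd.DataFrame(data, columns=["A", "B", "M"])

result = buc_frame(df)
print(result)
```

For the sample data this produces the ten rows of the table requested above. Collecting plain tuples and constructing the dataframe once tends to be cheaper than concatenating a small dataframe at every level of the recursion.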

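Unfortunately, pandas has no built-in function for computing subtotals, so the trick is to compute them separately and concatenate them with the original dataframe.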

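from itertools import combinations

import numpy as np
import pandas as pd

# df is the dataframe built in the question
dim = ['A', 'B']   # dimension columns
vals = ['M']       # measure columns

df = (
    pd.concat(
        [df]
        # subtotals: one groupby per non-empty proper subset of the dimensions
        + [df.groupby(list(gr), as_index=False)[vals].sum()
           for r in range(len(dim) - 1) for gr in combinations(dim, r + 1)]
        # total: a constant key puts every row into a single group
        + [df.groupby(np.zeros(len(df)))[vals].sum()]
    )
    .sort_values(dim)          # missing dimensions (NaN) sort last
    .reset_index(drop=True)
    .fillna("ALL")             # label the aggregated dimensions
)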

Output:

      A    B    M
0     1    1   10
1     1    1   20
2     1    2   30
3     1    3   40
4     1  ALL  100
5     2    1   50
6     2  ALL   50
7   ALL    1   80
8   ALL    2   30
9   ALL    3   40
10  ALL  ALL  150
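
Note that rows 0 and 1 of this output are still the raw input rows for A=1, B=1, whereas the table requested in the question has a single row with M=30. A possible variation, sketched here under assumptions, also groups by the full set of dimensions so that the base cells are aggregated as well. The names `parts` and `cube` are illustrative, and the cast through `Int64` assumes the dimension columns hold integers.

```python
from itertools import combinations

import pandas as pd

df = pd.DataFrame({"A": [1, 1, 1, 1, 2],
                   "B": [1, 1, 2, 3, 1],
                   "M": [10, 20, 30, 40, 50]})
dims, measure = ["A", "B"], "M"

parts = []
# Group by every non-empty subset of the dimensions, including the full set,
# so duplicate (A, B) combinations are summed too.
for r in range(len(dims), 0, -1):
    for group in combinations(dims, r):
        parts.append(df.groupby(list(group), as_index=False)[measure].sum())
# Grand total: a one-row frame that has no dimension columns.
parts.append(pd.DataFrame({measure: [df[measure].sum()]}))

# Dimensions missing from a part become NaN, which sort_values puts last.
cube = pd.concat(parts, ignore_index=True).sort_values(dims, ignore_index=True)

# Restore integer labels (assumes integer dimensions) and mark NaN as "ALL".
for d in dims:
    cube[d] = cube[d].astype("Int64").astype(object).where(cube[d].notna(), "ALL")

print(cube)
```

Sorting before the NaN values are replaced avoids comparing integers with the string "ALL", which would otherwise raise a TypeError.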
