
pandas: how to divide two DataFrames to get a ratio

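df1 (named df in the code below) is the reference and df2 is the target. The TYPE column in df2 should be ignored, with its values left unchanged. Because df2 has the TYPE column, I can't divide the two frames directly. How can I compare the two DataFrames and get the ratio against the reference DataFrame? I have to keep df2's structure and only replace the 'sum' column with the ratio values.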

python

 import numpy as np
 import pandas as pd
 df_data = {}
 df_data['ID'] = [100001,100002,100003,100004]
 df_data['ID2'] = ['A','B','C','D']
 df_data['sum'] = [7,8,4,5]
 df = pd.DataFrame(df_data)
 print(df)

 df_data2= {}
 df_data2['ID'] = [100001,100002,100002,100003,100003,100001,100002]
 df_data2['ID2'] = ['G','H','Q','J','H','A','B']
 df_data2['TYPE'] = ['A','A','B','A','B','C','E']
 df_data2['sum'] = [14,4,4,2,8,100,10]
 df2 = pd.DataFrame(df_data2)
 print(df2)

 # My attempt: I get the ratio values, but df2's structure is lost and the TYPE column disappears.
 df2.set_index(['ID','ID2'])['sum'] / df.set_index(['ID','ID2'])['sum']
# print(df) output
       ID ID2  sum
0  100001   A    7
1  100002   B    8
2  100003   C    4
3  100004   D    5
# print(df2) output
       ID ID2 TYPE  sum
0  100001   G    A   14
1  100002   H    A    4
2  100002   Q    B    4
3  100003   J    A    2
4  100003   H    B    8
5  100001   A    C  100
6  100002   B    E   10

# my goal
       ID ID2 TYPE  sum
0  100001   G    A   N/A  # There is no value ( ID:100001 ID2:G)
1  100002   H    A   N/A  # There is no value ( ID:100002 ID2:H)
2  100002   Q    B   N/A  # There is no value ( ID:100002 ID2:Q)
3  100003   J    A   N/A
4  100003   H    B   N/A
5  100001   A    C   14.285714  # There is value ( ID:100001 ID2:A): 100 / 7
6  100002   B    E   1.25       # There is value ( ID:100002 ID2:B): 10 / 8

# my attempt (result of the index-aligned division)
ID      ID2
100001  A      14.285714
        G            NaN
100002  B       1.250000
        H            NaN
        Q            NaN
100003  C            NaN
        H            NaN
        J            NaN
100004  D            NaN

You can do this with merge:

df2['sum'] = (df2.merge(df, on=['ID','ID2'],
                        how='left')
                 .assign(sum=lambda x: x.sum_x/x.sum_y)
                 ['sum']
             )

Output:

       ID ID2 TYPE        sum
0  100001   G    A        NaN
1  100002   H    A        NaN
2  100002   Q    B        NaN
3  100003   J    A        NaN
4  100003   H    B        NaN
5  100001   A    C  14.285714
6  100002   B    E   1.250000
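
The one-liner above works because df2 has a default 0..n-1 index and each (ID, ID2) pair appears only once in df. If either of those might not hold, a more defensive variant can help. Here is a minimal sketch under those assumptions; the helper name ratio_to_reference and the "_ref" suffix are made up for illustration and are not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [100001, 100002, 100003, 100004],
    "ID2": ["A", "B", "C", "D"],
    "sum": [7, 8, 4, 5],
})
df2 = pd.DataFrame({
    "ID": [100001, 100002, 100002, 100003, 100003, 100001, 100002],
    "ID2": ["G", "H", "Q", "J", "H", "A", "B"],
    "TYPE": ["A", "A", "B", "A", "B", "C", "E"],
    "sum": [14, 4, 4, 2, 8, 100, 10],
})


def ratio_to_reference(target, reference, keys=("ID", "ID2"), value="sum"):
    """Return a copy of target whose `value` column is divided by the
    matching reference value; rows without a match become NaN."""
    keys = list(keys)
    merged = target.merge(
        reference[keys + [value]],
        on=keys,
        how="left",                # keep every row of target, in order
        suffixes=("", "_ref"),     # reference column becomes e.g. "sum_ref"
        validate="many_to_one",    # raise if reference keys are duplicated
    )
    out = target.copy()            # leave the caller's frame untouched
    # to_numpy() assigns by position, so a non-default index on target is fine
    out[value] = (merged[value] / merged[value + "_ref"]).to_numpy()
    return out


result = ratio_to_reference(df2, df)
print(result)
```

With validate="many_to_one", pandas raises a MergeError instead of silently adding rows if the reference frame ever contains a duplicated (ID, ID2) pair, which would otherwise make the assignment back to df2 misalign.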

