
Divide the array in one column by another column from a different dataframe in pandas

I have two dataframes, df_1 and df_2:

df_1:
 id  number  array_col
001    0     [0.084, 0.089, 0.047 ...]
002    0     [0.052, 0.036, 0.062 ...]
003    0     [0.087, 0.087, 0.051 ...]
.      .
.      .
100    0     [0.098, 0.089, 0.067 ...]

100 x 3

df_2:
 id  number  array_col
001    1     [0.012, 0.023, 0.034 ...]
001    2     [0.045, 0.056, 0.067 ...]
002    1     [0.078, 0.089, 0.091 ...]
002    2     [0.021, 0.032, 0.043 ...]
.      .
.      .
100    2     [0.054, 0.065, 0.076 ...]

200 x 3

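My goal is to update array_col in df_2 for each unique id by dividing it by the array_col from df_1 that has the same id. I tried the following, but it doesn't seem to work: the column in df_2 is not updated.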

for unique_id in df_2['id'].unique():
    for unique_number in [1, 2]:
        df_2.loc[(df_2['id'] == unique_id) &
                 (df_2['number'] == unique_number)]['array_col'].values[0] =\
            df_2.loc[(df_2['id'] == unique_id) &
                     (df_2['number'] == unique_number)]['array_col'].values[0] /\
            df_1.loc[(df_1['id'] == unique_id) &
                     (df_1['number'] == 0)]['array_col'].values[0]

Any help would be greatly appreciated. Let me know if you need any further information.

You can use apply() to call numpy.divide() on each row's array_col and the matching array_col from df_1:

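import numpy as np

# Index df_1 by id so each row of df_2 can look up its divisor directly.
df_1 = df_1.set_index('id')

# Divide each array in df_2 element-wise by the df_1 array with the same id.
df_2.array_col = df_2.apply(lambda row:
    np.divide(row.array_col, df_1.loc[row.id, 'array_col']),
    axis=1)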

#     id  number                                          array_col
# 0  001       1  [0.14285714285714285, 0.25842696629213485, 0.7...
# 1  001       2  [0.5357142857142857, 0.6292134831460675, 1.425...
# 2  002       1       [1.5, 2.4722222222222223, 1.467741935483871]
# 3  002       2  [0.4038461538461539, 0.888888888888889, 0.6935...
# 4  100       2  [0.5510204081632653, 0.7303370786516854, 1.134...

Sample data for reference:

import pandas as pd

df_1 = pd.DataFrame({
    'id': ['001', '002', '003', '100'],
    'number': [0, 0, 0, 0],
    'array_col': [[0.084, 0.089, 0.047] * 200, [0.052, 0.036, 0.062] * 200,
                  [0.087, 0.087, 0.051] * 200, [0.098, 0.089, 0.067] * 200],
})
df_2 = pd.DataFrame({
    'id': ['001', '001', '002', '002', '100'],
    'number': [1, 2, 1, 2, 2],
    'array_col': [[0.012, 0.023, 0.034] * 200, [0.045, 0.056, 0.067] * 200,
                  [0.078, 0.089, 0.091] * 200, [0.021, 0.032, 0.043] * 200,
                  [0.054, 0.065, 0.076] * 200],
})
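
As for why the loop in the question leaves df_2 unchanged: as far as I can tell, df_2.loc[...] with a boolean mask returns a new object, so the chained ['array_col'].values[0] = ... assignment writes into a temporary copy rather than into df_2. A minimal sketch that avoids chained indexing by building the new column in one pass and assigning it once is shown here. The small frames and the name lookup are illustrative assumptions, not part of the original post.

```python
import numpy as np
import pandas as pd

# Shortened stand-ins for df_1 and df_2 (three values per array).
df_1 = pd.DataFrame({'id': ['001', '002'], 'number': [0, 0],
                     'array_col': [[0.084, 0.089, 0.047],
                                   [0.052, 0.036, 0.062]]})
df_2 = pd.DataFrame({'id': ['001', '001', '002'], 'number': [1, 2, 1],
                     'array_col': [[0.012, 0.023, 0.034],
                                   [0.045, 0.056, 0.067],
                                   [0.078, 0.089, 0.091]]})

# Hypothetical helper mapping: id -> divisor array from df_1.
lookup = dict(zip(df_1['id'], df_1['array_col']))

# Build every quotient first, then assign the whole column in one statement,
# so the write goes to df_2 itself and not to a temporary copy.
df_2['array_col'] = [np.asarray(arr) / np.asarray(lookup[i])
                     for i, arr in zip(df_2['id'], df_2['array_col'])]
```

This version raises a KeyError if an id in df_2 has no counterpart in df_1, which may or may not be the behaviour you want.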

Check whether this works:

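import numpy as np

# df_1 has one row per id, so no groupby is needed; just rename the column before merging.
# (Assumes id is still a regular column of df_1, not its index.)
gp_df1 = df_1[['id', 'array_col']].rename(columns={'array_col': 'array_col_1'})

df_2 = df_2.merge(gp_df1, on='id', how='left')
# Series.div cannot divide Python lists, so divide each pair of arrays with NumPy.
df_2['new_array_col'] = [np.divide(a, b) for a, b in zip(df_2['array_col'], df_2['array_col_1'])]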
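
If every array has the same length, the division can also be done in one vectorized step by stacking the arrays into 2-D NumPy arrays. The following is only a sketch under that assumption, and it also assumes every id in df_2 exists in df_1; the shortened frames and the names lookup, numerators, denominators and ratio are made up for illustration.

```python
import numpy as np
import pandas as pd

# Shortened stand-ins for df_1 and df_2 (three values per array).
df_1 = pd.DataFrame({'id': ['001', '002'], 'number': [0, 0],
                     'array_col': [[0.084, 0.089, 0.047],
                                   [0.052, 0.036, 0.062]]})
df_2 = pd.DataFrame({'id': ['001', '001', '002'], 'number': [1, 2, 1],
                     'array_col': [[0.012, 0.023, 0.034],
                                   [0.045, 0.056, 0.067],
                                   [0.078, 0.089, 0.091]]})

# Series of divisor arrays, indexed by id.
lookup = df_1.set_index('id')['array_col']

# Stack the per-row arrays into (n_rows, n_values) matrices.
numerators = np.array(df_2['array_col'].tolist())
denominators = np.array(df_2['id'].map(lookup).tolist())

# One element-wise division over the whole matrix, then back to one array per row.
df_2['ratio'] = list(numerators / denominators)
```

For long arrays this avoids a Python-level call per row, at the cost of holding two dense matrices in memory.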

