How to multiply selected columns from different pandas dataframes

I have 3 pandas dataframes (similar to the ones below). I also have 2 lists: list ID_1 = ['sdf', 'sdfsdf', ...] and list ID_2 = ['kjdf', 'kldfjs', ...] (called list1 and list2 below).

Table1:
    ID_1    ID_2    Value
0   PUFPaY9 NdYWqAJ 0.002
1   Iu6AxdB qANhGcw 0.01
2   auESFwW jUEUNdw 0.2345
3   LWbYpca G3uZ_Rg 0.0835
4   8fApIAM mVHrayg 0.0295

Table2:
     ID_1    weight1 weight2 .....weightN
0   PUFPaY9     
1   Iu6AxdB     
2   auESFwW 
3   LWbYpca     

Table3:
    ID_2    weight1 weight2 .....weightN
0   PUFPaY9     
1   Iu6AxdB     
2   auESFwW     
3   LWbYpca     

I want a single dataframe, computed like this:

for each x ID_1 in list1:
    for each y ID_2 in list2:
        if x-y exist in Table1:
            temp_row = ( x[weights[i]].* y[weights[i]])
            # here i want one to one multiplication, x[weight1]*y[weight1] , x[weight2]*y[weight2]
            temp_row.append(value[x-y] in Table1)
            new_dataframe.append(temp_row)

return new_dataframe

The required new_dataframe should look like Table4:

Table4:
        weight1 weight2 weight3 .....weightN value
    0           
    1           
    2       
    3       

What I am able to do now is:

new_df = df[(df.ID_1.isin(list1)) & (df.ID_2.isin(list2))]

Using this I get all valid ID_1 and ID_2 combinations and their values. But I have no idea how to get the multiplication of the weights from both dataframes (without looping over each weight[i]).

Now the task is easier: I can iterate over new_df and, for each row, look up weight[1 to N] for ID_1 in Table2 and weight[1 to N] for ID_2 in Table3. Then I can append their one-to-one product, together with "value" from Table1, to a new FINAL_DF. But I don't want to loop. Can this be solved in a smarter way?

Is this what you want?

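import io

import numpy as np
import pandas as pd

# Sample ID_1 values; 'aaaaaaa' has no counterpart among the ID_2 values
data = """\
ID_1
PUFPaY9
aaaaaaa
Iu6AxdB
auESFwW
LWbYpca
"""
# sep=r'\s+' replaces delim_whitespace=True, which recent pandas deprecates
id1 = pd.read_csv(io.StringIO(data), sep=r'\s+')

# Sample ID_2 values; 'xxxxxxx' has no counterpart among the ID_1 values
data = """\
ID_2
PUFPaY9
Iu6AxdB
xxxxxxx
auESFwW
LWbYpca
"""
id2 = pd.read_csv(io.StringIO(data), sep=r'\s+')

# Fill weight1..weight4 with random integers in both frames
cols = ['weight{}'.format(i) for i in range(1, 5)]
for c in cols:
    id1[c] = np.random.randint(1, 10, len(id1))
    id2[c] = np.random.randint(1, 10, len(id2))

# Use the IDs as the index so that multiplication aligns rows by ID
id1.set_index('ID_1', inplace=True)
id2.set_index('ID_2', inplace=True)

# Element-wise product, aligned on index labels and column names
df_mul = id1 * id2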
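
Because the product aligns on index labels, any ID that appears in only one frame produces a row of NaN. A minimal sketch of one way to avoid that, assuming it is acceptable to keep only the IDs present in both frames. The frames a and b and their values are made up for illustration and are not part of the original data:

```python
import pandas as pd

# Hypothetical miniature frames; IDs and weights are made up for illustration.
a = pd.DataFrame({"weight1": [2, 3, 4], "weight2": [5, 6, 7]},
                 index=pd.Index(["A", "B", "C"], name="ID_1"))
b = pd.DataFrame({"weight1": [10, 20, 30], "weight2": [1, 2, 3]},
                 index=pd.Index(["B", "C", "D"], name="ID_2"))

# Plain multiplication aligns on index labels; "A" and "D" have no partner,
# so their rows come out as NaN.
product = a * b

# Restrict both frames to the shared labels first to avoid the NaN rows.
common = a.index.intersection(b.index)
product_common = a.loc[common] * b.loc[common]
print(product_common)
```

Calling dropna() on the full product would give the same rows here, at the cost of first computing the unmatched ones.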

Step by step:

In [215]: id1
Out[215]:
         weight1  weight2  weight3  weight4
ID_1
PUFPaY9        8        9        1        1
aaaaaaa        6        1        9        2
Iu6AxdB        8        4        8        5
auESFwW        9        3        4        2
LWbYpca        7        7        1        8

In [216]: id2
Out[216]:
         weight1  weight2  weight3  weight4
ID_2
PUFPaY9        6        5        5        1
Iu6AxdB        1        5        4        5
xxxxxxx        1        2        6        4
auESFwW        3        9        5        5
LWbYpca        3        3        6        7

In [217]: id1 * id2
Out[217]:
         weight1  weight2  weight3  weight4
Iu6AxdB      8.0     20.0     32.0     25.0
LWbYpca     21.0     21.0      6.0     56.0
PUFPaY9     48.0     45.0      5.0      1.0
aaaaaaa      NaN      NaN      NaN      NaN
auESFwW     27.0     27.0     20.0     10.0
xxxxxxx      NaN      NaN      NaN      NaN
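
The product above pairs rows whose ID_1 and ID_2 labels are identical. If the pairs should instead come from Table1, as the pseudocode in the question suggests, one possible approach is to merge the weights onto the filtered pairs and multiply the matching columns. This is only a sketch under assumptions: table1, table2, table3, list1 and list2 are hypothetical stand-ins with made-up values, and the weight columns are assumed to have the same names in Table2 and Table3.

```python
import pandas as pd

# Hypothetical stand-ins for Table1, Table2 and Table3; all values are made up.
table1 = pd.DataFrame({
    "ID_1": ["a1", "a2", "a3"],
    "ID_2": ["b1", "b2", "b9"],
    "Value": [0.5, 0.25, 0.1],
})
table2 = pd.DataFrame({"ID_1": ["a1", "a2", "a3"],
                       "weight1": [1, 2, 3], "weight2": [4, 5, 6]})
table3 = pd.DataFrame({"ID_2": ["b1", "b2", "b3"],
                       "weight1": [10, 20, 30], "weight2": [7, 8, 9]})
list1 = ["a1", "a2"]          # ID_1 values of interest
list2 = ["b1", "b2", "b3"]    # ID_2 values of interest

# Keep only the (ID_1, ID_2) pairs of Table1 that appear in both lists.
pairs = table1[table1.ID_1.isin(list1) & table1.ID_2.isin(list2)]

# Attach the weights of each side; inner joins drop pairs without weights.
# After the second merge, "_x" columns come from table2 and "_y" from table3.
merged = (pairs
          .merge(table2, on="ID_1")
          .merge(table3, on="ID_2", suffixes=("_x", "_y")))

weight_cols = [c for c in table2.columns if c.startswith("weight")]

# Multiply the matching weight columns as arrays, one row per pair.
final_df = pd.DataFrame(
    merged[[c + "_x" for c in weight_cols]].to_numpy()
    * merged[[c + "_y" for c in weight_cols]].to_numpy(),
    columns=weight_cols,
)
final_df["value"] = merged["Value"].to_numpy()
print(final_df)
```

The multiplication runs over whole arrays, so there is no Python-level loop over rows or over individual weight columns.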
