
Multiply dataframe by specific value in another dataframe?

I have two data frames with a similar shape to:

df1 = pd.DataFrame([[3.2,5.8,46],[3.5,4.4,50],[5.4,6.7,40]], index = ['sample1','sample2','sample3'], columns = ['L1','L2','L3'])


          L1   L2  L3
sample1  3.2  5.8  46
sample2  3.5  4.4  50
sample3  5.4  6.7  40


df2 = pd.DataFrame([[0.02,0.03,0.04,0.05,0.06],[0.2, 0.3, 0.4, 0.5, 0.7],[2, 3, 4, 5, 7]])


      0     1     2     3     4
0  0.02  0.03  0.04  0.05  0.06
1  0.20  0.30  0.40  0.50  0.70
2  2.00  3.00  4.00  5.00  7.00


I would like to multiply the first row of df2 by the L1 value for sample1 in df1 (3.2), the second row of df2 by the L2 value for sample1 (5.8), and the third row of df2 by the L3 value for sample1 (46). I then need to repeat this for sample2 (row 1 by the L1 value for sample2, row 2 by the L2 value for sample2, and row 3 by the L3 value for sample2), and so on for each sample; my actual dataset has hundreds of samples. The output should be a new dataframe, either one per sample or one covering all samples. How should I set up the code for this?

Please check the following code

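import pandas as pd

column_list = df1.columns
sample_list = df1.index

# Loop over samples and columns, collecting one scaled df2 row per pair
rows = []
for sample in sample_list:
    for ind, column in enumerate(column_list):
        # Row `ind` of df2 times this sample's value in the matching column
        rows.append(df2.iloc[ind] * df1.loc[sample, column])

# DataFrame.append was removed in pandas 2.0, so build the frame in one step
new_df = pd.DataFrame(rows).reset_index(drop=True)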

Something like this,

sample_lists = {}
for sample, df1_row in df1.iterrows():
    print(f'\nPROCESSING SAMPLE {sample}')
    sample_list = []
    # enumerate gives each value's position directly; list.index(value) would
    # return the first match and pick the wrong df2 row for repeated values
    for index_number, value in enumerate(df1_row.tolist()):
        df2_row = df2.iloc[index_number, :].tolist()
        print(f'Multiplying {df2_row} with {value}')
        int_list = [v * value for v in df2_row]
        sample_list.append(int_list)
    sample_lists[sample] = sample_list
print(f'\nFINAL OUTPUT: {sample_lists}')

Feel free to remove the print statements. You can then use this dict to create a dataframe.
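One possible way to turn that dict into a single dataframe is sketched below. It assumes the dict maps each sample name to a list of equal-length rows, as the loop above produces; the small `sample_lists` literal here is a made-up stand-in so the snippet runs on its own, and `result` is just an illustrative name.

```python
import pandas as pd

# Hypothetical stand-in for the `sample_lists` dict built by the loop,
# shortened to two samples and two rows each for illustration.
sample_lists = {
    'sample1': [[1.0, 2.0], [3.0, 4.0]],
    'sample2': [[5.0, 6.0], [7.0, 8.0]],
}

# Build one small DataFrame per sample. Passing a dict to pd.concat stacks
# them and uses the dict keys as the outer level of a MultiIndex.
result = pd.concat(
    {sample: pd.DataFrame(rows) for sample, rows in sample_lists.items()}
)
print(result)
```

With this layout, `result.loc['sample1']` returns only the rows belonging to that sample.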

Explanation:

  • Loop over the rows of df1
  • Take the current row in df1 and convert it to a list
  • For each value in that list, get its position in the list, so that you can pick the df2 row at the same position in the next step
  • Get the df2 row at that position
  • Multiply that row by the value and append the result to a list
  • Store the list in a dict keyed by the row's index in df1 (sample1, sample2, etc.)

I'm fairly sure you could simplify the code above with apply and a lambda.
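As a rough sketch of a shorter version, the snippet below does not use apply with a lambda. It swaps in `DataFrame.mul` with `axis=0` for the per-sample scaling, and NumPy broadcasting as an alternative for doing all samples at once. The names `scale_df2`, `per_sample`, `combined`, `stacked` and `stacked_df` are illustrative only, and it assumes df1 has as many columns as df2 has rows.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([[3.2, 5.8, 46], [3.5, 4.4, 50], [5.4, 6.7, 40]],
                   index=['sample1', 'sample2', 'sample3'],
                   columns=['L1', 'L2', 'L3'])
df2 = pd.DataFrame([[0.02, 0.03, 0.04, 0.05, 0.06],
                    [0.2, 0.3, 0.4, 0.5, 0.7],
                    [2, 3, 4, 5, 7]])

def scale_df2(sample_row):
    # Multiply row i of df2 by the i-th value of this sample.
    # .to_numpy() drops the L1/L2/L3 labels so values pair up by position.
    return df2.mul(sample_row.to_numpy(), axis=0)

# Option 1: one scaled copy of df2 per sample, stacked under a MultiIndex
per_sample = {name: scale_df2(row) for name, row in df1.iterrows()}
combined = pd.concat(per_sample)

# Option 2: the same numbers via broadcasting.
# Shapes: (samples, rows, 1) * (1, rows, cols) -> (samples, rows, cols)
stacked = df1.to_numpy()[:, :, None] * df2.to_numpy()[None, :, :]
stacked_df = pd.DataFrame(
    stacked.reshape(-1, df2.shape[1]),
    index=pd.MultiIndex.from_product([df1.index, df2.index]),
    columns=df2.columns,
)
```

Either result can be sliced per sample, for example `combined.loc['sample2']`. The broadcasting version avoids the Python-level loop, which may matter with hundreds of samples.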


 