
How to divide two columns with different sizes (Pandas)?

I have two dataframes holding spectral measurements (each has two columns: Intensity and Wavelength), and I need to divide the intensity of one by the intensity of the other at a given Wavelength, as if I were dividing two functions (I1(λ) / I2(λ)). The difficulty is that the dataframes have different sizes and the Wavelength values of one are not exactly the same as those of the other (although they obviously come close).

[Figure 1: the two spectra, one plotted as a black line and the other as a red line]

One has approximately 200 rows (black line) and the other has 3648 (red line). In short, the red graph is sampled much more densely than the black one, but as I said before, the Wavelength values of the two dataframes are not exactly the same.

They have different Wavelength ranges as well:

  • Black runs from 300.2 to 795.5 nm
  • Red starts at 199.975 and goes up to 1027.43 nm

What I would like to do is something like this:

[Figure 2: the desired result, the black intensity divided by the red intensity]

Note that I divided the Intensity of the black one by that of the red one, and the result, with its corresponding Wavelength, is stored in a new df. Is it possible to generate a new dataframe with an equivalent Wavelength and perform this division between the intensities?

Here is a working solution to your problem. I am assuming that both instruments sample at the same rate. Since you didn't provide any sample data, I have generated some. The approach is to concatenate both dataframes on the Wavelength column.

import pandas as pd
import numpy as np

#generating the test data: random intensities on two wavelength grids
black_lambda = np.arange(300.2,795.5,0.1)
red_lambda = np.arange(199.975,1027.43,0.1)

I_black = np.random.random(len(black_lambda))
I_red = np.random.random(len(red_lambda))

#one dataframe per spectrum, with named columns
df = pd.DataFrame({'lambda':black_lambda,'I_black':I_black})
df1 = pd.DataFrame({'lambda':red_lambda,'I_red':I_red})

Continue from here:

#setting lambda as index for both dataframes
df.set_index(['lambda'],inplace=True)
df1.set_index(['lambda'],inplace=True)

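#concatenating both dataframes into one, aligned on the lambda index
#sort_index() keeps the rows in wavelength order so the fills below use neighbouring wavelengths
df3 = pd.concat([df,df1],axis=1).sort_index()

#the wavelength grids differ, so there will be missing values. Fill them from neighbouring rows (optional):
#backward fill first, then forward fill for the trailing rows.
#fillna(method=...) is deprecated in recent pandas versions, so bfill()/ffill() are used instead
df3.bfill(inplace=True)
df3.ffill(inplace=True)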

#creating a new column 'division' to finish up the task
df3['division'] = df3['I_black'] / df3['I_red']

print(df3)

Output:

           I_black     I_red  division
lambda                                
199.975   0.855777  0.683906  1.251308
200.075   0.855777  0.305783  2.798643
200.175   0.855777  0.497258  1.720993
200.275   0.855777  0.945699  0.904915
200.375   0.855777  0.910735  0.939655
...            ...       ...       ...
1026.975  0.570973  0.637064  0.896258
1027.075  0.570973  0.457862  1.247042
1027.175  0.570973  0.429709  1.328743
1027.275  0.570973  0.564804  1.010924
1027.375  0.570973  0.246437  2.316917
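
As the output shows, filling assigns the same I_black value to every wavelength outside the black range. A possible alternative, not part of the answer above, is to interpolate the denser red spectrum onto the black wavelengths and divide there. The following is only a minimal sketch under stated assumptions: the data are random stand-ins with the row counts and ranges mentioned in the question, the column names Wavelength and Intensity are taken from the question, and linear interpolation via np.interp is assumed to be acceptable for the spectra.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data (not the asker's measurements):
# a coarse "black" spectrum and a denser "red" spectrum on different grids.
rng = np.random.default_rng(0)
black = pd.DataFrame({"Wavelength": np.linspace(300.2, 795.5, 200)})
black["Intensity"] = rng.random(len(black)) + 0.5
red = pd.DataFrame({"Wavelength": np.linspace(199.975, 1027.43, 3648)})
red["Intensity"] = rng.random(len(red)) + 0.5

# np.interp expects the known sample points in increasing order.
black = black.sort_values("Wavelength")
red = red.sort_values("Wavelength")

# Evaluate the red spectrum at the black wavelengths (linear interpolation).
# All black wavelengths lie inside the red range, so nothing is extrapolated.
red_on_black = np.interp(
    black["Wavelength"].to_numpy(),
    red["Wavelength"].to_numpy(),
    red["Intensity"].to_numpy(),
)

# New dataframe on the black wavelength grid with the ratio I_black / I_red.
result = pd.DataFrame({
    "Wavelength": black["Wavelength"].to_numpy(),
    "I_black": black["Intensity"].to_numpy(),
    "I_red_interp": red_on_black,
})
result["division"] = result["I_black"] / result["I_red_interp"]
print(result.head())
```

Interpolating onto the coarser grid is a design choice in this sketch: it keeps every measured black point and only estimates red values between closely spaced red samples. The result covers 300.2 to 795.5 nm only, since the black spectrum has no data outside that range.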

