简体   繁体   中英

Python/Pandas: Divide numeric columns from different dataframes based on a common row identifier and unique row-col combination

I would like to calculate the rate of change between the numeric columns of two dataframes based on a common unique row identifier and unique row-column combination.

Here is an example. I opted to present the tables as images in order to use colors to highlight the peculiarities of the two datasets. That is, each dataframe contains numeric and non-numeric columns, and rows and columns may not be in the same order. Also, the numeric columns on which the calculation should take place are always those after the 'Time' column.

F

The df.divide() approach doesn't work here because the rows and columns are not in the same order. I also saw the top answer in this thread, but again the approach doesn't generalize to mine.

If your problem is essentially with the columns and rows not being in the right order, that can be solved by essentially reordering the columns and rows.

#Identifying the columns for which the difference is to be computed. Since #'Time' is the 4th column, we take all columns after that
valCols = list(df.columns)[4:]

#Sorting the datasets so that the rows align
df1 = df1.sort_values('ID')
df2 = df2.sort_values('ID')

#Keeping only the value columns. This also ensures that the columns are in the same order now
df1 = df1[valCols]
df2 = df1[valCols]


The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM