I have two pandas DataFrames, a_df and b_df. Both have the columns ID, atext, and var1 through var25.
I want to add ONLY the corresponding var columns from a_df and b_df and leave ID and atext alone.
The code below adds ALL the corresponding columns. Is there a way to get it to add just the columns of interest?
absum_df=a_df.add(b_df)
What could I do to achieve this?
Use filter:
absum_df = a_df.filter(like='var').add(b_df.filter(like='var'))
If you want to keep additional columns as-is, use concat after summing:
absum_df = pd.concat([a_df[['ID', 'atext']], absum_df], axis=1)
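For illustration, here is a minimal sketch of the filter-then-concat approach on toy data. The two small frames are made up for the example (two var columns instead of 25), and the intermediate name var_sum is hypothetical. Note that filter(like='var') matches any column whose label contains "var"; if other columns might contain that substring, something like filter(regex=r'^var\d+$') is probably a safer choice.

```python
import pandas as pd

# Hypothetical toy data standing in for the real frames
a_df = pd.DataFrame({"ID": [1, 2], "atext": ["x", "y"],
                     "var1": [1, 2], "var2": [3, 4]})
b_df = pd.DataFrame({"ID": [1, 2], "atext": ["x", "y"],
                     "var1": [10, 20], "var2": [30, 40]})

# Sum only the columns whose label contains "var"
var_sum = a_df.filter(like="var").add(b_df.filter(like="var"))

# Put the untouched ID and atext columns back in front of the sums
absum_df = pd.concat([a_df[["ID", "atext"]], var_sum], axis=1)
print(absum_df)
```

Keep in mind that add aligns on the row index and column labels, so this assumes both frames share the same index and row order.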
Alternatively, instead of selecting specific columns from a_df, you can drop the columns that already appear in absum_df. This keeps every column of a_df that is not in absum_df:
absum_df = pd.concat([a_df.drop(absum_df.columns, axis=1), absum_df], axis=1)
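A small sketch of the drop-based variant, again on made-up data. The extra note column is hypothetical and is only there to show that every non-summed column of a_df is carried over without having to be listed by name.

```python
import pandas as pd

# Hypothetical toy data; "note" is an extra non-var column
a_df = pd.DataFrame({"ID": [1, 2], "atext": ["x", "y"], "note": ["p", "q"],
                     "var1": [1, 2], "var2": [3, 4]})
b_df = pd.DataFrame({"ID": [1, 2], "atext": ["x", "y"], "note": ["p", "q"],
                     "var1": [10, 20], "var2": [30, 40]})

# Sum only the var columns
absum_df = a_df.filter(like="var").add(b_df.filter(like="var"))

# Drop the summed columns from a_df, then attach the sums to what is left
absum_df = pd.concat([a_df.drop(absum_df.columns, axis=1), absum_df], axis=1)
print(absum_df)
```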
You can subset a dataframe to particular columns:
var_columns = ['var{}'.format(i) for i in range(1, 26)]
absum_df = a_df[var_columns].add(b_df[var_columns])
Note that this results in a dataframe with only the var columns. If you want a dataframe with the non-var columns from a_df and the var columns holding the sum of a_df and b_df, you can do:
absum_df = a_df.copy()
absum_df[var_columns] = a_df[var_columns].add(b_df[var_columns])
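As a rough end-to-end sketch of this copy-and-assign approach, the example below uses made-up frames with only two var columns, so the range is shortened accordingly; for var1 through var25 it would be range(1, 26). It assumes the column labels have no separator between "var" and the number, as in the question.

```python
import pandas as pd

# Hypothetical toy data with var1 and var2 only
a_df = pd.DataFrame({"ID": [1, 2], "atext": ["x", "y"],
                     "var1": [1, 2], "var2": [3, 4]})
b_df = pd.DataFrame({"ID": [1, 2], "atext": ["x", "y"],
                     "var1": [10, 20], "var2": [30, 40]})

# Explicit list of the columns to sum: ["var1", "var2"]
var_columns = ["var{}".format(i) for i in range(1, 3)]

# Copy a_df so the original is left unchanged, then overwrite the var columns
absum_df = a_df.copy()
absum_df[var_columns] = a_df[var_columns].add(b_df[var_columns])
print(absum_df)
```

Compared with filter, the explicit list raises a KeyError if an expected column is missing, which can help catch naming mismatches early.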