I have multiple DataFrames that share the same format. I want to create a DataFrame that combines them: each row of the result should be the row, taken from whichever input DataFrame has the maximum value in a certain column at that row position.
Example
data1 :
Name Age
0 michael 18
1 lincoln 20
2 theodore 84
3 alexandre 95
data2 :
Name Age
0 sayed 17
1 hurley 29
2 sawyer 44
3 John 15
data3 :
Name Age
0 walter 50
1 jesse 15
2 fring 20
3 saul 34
The expected result would be:
Results :
Name Age
0 walter 50
1 hurley 29
2 theodore 84
3 alexandre 95
I have more than 500,000 rows and 51 columns, so I'm looking for something faster than looping over all the data (O(n^2) complexity is too slow).
thank you.
You can use np.where to choose, row by row, between two DataFrames according to which one has the larger Age. Apply that choice to every column, then use functools.reduce() to fold the comparison over all the DataFrames.
import functools
import numpy as np
import pandas as pd

columns = data1.columns  # all DataFrames share the same columns

def choose_larger(df1, df2):
    # Boolean mask: True where df1 has the strictly larger Age
    m = df1['Age'] > df2['Age']
    # Build a new frame column by column, taking each row from the winner
    return pd.DataFrame({col: np.where(m, df1[col], df2[col]) for col in columns})

# Another possible function
def choose_larger2(df1, df2):
    m = df1['Age'] > df2['Age']
    # Repeat the mask once per column so it matches the frame's shape
    m = pd.concat([m]*len(columns), axis=1)
    return pd.DataFrame(np.where(m, df1, df2), columns=columns)

df_max = functools.reduce(choose_larger, [data1, data2, data3])
print(df_max)
Name Age
0 walter 50
1 hurley 29
2 theodore 84
3 alexandre 95
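With many DataFrames, the pairwise comparison can also be done in one pass. The following is a minimal sketch of that variant, not part of the answer above: it stacks the Age columns into a 2-D array, uses argmax to find the winning frame per row, and picks the rows with NumPy indexing. The names frames, ages, winner and values are illustrative, and it assumes all frames have the same row count and column order. On ties, argmax keeps the earliest frame, whereas the strict > in choose_larger keeps the later one.

```python
import numpy as np
import pandas as pd

data1 = pd.DataFrame({"Name": ["michael", "lincoln", "theodore", "alexandre"],
                      "Age": [18, 20, 84, 95]})
data2 = pd.DataFrame({"Name": ["sayed", "hurley", "sawyer", "John"],
                      "Age": [17, 29, 44, 15]})
data3 = pd.DataFrame({"Name": ["walter", "jesse", "fring", "saul"],
                      "Age": [50, 15, 20, 34]})

frames = [data1, data2, data3]

# Shape (n_rows, n_frames): one Age column per input frame
ages = np.stack([f["Age"].to_numpy() for f in frames], axis=1)

# For each row, the position of the frame holding the largest Age
winner = ages.argmax(axis=1)

# Shape (n_frames, n_rows, n_cols); object dtype because columns are mixed
values = np.stack([f.to_numpy() for f in frames], axis=0)

# Pick row i from frame winner[i]
rows = np.arange(len(data1))
result = pd.DataFrame(values[winner, rows], columns=data1.columns)

# Restore numeric dtypes lost by going through an object array
result = result.infer_objects()
print(result)
```

This does one comparison per row and frame, so the cost grows linearly with the number of rows and frames rather than quadratically.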
If you stack the dataframes horizontally:
dfs = [df.add_suffix(str(index)) for index, df in enumerate([data1, data2, data3])]
df = pd.concat(dfs, axis=1)
You can use idxmax() to find, for each row, the label of the column that holds the max Age:
indexes = df.filter(like='Age').idxmax(axis=1)
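To make this step concrete, here is a small self-contained sketch that rebuilds the example frames and prints the intermediate values. The variable name wide stands in for df above and is only illustrative.

```python
import pandas as pd

data1 = pd.DataFrame({"Name": ["michael", "lincoln", "theodore", "alexandre"],
                      "Age": [18, 20, 84, 95]})
data2 = pd.DataFrame({"Name": ["sayed", "hurley", "sawyer", "John"],
                      "Age": [17, 29, 44, 15]})
data3 = pd.DataFrame({"Name": ["walter", "jesse", "fring", "saul"],
                      "Age": [50, 15, 20, 34]})

# Suffix each frame's columns with its position, then place them side by side
dfs = [df.add_suffix(str(i)) for i, df in enumerate([data1, data2, data3])]
wide = pd.concat(dfs, axis=1)
print(list(wide.columns))

# Keep only the Age columns and ask which column label holds each row's max
indexes = wide.filter(like="Age").idxmax(axis=1)
print(indexes.tolist())
```

Each entry of indexes is a column label such as Age2, so it identifies which source frame won that row.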
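Selecting columns with indexes then gives every row's max Age on the diagonal, and shift(axis=1) moves each Name under its neighbouring Age label, so the same labels give each corresponding Name: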
pd.DataFrame({'Name': np.diag(df.shift(axis=1)[indexes]), 'Age': np.diag(df[indexes])})
# Name Age
# 0 walter 50
# 1 hurley 29
# 2 theodore 84
# 3 alexandre 95
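One caveat for large inputs: df[indexes] selects one column per row, so with n rows it builds an n x n intermediate before np.diag reads the diagonal, which is likely too much memory at 500,000 rows. A possible alternative, sketched here with a different technique (vertical pd.concat plus groupby and idxmax) and illustrative names (frames, combined, best), avoids that intermediate. It assumes every frame uses the same row index.

```python
import pandas as pd

data1 = pd.DataFrame({"Name": ["michael", "lincoln", "theodore", "alexandre"],
                      "Age": [18, 20, 84, 95]})
data2 = pd.DataFrame({"Name": ["sayed", "hurley", "sawyer", "John"],
                      "Age": [17, 29, 44, 15]})
data3 = pd.DataFrame({"Name": ["walter", "jesse", "fring", "saul"],
                      "Age": [50, 15, 20, 34]})

frames = [data1, data2, data3]

# Stack vertically, remembering the source frame and the original row label
combined = pd.concat(
    frames, keys=list(range(len(frames))), names=["source", "row"]
).reset_index()

# For each original row label, the position in `combined` of the largest Age
best = combined.groupby("row")["Age"].idxmax()

# Pull those rows out and drop the bookkeeping columns
result = combined.loc[best, ["Name", "Age"]].reset_index(drop=True)
print(result)
```

On ties, idxmax returns the first occurrence, so the earliest frame in the list wins. Keeping the source column instead of dropping it would also record which frame each row came from.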