How to merge two dataframes side-by-side?

Question

Is there a convenient way to merge two data frames side by side?

Both data frames have 30 rows but a different number of columns: say, df1 has 20 columns and df2 has 40 columns.

How can I easily get a new data frame with 30 rows and 60 columns?

df3 = pd.someSpecialMergeFunct(df1, df2)

Or maybe there is some special parameter in append:

df3 = pd.append(df1, df2, left_index=False, right_index=False, how='left')

PS: If possible, I would like duplicated column names to be resolved automatically.

Thanks!

Answer 1

You can use the concat function for this (axis=1 concatenates along the columns):

pd.concat([df1, df2], axis=1)

See the pandas docs on merging/concatenating: http://pandas.pydata.org/pandas-docs/stable/merging.html
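
As a minimal sketch of the concat approach, the following uses small made-up frames in place of the 30-row ones from the question. Note that concat matches rows on the index, so frames with different indexes may need reset_index(drop=True) first; duplicate column names are kept as they are, not renamed.

```python
import pandas as pd

# Hypothetical stand-ins for df1 and df2 (3 rows instead of 30).
df1 = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
df2 = pd.DataFrame({"c": [7, 8, 9], "d": [10, 11, 12], "e": [13, 14, 15]})

# axis=1 places the frames side by side; rows are matched on the index.
df3 = pd.concat([df1, df2], axis=1)
print(df3.shape)

# Same data as df2, but with an index that does not match df1.
df2_shifted = df2.copy()
df2_shifted.index = [10, 11, 12]

# Resetting both indexes makes the rows pair up by position instead.
df4 = pd.concat(
    [df1.reset_index(drop=True), df2_shifted.reset_index(drop=True)],
    axis=1,
)
print(df4.shape)
```

Without the reset, the second concat would produce six rows padded with NaN, because no index labels overlap.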

Answer 2

I came across your question while trying to achieve something similar.

After slicing my dataframes, I first ensured that their indexes were the same. In your case, both dataframes need to be indexed from 0 to 29. I then merged both dataframes on the index.

df1.reset_index(drop=True).merge(df2.reset_index(drop=True), left_index=True, right_index=True)
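
A small sketch of the same idea, with hypothetical frames that share a column name. One side effect worth noting for the question's PS: merge renames overlapping column names with suffixes (by default _x and _y; the suffixes below are chosen only for illustration).

```python
import pandas as pd

# Hypothetical frames: different indexes and one shared column name, "value".
left = pd.DataFrame({"id": [1, 2, 3], "value": [10, 20, 30]}, index=[5, 6, 7])
right = pd.DataFrame(
    {"value": [0.1, 0.2, 0.3], "flag": [True, False, True]},
    index=[0, 1, 2],
)

# Reset both indexes to 0..n-1, then join on the index.
merged = left.reset_index(drop=True).merge(
    right.reset_index(drop=True),
    left_index=True,
    right_index=True,
    suffixes=("_left", "_right"),  # applied only to overlapping column names
)
print(list(merged.columns))  # ['id', 'value_left', 'value_right', 'flag']
```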

Answer 3

If you want to combine two data frames that share a common column name, you can do the following:

df_concat = pd.merge(df1, df2, on='common_column_name', how='outer')
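
To illustrate, here is a sketch with a hypothetical common column named key. Unlike a positional side-by-side join, this matches rows by key value, so the result can have more or fewer rows than the inputs.

```python
import pandas as pd

# "key" is a hypothetical common column; the values are made up.
df1 = pd.DataFrame({"key": ["a", "b", "c"], "x": [1, 2, 3]})
df2 = pd.DataFrame({"key": ["b", "c", "d"], "y": [20, 30, 40]})

# how='outer' keeps every key from both frames and fills the gaps with NaN.
df_concat = pd.merge(df1, df2, on="key", how="outer")
print(df_concat.shape)  # (4, 3)
```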

Answer 4

I found that the other answers didn't cut it for me when coming in from Google.

What I did instead was to set the new columns in place in the original df.

# list(df2.columns) gives you the column names of df2
# you then use these as the column names for df

df[ list(df2.columns) ] = df2
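
A short sketch of this assignment with made-up frames. The assignment aligns on the index, so it behaves like a side-by-side join only when both frames share the same index; the second part shows what happens when they do not.

```python
import pandas as pd

# Hypothetical frames with the same default index (0, 1, 2).
df = pd.DataFrame({"a": [1, 2, 3]})
df2 = pd.DataFrame({"b": [4, 5, 6], "c": [7, 8, 9]})

# Adds the columns of df2 to df in place.
df[list(df2.columns)] = df2
print(list(df.columns))  # ['a', 'b', 'c']

# With a different index, values are matched by label, not by position:
# label 0 has no match in `other`, so that row gets NaN.
other = pd.DataFrame({"d": [1.0, 2.0, 3.0]}, index=[1, 2, 3])
df[list(other.columns)] = other
print(int(df["d"].isna().sum()))  # 1
```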

Answer 5

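There is a way: you can do it via a Pipeline.

Use a pipeline to transform your numerical data, for example:

from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer

# DataFrameSelector is a custom transformer that must be defined separately;
# it is not provided by scikit-learn.
# numeric_columns is a list of the column names holding numerical values.
num_pipeline = Pipeline([
    ("select_numeric", DataFrameSelector(numeric_columns)),
    ("imputer", SimpleImputer(strategy="median")),
])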

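And for categorical data:

from sklearn.preprocessing import OneHotEncoder

# categorical_columns is a list of the column names holding categorical data.
cat_pipeline = Pipeline([
    ("select_cat", DataFrameSelector(categorical_columns)),
    # Newer scikit-learn versions name this parameter sparse_output.
    ("cat_encoder", OneHotEncoder(sparse=False)),
])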

Then use a FeatureUnion to combine these transformations:

preprocess_pipeline = FeatureUnion(transformer_list=[
    ("num_pipeline", num_pipeline),
    ("cat_pipeline", cat_pipeline),
])

Read more here - https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.FeatureUnion.html

Answer 6

This solution also works if df1 and df2 have different indexes:

df1.loc[:, df2.columns] = df2.to_numpy()

6 answers

solution1
125 2014-05-27 14:04:28

solution2
13 2018-06-14 20:34:18

solution3
4 2021-05-07 14:43:13

solution4
1 2021-03-23 16:12:26

solution5
0 2019-09-18 12:51:13

solution6
0 2022-05-28 19:30:12
