How can I merge two data frames side by side?

Question

Is there a convenient way to merge two data frames side by side?

Both data frames have 30 rows but a different number of columns: say, df1 has 20 columns and df2 has 40 columns.

How can I easily get a new data frame with 30 rows and 60 columns?

df3 = pd.someSpecialMergeFunct(df1, df2)

Or maybe there is some special parameter for append:

df3 = pd.append(df1, df2, left_index=False, right_index=False, how='left')

PS: If possible, I would like duplicated column names to be resolved automatically.

Thanks!

Answer 1

You can use the concat function for this (axis=1 concatenates along columns):

pd.concat([df1, df2], axis=1)
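
As a minimal sketch of the shapes described in the question, the example below builds two placeholder frames (the column names `a0`, `b0`, and so on are invented for illustration) and concatenates them along the column axis. It assumes both frames share the same default index, which is what lets the rows line up.

```python
import numpy as np
import pandas as pd

# Two frames with the same 30-row default index but different column counts.
df1 = pd.DataFrame(np.zeros((30, 20)), columns=[f"a{i}" for i in range(20)])
df2 = pd.DataFrame(np.ones((30, 40)), columns=[f"b{i}" for i in range(40)])

# axis=1 places df2's columns to the right of df1's, aligning rows by index.
df3 = pd.concat([df1, df2], axis=1)
print(df3.shape)  # (30, 60)
```

Note that concat keeps column names as they are, so names that appear in both frames would show up twice in the result.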

See the pandas docs on merging/concatenating: http://pandas.pydata.org/pandas-docs/stable/merging.html

Answer 2

I came across your question while trying to achieve something similar.

After slicing my data frames, I first made sure that their indexes were the same. In your case, both data frames need to be indexed from 0 to 29. I then merged the two data frames on the index:

df1.reset_index(drop=True).merge(df2.reset_index(drop=True), left_index=True, right_index=True)
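
To illustrate why the index reset matters, here is a small sketch with made-up data (the frame `full` and the column `v` are hypothetical). The two slices keep their original row labels, so a plain column-wise concat pads them with NaN instead of pairing them up. Because both slices also share the column name `v`, the merge shows how overlapping names are resolved with the default `_x` and `_y` suffixes, which is relevant to the PS in the question.

```python
import pandas as pd

# Hypothetical data: two slices of one frame whose indexes no longer match.
full = pd.DataFrame({"v": [10, 11, 12, 13, 14, 15]})
left = full.iloc[:3]   # index 0, 1, 2
right = full.iloc[3:]  # index 3, 4, 5

# Without resetting, concat aligns on the index: 6 rows, padded with NaN.
misaligned = pd.concat([left, right], axis=1)

# After resetting, both frames are indexed 0..2 and rows pair up by position.
aligned = left.reset_index(drop=True).merge(
    right.reset_index(drop=True), left_index=True, right_index=True
)
print(misaligned.shape)       # (6, 2)
print(list(aligned.columns))  # ['v_x', 'v_y']
```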

Answer 3

If you want to combine two data frames that share a common column name, you can do the following:

df_concat = pd.merge(df1, df2, on='common_column_name', how='outer')
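
A short sketch of this key-based merge, using invented frames where `key` stands in for `common_column_name`. Unlike a positional side-by-side join, rows are matched on the values in the shared column, and `how='outer'` keeps keys that appear in only one frame.

```python
import pandas as pd

# Hypothetical frames sharing a "key" column with partly overlapping values.
df1 = pd.DataFrame({"key": [1, 2, 3], "left_val": ["a", "b", "c"]})
df2 = pd.DataFrame({"key": [2, 3, 4], "right_val": ["x", "y", "z"]})

# Rows are matched on "key"; unmatched keys are kept and filled with NaN.
df_concat = pd.merge(df1, df2, on="key", how="outer")
print(df_concat.shape)  # (4, 3)
```

The result has one row per distinct key, so the row count can differ from either input.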

Answer 4

The other answers didn't cut it for me when I came in from Google.

What I did instead was set the new columns in place on the original df:

# list(df2.columns) gives you the column names of df2
# you then use these as the column names for df

df[ list(df2.columns) ] = df2
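
A minimal sketch of this in-place assignment with made-up frames. Because the right-hand side is a DataFrame, pandas matches values on the row index, so this assumes `df` and `df2` share the same index.

```python
import pandas as pd

# Hypothetical frames with the same default index.
df = pd.DataFrame({"a": [1, 2, 3]})
df2 = pd.DataFrame({"b": [4, 5, 6], "c": [7, 8, 9]})

# Adds df2's columns to df in place; values are matched on the row index.
df[list(df2.columns)] = df2
print(list(df.columns))  # ['a', 'b', 'c']
```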

Answer 5

There is a way: you can do it via a Pipeline.

Use a pipeline to transform your numerical data, for example:

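from sklearn.impute import SimpleImputer
from sklearn.pipeline import FeatureUnion, Pipeline

# DataFrameSelector is a custom transformer, not part of scikit-learn; num_columns is a placeholder list of numerical column names.
num_pipeline = Pipeline([
    ("select_numeric", DataFrameSelector(num_columns)),
    ("imputer", SimpleImputer(strategy="median")),
])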

And for categorical data:

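from sklearn.preprocessing import OneHotEncoder

# cat_columns is a placeholder list of the categorical column names.
cat_pipeline = Pipeline([
    ("select_cat", DataFrameSelector(cat_columns)),
    # In scikit-learn 1.2 and later this parameter is named sparse_output.
    ("cat_encoder", OneHotEncoder(sparse=False)),
])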

Then use a FeatureUnion to combine these transformations:

preprocess_pipeline = FeatureUnion(transformer_list=[
    ("num_pipeline", num_pipeline),
    ("cat_pipeline", cat_pipeline),
])

Read more here: https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.FeatureUnion.html

Answer 6

This solution also works if df1 and df2 have different indexes:

df1.loc[:, df2.columns] = df2.to_numpy()
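
The sketch below shows the effect with invented frames whose indexes do not overlap. It assumes a reasonably recent pandas version in which `.loc` creates missing columns when given a list-like of new labels. Converting to a NumPy array discards df2's index, so the values are written by row position rather than by label.

```python
import pandas as pd

# Hypothetical frames with unrelated indexes but the same number of rows.
df1 = pd.DataFrame({"a": [1, 2, 3]})                      # index 0, 1, 2
df2 = pd.DataFrame({"b": [4, 5, 6]}, index=[10, 11, 12])  # index 10, 11, 12

# to_numpy() drops df2's index, so values are assigned by position.
df1.loc[:, df2.columns] = df2.to_numpy()
print(df1)
```

This only works when both frames have the same number of rows, since there are no labels left to align on.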

6 answers

Answer 1 125 2014-05-27 14:04:28

Answer 2 13 2018-06-14 20:34:18

Answer 3 4 2021-05-07 14:43:13

Answer 4 1 2021-03-23 16:12:26

Answer 5 0 2019-09-18 12:51:13

Answer 6 0 2022-05-28 19:30:12
