简体   繁体   中英

How to append selected columns to pandas dataframe from df with different columns

I want to be able to append df1 df2, df3 into one df_All , but since each of the dataframe has different column. How could I do this in for loop ( I have others stuff that i have to do in the for loop ) ?

import pandas as pd
import numpy as np

df1 = pd.DataFrame.from_items([('A', [1, 2, 3]), ('B', [4, 5, 6])])
df2 = pd.DataFrame.from_items([('B', [5, 6, 7]), ('A', [8, 9, 10])])
df3 = pd.DataFrame.from_items([('C', [5, 6, 7]), ('D', [8, 9, 10]), ('A',[1,2,3]), ('B',[4,5,7])])
list = ['df1','df2','df3']
df_All = pd.DataFrame()
for i in list:
   # doing something else as well --- 
    df_All = df_All.append(i)

在此处输入图片说明

I want my df_All to only have ( A & B ) only, is there a way to this in loop above ? something like append only this two columns ?

If I understand what you want then you need to select just columns 'A' and 'B' from df3 and then use pd.concat :

In [35]:

df1 = pd.DataFrame.from_items([('A', [1, 2, 3]), ('B', [4, 5, 6])])
df2 = pd.DataFrame.from_items([('B', [5, 6, 7]), ('A', [8, 9, 10])])
df3 = pd.DataFrame.from_items([('C', [5, 6, 7]), ('D', [8, 9, 10]), ('A',[1,2,3]), ('B',[4,5,7])])
df_list = [df1,df2,df3[['A','B']]]
pd.concat(df_list, ignore_index=True)
Out[35]:
    A  B
0   1  4
1   2  5
2   3  6
3   8  5
4   9  6
5  10  7
6   1  4
7   2  5
8   3  7

Note that in your original code this is poor practice:

list = ['df1','df2','df3']

This shadows the built in type list plus even if it was actually a valid var name like df_list you've created a list of strings and not a list of dfs.

If you want to determine the common columns then you can determine this using the np.intersection method on the columns:

In [39]:

common_cols = df1.columns.intersection(df2.columns).intersection(df3.columns)
common_cols
Out[39]:
Index(['A', 'B'], dtype='object')

You can also use set comprehension to join all common columns from an arbitrary list of DataFrames:

df_list = [df1, df2, df3]
common_cols = list(set.intersection(*(set(c) for c in df_list)))
df_new = pd.concat([df[common_cols] for df in df_list], ignore_index=True)
>>> df_new 
    A  B
0   1  4
1   2  5
2   3  6
3   8  5
4   9  6
5  10  7
6   1  4
7   2  5
8   3  7

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM