
How to swap two DataFrame columns?

In MATLAB, to swap the first and second columns of a table A, one would do this [1]:

A = A(:, [2 1 3:end]);

Is there a similarly convenient way to do this if A were a pandas DataFrame instead?

[1] MATLAB uses 1-based indexing.

pandas has a reindex method that does this. You just need to pass a list of the column names in the order you want:

columns_titles = ["B","A"]
df_reorder = df.reindex(columns=columns_titles)
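One thing to watch with reindex: columns that are not in the list are dropped from the result. A minimal sketch of swapping only the first two columns while keeping the rest, assuming a made-up three-column frame (the data and the `cols` variable are illustrative, not from the answer above):

```python
import pandas as pd

# Hypothetical example frame; the column names are illustrative only.
df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})

# Build the full column list with the first two names swapped,
# so that the remaining columns are kept.
cols = list(df.columns)
columns_titles = [cols[1], cols[0]] + cols[2:]

# reindex returns a new DataFrame; df itself is left unchanged.
df_reorder = df.reindex(columns=columns_titles)
print(df_reorder.columns.tolist())
```

Each column keeps its own data here; only the left-to-right order changes.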


A slight variant on acushner's answer:

# get a list of the columns
col_list = list(df)
# use this handy way to swap the elements
col_list[0], col_list[1] = col_list[1], col_list[0]
# assign back, the order will now be swapped
df.columns = col_list


In [39]:

import pandas as pd
from numpy.random import randn

df = pd.DataFrame({'a':randn(3), 'b':randn(3), 'c':randn(3)})
df
Out[39]:
          a         b         c
0 -0.682446 -0.200654 -1.609470
1 -1.998113  0.806378  1.252384
2 -0.250359  3.774708  1.100771
In [40]:

col_list = list(df)
col_list[0], col_list[1] = col_list[1], col_list[0]
df.columns = col_list
df
Out[40]:
          b         a         c
0 -0.682446 -0.200654 -1.609470
1 -1.998113  0.806378  1.252384
2 -0.250359  3.774708  1.100771


If you just want to change the column order without changing the column contents then you can reindex using fancy indexing:

In [34]:

cols = list(df)
cols[1], cols[0] = cols[0], cols[1]
cols
Out[34]:
['b', 'a', 'c']

In [35]:

df.loc[:, cols]
Out[35]:
          b         a         c
0 -0.200654 -0.682446 -1.609470
1  0.806378 -1.998113  1.252384
2  3.774708 -0.250359  1.100771

import numpy as np

c = A.columns
A = A[c[np.r_[1, 0, 2:len(c)]]]

or, even easier:

A[[c[0], c[1]]] = A[[c[1], c[0]]]

*edit: fixed per Ivan's suggestions.
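The answers above do not all swap the same thing, which is easy to miss. As a rough sketch on a small made-up frame (the `make_frame` helper and the values are assumptions for illustration): assigning to `df.columns` only renames the headers, selecting with a reordered list moves each column together with its label, and assigning a two-column selection back swaps the contents under unchanged headers.

```python
import pandas as pd

def make_frame():
    # Hypothetical helper that builds a fresh example frame.
    return pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# 1) Assigning to .columns only relabels: the data stays where it was.
relabeled = make_frame()
cols = list(relabeled.columns)
cols[0], cols[1] = cols[1], cols[0]
relabeled.columns = cols

# 2) Selecting with the swapped list moves each column with its label.
reordered = make_frame()[cols]

# 3) Assigning a two-column selection back swaps the contents,
#    while the header order stays a, b, c.
swapped = make_frame()
c = swapped.columns
swapped[[c[0], c[1]]] = swapped[[c[1], c[0]]]

print(relabeled, reordered, swapped, sep="\n\n")
```

Which of the three you want depends on whether the headers should follow the data or not.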

In my case, I have over 100 columns in my data frame. So instead of listing all the columns, I wrote a short function that switches just two of them:

def df_column_switch(df, column1, column2):
    i = list(df.columns)
    a, b = i.index(column1), i.index(column2)
    i[b], i[a] = i[a], i[b]
    df = df[i]
    return df
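A quick usage sketch of the function above, on a made-up six-column frame (the `wide` frame and its values are assumptions for illustration). The function returns a new DataFrame, so the result has to be assigned:

```python
import pandas as pd

def df_column_switch(df, column1, column2):
    # Same logic as the function above: swap two names in the column list.
    i = list(df.columns)
    a, b = i.index(column1), i.index(column2)
    i[b], i[a] = i[a], i[b]
    return df[i]

# Hypothetical wide frame; only the two named columns change places.
wide = pd.DataFrame({"a": [1], "b": [2], "c": [3], "d": [4], "e": [5], "f": [6]})
result = df_column_switch(wide, "b", "e")
print(result.columns.tolist())
```

The columns do not have to be adjacent, and the original frame keeps its order.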

I finally settled on this:

A = A.iloc[:, [1, 0] + list(range(2, A.shape[1]))]

It's far less convenient than the MATLAB version, but I like the fact that it does not require creating temporary variables.
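If the positional version is needed for columns other than the first two, it can be wrapped up. A minimal sketch, where `swap_by_position` is a hypothetical helper name and not a pandas function:

```python
import pandas as pd

def swap_by_position(df, i, j):
    # Hypothetical helper: swap the columns at 0-based positions i and j.
    order = list(range(df.shape[1]))
    order[i], order[j] = order[j], order[i]
    # iloc selects by position and returns a new DataFrame.
    return df.iloc[:, order]

# Made-up example data.
A = pd.DataFrame({"x": [1, 2], "y": [3, 4], "z": [5, 6]})
A = swap_by_position(A, 0, 1)
print(A.columns.tolist())
```

Because it works on positions rather than names, this should also cope with duplicate column labels.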

If you have multiple columns and performance and memory are not an issue, you can simply use this function:

def swap_columns(df, c1, c2):
    df['temp'] = df[c1]
    df[c1] = df[c2]
    df[c2] = df['temp']
    df.drop(columns=['temp'], inplace=True)
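Note what this function actually does: it modifies df in place, returns None, and swaps the contents of the two columns while the headers stay in their original positions. A small sketch on made-up data to illustrate (it also assumes the frame has no existing column called 'temp'):

```python
import pandas as pd

def swap_columns(df, c1, c2):
    # Same function as above: swap contents via a temporary column.
    df['temp'] = df[c1]
    df[c1] = df[c2]
    df[c2] = df['temp']
    df.drop(columns=['temp'], inplace=True)

# Hypothetical example frame.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})
result = swap_columns(df, "a", "b")

print(result)  # the function has no return statement
print(df)      # headers still a, b, c, but a and b have traded values
```

So do not write df = swap_columns(df, ...), or df ends up as None.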

I would use:

end = df.shape[1] # or len(df.columns)
df.iloc[:, np.r_[1, 0, 2:end]]
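For anyone unfamiliar with np.r_: it concatenates scalars and slices into one integer array, which is about as close as NumPy gets to MATLAB's [2 1 3:end]. A short sketch on a made-up four-column frame:

```python
import numpy as np
import pandas as pd

# Hypothetical example frame with four columns.
df = pd.DataFrame({"a": [1], "b": [2], "c": [3], "d": [4]})

end = df.shape[1]            # number of columns, 4 here
order = np.r_[1, 0, 2:end]   # positions 1, 0, then 2 up to (not including) end
print(order)                 # [1 0 2 3]

swapped = df.iloc[:, order]  # new DataFrame with the first two columns swapped
print(swapped.columns.tolist())
```

The slice inside np.r_ needs a concrete stop value, which is why end is computed first.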

For a pandas DataFrame, given the names of the two columns to swap (col1 and col2):

#df is your data frame

df = df[[col1 if col == col2 else col2 if col == col1 else col for col in df.columns]]
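A usage sketch of the one-liner above with concrete names filled in (the frame and the choice of "x" and "z" are made up for illustration):

```python
import pandas as pd

# Hypothetical example frame.
df = pd.DataFrame({"id": [1, 2], "x": [0.1, 0.2], "y": [10, 20], "z": ["p", "q"]})

col1, col2 = "x", "z"

# Walk the existing column order and substitute each of the two names
# with the other one; everything else is passed through unchanged.
df = df[[col1 if col == col2 else col2 if col == col1 else col for col in df.columns]]
print(df.columns.tolist())
```

The two columns trade places and take their data with them.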

column swap

import pandas as pd

df = pd.read_csv('/Users/parent/Desktop/Col_swap.csv')
print(df)

columns_titles = ["A","B","C","E"]
df_reorder = df.reindex(columns=columns_titles)

df_reorder.to_csv('/Users/parent/Desktop/col_reorder1.csv', index=False)
print(df_reorder)



    B   A   C   E
0  c1  a1  b1  d1
1  c2  a2  b2  d2

    A   B   C   E
0  a1  c1  b1  d1
1  a2  c2  b2  d2
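To try this without the CSV files on disk, here is a sketch that feeds the same two rows through an in-memory buffer. The io.StringIO input and the `csv_text` and `out` names are my own stand-ins for the files in the answer:

```python
import io
import pandas as pd

# Same data as the printed tables above, supplied as text instead of a file.
csv_text = "B,A,C,E\nc1,a1,b1,d1\nc2,a2,b2,d2\n"
df = pd.read_csv(io.StringIO(csv_text))

columns_titles = ["A", "B", "C", "E"]
df_reorder = df.reindex(columns=columns_titles)

# With no path given, to_csv returns the CSV content as a string.
out = df_reorder.to_csv(index=False)
print(out)
```

The A and B columns change places in the output, and each keeps its own values.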

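You can do this easily. Define the new column order:

columns_titles = ["D","C","B","A"]

and then pass it to reindex, as in the answers above:

df_reorder = df.reindex(columns=columns_titles)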
This works for me in Python 3.x:

df = df.iloc[:, [1, 0] + list(range(2, df.shape[1]))]

Note that df = P.iloc[:, [1, 0] + range(2, P.shape[1])] won't work in Python 3, because range no longer returns a list. It raises:

TypeError: can only concatenate list (not "range") to list

colList = list(df.columns)
colList[0], colList[1] =  colList[1], colList[0]
df = df[colList]

A one-liner based on the popular (upvoted) reindex answer, reversing the whole column list. It is not in-place, which is why .reverse() isn't used:

reverse_df = df.reindex(columns=list(df.columns)[::-1])

As a bonus, a quick check that it worked:

assert (reverse_df.columns[::-1] == df.columns).all()
