
Efficient way to subset and rename data.frame columns together in Python

In R's data.table you can subset and rename columns at the same time. You can also select a column multiple times and rename it at the time of selection. It is extremely fast and convenient. Can you do the same thing in Python? So far, all I have been able to do is select and rename columns in separate steps, which is a pain for such a simple operation.

For example, my data.table DT has four columns: A, B, C, D. In R you can write:

subset_DT = DT[,.(A, B, second_A = A, rename_D = D)]

This selects columns A, B, A, D and at the same time renames the second A and the D to second_A and rename_D, so subset_DT has four columns: A, B, second_A, rename_D.

How can I do this neatly (in one straightforward operation) in Python pandas, without separating the subsetting and renaming steps?

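Try the following code. Note that rename takes a mapping from the existing name to the new name, and that it renames columns in place of the originals, so on its own it does not keep a second copy of A:

df = df.rename(columns={'A': 'second_A', 'D': 'rename_D'})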
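
For reference, here is a minimal sketch of chaining a column selection with rename so that subsetting and renaming happen in one expression. The frame and column names are taken from the question; the variable name `out` is only illustrative. This covers the rename_D part but cannot produce the duplicated second_A column.

```python
import pandas as pd

# One-row example frame with the four columns from the question
df = pd.DataFrame([list("abcd")], columns=list("ABCD"))

# Select A, B, D, then rename D; the mapping is {existing_name: new_name}
out = df[["A", "B", "D"]].rename(columns={"D": "rename_D"})
print(out)
```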

You can use assign:

import pandas as pd

df = pd.DataFrame([list('abcd')], columns=list('ABCD'))
#   A  B  C  D
#0  a  b  c  d

df[['A','B']].assign(second_A = df.A, rename_D = df.D)
#   A  B second_A rename_D
#0  a  b        a        d
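
Another option that stays close to the R syntax is to build the new frame from a mapping of new column name to source column. This is only a sketch under the question's setup; the names `spec` and `subset_df` are hypothetical, and it relies on dictionaries preserving insertion order (Python 3.7+) to fix the column order.

```python
import pandas as pd

# One-row example frame with the four columns from the question
df = pd.DataFrame([list("abcd")], columns=list("ABCD"))

# Hypothetical mapping: new column name -> existing source column.
# A source column may appear more than once (A is used twice here).
spec = {"A": "A", "B": "B", "second_A": "A", "rename_D": "D"}

# Build the subset column by column; every Series shares df's index
subset_df = pd.DataFrame({new: df[old] for new, old in spec.items()})
print(subset_df)
```

Compared with assign, this keeps the selection and the renaming in a single mapping, which can be convenient when the list of columns is built programmatically.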
