What is the most efficient way to move a column in a data frame?

I want to move a column on the right to some place on the left of the data frame. I am moving only one column, but the data frame has many columns, so I think reordering like this, df <- df[,c("a","b","d","c")], won't be efficient. Since the data frame also contains many rows, I want to minimize rewriting things.

from:

 name var1 var2 var3 var4 var5 ... varN
 a     1    1    1    1    1        1
 b     1    1    1    1    1        1
 c     1    1    1    1    1        1

to:

  name var1 varN var2 var3 var4 ... varN-1
   a     1    1    1    1    1        1
   b     1    1    1    1    1        1
   c     1    1    1    1    1        1

You can use a vector of column indices rather than a vector of column names, so you can take advantage of sequence notation, like so:

my_seq = c(1,ncol(df),2:(ncol(df)-1))
df[,my_seq]

For example, if your dataframe has 17 columns, we get:

> my_seq
 [1]  1 17  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
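Note that the sequence above places the last column in the 2nd position. As a minimal sketch, assuming a small made-up data frame, the layout asked for in the question (last column placed right after var1) can be built the same way:

```r
# Small stand-in for the real data (values are made up for illustration)
df <- data.frame(name = letters[1:3], var1 = 1, var2 = 2, var3 = 3, var4 = 4)

n <- ncol(df)
my_seq <- c(1:2, n, 3:(n - 1))   # keep columns 1-2, then the last column, then the rest
df <- df[, my_seq]               # assign back to keep the new order

names(df)
# "name" "var1" "var4" "var2" "var3"
```

The index form assumes the data frame has at least 4 columns; with fewer, 3:(n - 1) would count downwards.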

You can get there with append:

df <- data.frame(name=letters[1:5],var1=1,var2=1,var3=1,var4=1,var5=1)

# using names
df[append(names(df)[-ncol(df)], names(df)[ncol(df)], after=2)]

# using positions
df[append(seq(ncol(df)-1), ncol(df), after=2)]

#  name var1 var5 var2 var3 var4
#1    a    1    1    1    1    1
#2    b    1    1    1    1    1
#3    c    1    1    1    1    1
#4    d    1    1    1    1    1
#5    e    1    1    1    1    1
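The append idea can be wrapped in a small helper so the target position is given by name rather than by counting. This is only a sketch: move_after is a hypothetical name, not a function from base R or any package.

```r
# Hypothetical helper: place column `col` directly after column `after_col`
move_after <- function(df, col, after_col) {
  stopifnot(col %in% names(df), after_col %in% names(df), col != after_col)
  others <- setdiff(names(df), col)        # all names except the one being moved
  pos <- match(after_col, others)          # where the anchor column sits among the rest
  df[append(others, col, after = pos)]
}

df <- data.frame(name = letters[1:5], var1 = 1, var2 = 1, var3 = 1, var4 = 1, var5 = 1)
out <- move_after(df, "var5", "var1")
names(out)
# "name" "var1" "var5" "var2" "var3" "var4"
```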

I would recommend looking at the moveMe function from my "SOfun" package.

With it, the solution would be something like:

df <- data.frame(name=letters[1:5],var1=1,var2=1,var3=1,var4=1,var5=1)

library(SOfun)

df[moveMe(names(df), "var5 before var4")]
#   name var1 var2 var3 var5 var4
# 1    a    1    1    1    1    1
# 2    b    1    1    1    1    1
# 3    c    1    1    1    1    1
# 4    d    1    1    1    1    1
# 5    e    1    1    1    1    1

You can also compound statements:

df[moveMe(names(df), "var5 before var2; name last")]
#   var1 var5 var2 var3 var4 name
# 1    1    1    1    1    1    a
# 2    1    1    1    1    1    b
# 3    1    1    1    1    1    c
# 4    1    1    1    1    1    d
# 5    1    1    1    1    1    e

If you want to do this most efficiently, you should consider converting your data to a "data.table" and using setcolorder. This changes the column order by reference, without making copies of your data.

library(data.table)
dt <- as.data.table(df)

setcolorder(dt, moveMe(names(dt), "var5 before var4"))
dt
#    name var1 var2 var3 var5 var4
# 1:    a    1    1    1    1    1
# 2:    b    1    1    1    1    1
# 3:    c    1    1    1    1    1
# 4:    d    1    1    1    1    1
# 5:    e    1    1    1    1    1
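If SOfun is not installed, the new name order can be built with base R before handing it to setcolorder. A minimal sketch, assuming the data.table package is available:

```r
library(data.table)

df <- data.frame(name = letters[1:5], var1 = 1, var2 = 1, var3 = 1, var4 = 1, var5 = 1)
dt <- as.data.table(df)

# Drop "var5" from the names, then re-insert it after the 3rd remaining name
new_order <- append(setdiff(names(dt), "var5"), "var5", after = 3)

setcolorder(dt, new_order)   # reorders dt in place; no assignment needed
names(dt)
# "name" "var1" "var2" "var3" "var5" "var4"
```

Note that as.data.table(df) itself makes a copy; for data that already lives in a large data.frame, setDT(df) converts it in place instead.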

With dplyr:

library(dplyr)

df %>% select(name, var1, varN, everything())
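A runnable version on a concrete data frame may help here. The sketch below assumes dplyr 1.0.0 or later, and also shows relocate(), which is not part of the original answer but moves a single column without listing the others:

```r
library(dplyr)

df <- data.frame(name = letters[1:5], var1 = 1, var2 = 1, var3 = 1, var4 = 1, var5 = 1)

# select() names the leading columns; everything() appends the rest in their original order
out1 <- df %>% select(name, var1, var5, everything())

# relocate() moves only the named column (available from dplyr 1.0.0)
out2 <- df %>% relocate(var5, .after = var1)

names(out1)
# "name" "var1" "var5" "var2" "var3" "var4"
```

Both calls return a new data frame, so the result has to be assigned back if the order should persist.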

If the data frame df has n columns and you want to move the m-th column to the 2nd position from the start (assuming 2 < m < n):

df <- subset(df, select=c(1, m, 2:(m-1), (m+1):n))

In your case:

df <- subset(df, select=c(name:var1, varN, var2:(varN-1)))

It can also be written as:

df <- subset(df, select=c(name, var1, varN, var2, var3,....,varN-1))

You can use column names as well as column numbers to pass the new column order.
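One detail to watch when writing index ranges like these: in R the `:` operator binds more tightly than `+` and `-`, so endpoints such as m-1 or m+1 need parentheses. A short check, using made-up values m = 4 and n = 6:

```r
m <- 4
n <- 6

2:m-1      # parsed as (2:m) - 1, giving 1 2 3
2:(m-1)    # the intended range, giving 2 3
m+1:n      # parsed as m + (1:n), giving 5 6 7 8 9 10
(m+1):n    # the intended range, giving 5 6

# Move the 4th column to the 2nd position in a 6-column data frame
df <- data.frame(name = letters[1:3], var1 = 1, var2 = 2, var3 = 3, var4 = 4, var5 = 5)
new_order <- c(1, m, 2:(m-1), (m+1):n)   # 1 4 2 3 5 6
out <- subset(df, select = new_order)
names(out)
# "name" "var3" "var1" "var2" "var4" "var5"
```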
