
Reordering column names

I have a similar problem in two scenarios.

Scenario 1: a data frame with duplicated column names from two groups, in no particular order: ALL|ALL|AML|ALL|AML|AML|AML|ALL

Scenario 2: a data frame whose column names have numeric suffixes: ALL, ALL.1, ALL.2, AML.1, AML.2, ... including double-digit numbers. If I sort these in ascending order, I get ALL.1, ALL.10, ALL.11

I want to group all the ALLs first, followed by the AMLs. How can I achieve this in both scenarios?

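One way to approach this is to strip the numeric suffix and order by what remains. order() leaves ties in their original positions, so columns within each group keep their relative order: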

y <- c('ALL', 'ALL.1', 'ALL.2', 'AML.1', 'AML.2', 'ALL.10')

y[order(gsub('\\.\\d+', '', y))]
#[1] "ALL"    "ALL.1"  "ALL.2"  "ALL.10" "AML.1"  "AML.2" 

#or to use it in a data frame,
df[, order(gsub('\\.\\d+', '', names(df)))]
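
The ordering above groups the columns but does not sort the suffixes themselves. If the suffixes should also be in numeric order, one possible base R sketch is to order by prefix first and numeric suffix second. The data frame df and its values here are made up for illustration, and a name without a suffix is assumed to sort first within its group.

```r
# Toy data frame; column names and values are made up for illustration.
df <- data.frame(ALL = 1, ALL.1 = 2, AML.1 = 3, ALL.10 = 4, AML.2 = 5, ALL.2 = 6)

nm <- names(df)

# Group prefix: the name with any trailing ".<digits>" removed.
prefix <- sub('\\.\\d+$', '', nm)

# Numeric suffix: 0 for names without a suffix (an assumption), else the number.
has_suffix <- grepl('\\.\\d+$', nm)
suffix <- rep(0, length(nm))
suffix[has_suffix] <- as.numeric(sub('^.*\\.', '', nm[has_suffix]))

# Order by group first, then by suffix compared as a number (so 2 < 10).
sorted <- df[, order(prefix, suffix)]
names(sorted)
#[1] "ALL"    "ALL.1"  "ALL.2"  "ALL.10" "AML.1"  "AML.2"
```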

Alternatively, you can use mixedorder from the gtools package, but you will have to replace the . in the suffix so that it is not treated as a decimal point (otherwise .10 sorts before .2, instead of 10 sorting after 2), i.e.

library(gtools)

#with the . in suffix
mixedsort(y)
#[1] "ALL.1"  "ALL.10" "ALL.2"  "ALL"    "AML.1"  "AML.2" 

#without the . in suffix
mixedsort(gsub('\\.', '_', y))
#[1] "ALL"    "ALL_1"  "ALL_2"  "ALL_10" "AML_1"  "AML_2" 

#or use it on the data frame
df[, mixedorder(gsub('\\.', '_', names(df)))]

As for your first scenario, I agree with @alistaire that names need to be unique. Use make.unique on the names first, then follow the method above.
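
A minimal sketch of that suggestion for the first scenario, working on the name vector from the question (the variable names nm, unique_nm and grouped are only illustrative):

```r
# Duplicated column names from scenario 1.
nm <- c('ALL', 'ALL', 'AML', 'ALL', 'AML', 'AML', 'AML', 'ALL')

# make.unique appends .1, .2, ... to repeated names.
unique_nm <- make.unique(nm)
unique_nm
#[1] "ALL"   "ALL.1" "AML"   "ALL.2" "AML.1" "AML.2" "AML.3" "ALL.3"

# Same idea as above: strip the suffix and order by the remaining prefix.
idx <- order(sub('\\.\\d+$', '', unique_nm))
grouped <- unique_nm[idx]
grouped
#[1] "ALL"   "ALL.1" "ALL.2" "ALL.3" "AML"   "AML.1" "AML.2" "AML.3"
```

For a data frame, the same steps would be names(df) <- make.unique(names(df)) followed by indexing the columns with idx.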
