
How to sort a character vector where elements contain numbers and letters in R?

My character array is the following (note: the array was changed; sorry, my mistake):

m <- c("VI-2005","III-2005","II-2005","I-2005","III-2006","II-2006","I-2006","VI-2006","IV-2007","III-2007","II-2007","I-2007")

I have Roman numerals and years. I would like to sort the vector in ascending order, by year and then by Roman numeral, so that the output looks like this:

I-2005
II-2005
III-2005
IV-2005
I-2006
II-2006
III-2006
IV-2006
I-2007
II-2007
III-2007
IV-2007

I tried mixedsort from the "gtools" package:

> # install.packages("gtools") ## Uncomment if not already installed
> library(gtools)
> mixedsort(m)

But it doesn't sort by the Roman numerals. Thank you for reading!

Split the input on "-" and turn the Roman numerals into an ordered factor, then order by year and numeral:

library(data.table)
n <- tstrsplit(m, "-")
n[[1]] <- ordered(n[[1]], levels = c("I", "II", "III", "IV", 
                                    "V", "VI", "VII", "VIII", 
                                    "IX", "X", "XI", "XII"))
n[[2]] <- as.integer(n[[2]])
n <- rev(n)

m[do.call(order, n)]
#[1] "I-2005"   "II-2005"  "III-2005" "VI-2005"  "I-2006"   "II-2006"  "III-2006" "VI-2006" 
#[9] "I-2007"   "II-2007"  "III-2007" "IV-2007" 
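
To see why the ordered factor matters, here is a minimal sketch. The vector x is made up for illustration and is not part of the question's data. A plain character sort compares strings alphabetically, whereas an ordered factor sorts by the position of each level:

```r
# Illustrative input only (not from the question)
x <- c("IX", "V", "IV", "VI")
lev <- c("I", "II", "III", "IV", "V", "VI",
         "VII", "VIII", "IX", "X", "XI", "XII")

# Character sort: alphabetical, so "IX" lands before "V"
# (method = "radix" makes the result independent of the locale)
alphabetical <- sort(x, method = "radix")

# Ordered factor: sorted by level position, i.e. by numeric value
by_value <- as.character(sort(ordered(x, levels = lev)))

alphabetical
by_value
```

The levels vector only covers I to XII, which is presumably enough if the numerals stand for months; any numeral outside it would become NA.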

Edit:

If you want to use this for plotting or modelling, turn m into an ordered factor:

library(data.table)
n <- tstrsplit(unique(m), "-")
n[[1]] <- ordered(n[[1]], levels = c("I", "II", "III", "IV", 
                                    "V", "VI", "VII", "VIII", 
                                    "IX", "X", "XI", "XII"))
n[[2]] <- as.integer(n[[2]])
n <- rev(n)    

m <- ordered(m, levels = unique(m)[do.call(order, n)])
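
If you would rather avoid the data.table dependency, the same idea can be sketched in base R. The helper name as_period_factor is hypothetical (it is not from this answer or from any package), and the sketch assumes every element has the form "NUMERAL-YEAR" with numerals from I to XII:

```r
# Hypothetical helper: builds the ordered factor using base R only
as_period_factor <- function(x) {
  u <- unique(x)
  parts <- strsplit(u, "-", fixed = TRUE)
  numeral <- vapply(parts, `[`, character(1), 1)
  year <- as.integer(vapply(parts, `[`, character(1), 2))
  lev <- c("I", "II", "III", "IV", "V", "VI",
           "VII", "VIII", "IX", "X", "XI", "XII")
  numeral <- ordered(numeral, levels = lev)
  # Levels are the unique values sorted by year, then by numeral
  ordered(x, levels = u[order(year, numeral)])
}

m <- c("VI-2005", "III-2005", "II-2005", "I-2005", "III-2006", "II-2006",
       "I-2006", "VI-2006", "IV-2007", "III-2007", "II-2007", "I-2007")
f <- as_period_factor(m)

levels(f)                  # chronological level order
as.character(sort(f))      # the sorted strings
```

Because the ordering lives in the factor levels, functions that respect level order (sort, table, plotting axes) pick it up without further work.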

You can use as.roman to convert the Roman numerals to objects of class "roman", which compare by numeric value, and then use order to sort the data frame.

data <- strcapture('(\\w+)-(\\d+)', m, 
                   proto = list(roman = character(), number = numeric()))
data$roman <- as.roman(data$roman)
data <- data[do.call(order, rev(data)), ]

#   roman number
#4      I   2005
#3     II   2005
#2    III   2005
#1     VI   2005
#7      I   2006
#6     II   2006
#5    III   2006
#8     VI   2006
#12     I   2007
#11    II   2007
#10   III   2007
#9     IV   2007

If you want the strings back, you can paste the columns together:

do.call(paste, c(data, sep = '-'))
# [1] "I-2005"   "II-2005"  "III-2005" "VI-2005"  "I-2006"   "II-2006"  "III-2006"
# [8] "VI-2006"  "I-2007"   "II-2007"  "III-2007" "IV-2007" 

You could also use the as.roman function. This ensures that, when sorting, 6 (VI) comes before 9 (IX), which a plain alphabetical sort would not do.

s <- read.table(text=m, sep="-")
with(s, do.call(paste, c(sep="-", s[order(V2, as.roman(V1)), ])))

 [1] "I-2005"   "II-2005"  "III-2005" "VI-2005"  "I-2006"   "II-2006"  "III-2006" "VI-2006" 
 [9] "I-2007"   "II-2007"  "III-2007" "IV-2007" 
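
The point about VI and IX can be checked directly. This is a small sketch with made-up input (the vector x is not from the question), relying only on as.roman from the utils package, which is attached by default:

```r
# Illustrative input only (not from the question)
x <- c("IX", "VI", "IV", "XII")

r <- as.roman(x)
values <- as.integer(r)            # numeric value of each numeral

# Order by numeric value instead of by spelling
by_value <- x[order(values)]

values
by_value
```

Unlike a fixed vector of factor levels, as.roman is not limited to I to XII, so it also works if the numerals run higher.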
