Sort character vector by multiple numbers
I have a sample character vector with file names like this:
> vector
[1] "1 Janu 1998.txt" "2 Feb. 1999.txt" "3 Marc 1999.txt"
[4] "2 February 1998.txt" "3 March. 1998.txt" "1 Jan 1999.txt"
I would like to sort the elements by year and month (the first number of each element). So I do this:
> library(gtools)
> mixedsort(vector)
[1] "1 Janu 1998.txt" "1 Jan 1999.txt" "2 February 1998.txt"
[4] "2 Feb. 1999.txt" "3 Marc 1999.txt" "3 March. 1998.txt"
If I use sort(vector) I get the same output. I have read several related questions, but I have not found a specific answer to this. I would be grateful if someone could help me. Thanks in advance.
I would like to get the following output:
> output
[1] "1 Janu 1998.txt" "2 February 1998.txt" "3 March. 1998.txt"
[4] "1 Jan 1999.txt" "2 Feb. 1999.txt" "3 Marc 1999.txt"
We can do:
v <- c("1 Jan 1998.txt", "2 Feb. 1999.txt", "3 March 1999.txt", "2 Feb 1998.txt", "3 March. 1998.txt","1 Jan 1999.txt")
v[order(as.Date(gsub("\\.", "", v), "%d %b %Ytxt"))];
#[1] "1 Jan 1998.txt" "2 Feb 1998.txt" "3 March. 1998.txt"
#[4] "1 Jan 1999.txt" "2 Feb. 1999.txt" "3 March 1999.txt"
Explanation: We use as.Date to convert the entries in vector v to dates; order then sorts those dates chronologically, that is by year, then month, then day.
Note that some of the entries in vector v contain a period after the month; I am not sure whether this is by accident, but the gsub command takes care of those.
The same is achieved with:
v[order(as.Date(gsub("(\\.|\\.txt)", "", v), "%d %b %Y"))];
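To make the one-liner easier to follow, here is a minimal sketch that splits it into its intermediate steps. The variable names cleaned, dates and idx are only illustrative, and the Sys.setlocale call is an assumption added here because %b parses month names according to the current LC_TIME locale; it is not part of the original answer.

```r
v <- c("1 Jan 1998.txt", "2 Feb. 1999.txt", "3 March 1999.txt",
       "2 Feb 1998.txt", "3 March. 1998.txt", "1 Jan 1999.txt")

# Assumption: force English month names so that %b parses "Jan", "Feb", "March"
invisible(Sys.setlocale("LC_TIME", "C"))

# Step 1: drop the ".txt" extension, then any remaining periods
cleaned <- gsub(".", "", sub("\\.txt$", "", v), fixed = TRUE)

# Step 2: parse "day month year"; on input %b accepts abbreviated and full month names
dates <- as.Date(cleaned, format = "%d %b %Y")

# Step 3: order() returns the index permutation that sorts the dates
idx <- order(dates)
sorted <- v[idx]
```

If any element of dates comes back as NA, the month name was not recognised, which is the situation the non-standard abbreviations create.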
To address non-standard abbreviations of month names, I would define a custom map that links non-standard names to standard names/abbreviations. Then you can do something like this:
v <- c("1 Janu 1998.txt", "2 Feb. 1999.txt", "3 Marc 1999.txt",
       "2 February 1998.txt", "3 March. 1998.txt", "1 Jan 1999.txt")
# Define a map from non-standard to standard month names/abbreviations
map <- c(
    Janu = "Jan",
    Marc = "March")
# Separate day, month and year from the filename and store them in a matrix
mat <- sapply(gsub("(\\.|\\.txt)", "", v), function(x)
    unlist(strsplit(x, " ")))
# Replace non-standard month names
mat[2, ] <- ifelse(
    !is.na(match(mat[2, ], names(map))),
    map[match(mat[2, ], names(map))],
    mat[2, ])
# Convert to Date, then to numeric
dmy <- as.numeric(apply(mat, 2, function(x)
    as.Date(paste0(x, collapse = "-"), format = "%d-%b-%Y")))
# Order according to dmy
v[order(dmy)]
#[1] "1 Janu 1998.txt" "2 February 1998.txt" "3 March. 1998.txt"
#[4] "1 Jan 1999.txt" "2 Feb. 1999.txt" "3 Marc 1999.txt"
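Since the question states that the first number of each element already encodes the month, a different technique from the one above is also possible: skip date parsing and order directly on the two numbers. The following is only a sketch under two assumptions: every element starts with that number followed by a space, and every element ends with a four-digit year followed by ".txt". The names num, year and sorted are illustrative.

```r
v <- c("1 Janu 1998.txt", "2 Feb. 1999.txt", "3 Marc 1999.txt",
       "2 February 1998.txt", "3 March. 1998.txt", "1 Jan 1999.txt")

# Leading number of each element (assumed to be the month number)
num <- as.integer(sub("^([0-9]+) .*$", "\\1", v))

# Four-digit year immediately before the ".txt" extension
year <- as.integer(sub("^.*([0-9]{4})\\.txt$", "\\1", v))

# Sort by year first, then by the leading number
sorted <- v[order(year, num)]
```

This avoids the month-name map and any locale dependence, at the cost of trusting the leading number rather than the month name.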