How to order a data frame by a numeric vector and a character vector
I would like to order a data frame by a numeric vector and a character vector so that I can remove duplicates in the Code column, retaining the records with the highest value in the Value column. However, if my Category column has a "YS" or "YS1", then I want to retain those records even if the Value isn't the highest number.
Here's a sample data set:
Code <- c(2,2,3,5,3,7,8)
Value <- c(17,18,35,25,67,34,2)
Category <- c("YS", "DW", "YS1", "OS", "OS", "OS1", "GD")
Dataset <- data.frame(Code, Value, Category)
  Code Value Category
1    2    17       YS
2    2    18       DW
3    3    35      YS1
4    5    25       OS
5    3    67       OS
6    7    34      OS1
7    8     2       GD
When I order the data by Code (ascending) and Value (descending) and remove the duplicate records by Code, my "YS" record for Code = 2 is not retained because it has a lower Value.
order_data <- Dataset[order(Dataset$Code, -Dataset$Value),]
dataset_nodup <- order_data[!duplicated(order_data$Code),]
  Code Value Category
2    2    18       DW
5    3    67       OS
4    5    25       OS
6    7    34      OS1
7    8     2       GD
I'd like to first order by my Category column and then my Value column so that my "YS" and "YS1" records are listed first. I have tried the following, but it is not working:
order_data <- Dataset[order(Dataset$Code, -Dataset$Category, -Dataset$Value),]
I would like my output to look like:
  Code Value Category
1    2    17       YS
2    3    35      YS1
3    5    25       OS
4    7    34      OS1
5    8     2       GD
We can use match to bring the rows whose Category is "YS" or "YS1" to the front and then remove duplicates:
order_data <- Dataset[with(Dataset, order(match(Category, c("YS", "YS1")),
                                          Code, -Value)), ]
dataset_nodup <- order_data[!duplicated(order_data$Code),]
dataset_nodup
#   Code Value Category
# 1    2    17       YS
# 3    3    35      YS1
# 4    5    25       OS
# 6    7    34      OS1
# 7    8     2       GD
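To see why this works, it may help to look at the sort key on its own. The following is a small sketch that reuses the Category vector from the question; the name priority is only illustrative and does not appear in the original code.

```r
Category <- c("YS", "DW", "YS1", "OS", "OS", "OS1", "GD")

# match() returns the position of each element in the lookup vector,
# or NA when the element is not found there
priority <- match(Category, c("YS", "YS1"))
priority
# [1]  1 NA  2 NA NA NA NA

# order() places NA last by default (na.last = TRUE), so the "YS" and
# "YS1" rows sort ahead of all other rows
order(priority)
# [1] 1 3 2 4 5 6 7
```

Because the priority key comes first in order(), the "YS" and "YS1" rows reach the top of the data frame, and !duplicated() then keeps them in preference to any later row with the same Code.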
Or using dplyr:
library(dplyr)
Dataset %>%
  arrange(match(Category, c("YS", "YS1")), Code, desc(Value)) %>%
  filter(!duplicated(Code))
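One detail worth noting: match ranks "YS" ahead of "YS1", and because it is the first sort key, the result is ordered by priority before Code. If the two categories should instead be treated as equally protected and the result kept in Code order, a base R variant might look like the sketch below. This is an assumption about the desired behaviour, not something the question states, and is_protected, ord, and result are illustrative names.

```r
Code <- c(2, 2, 3, 5, 3, 7, 8)
Value <- c(17, 18, 35, 25, 67, 34, 2)
Category <- c("YS", "DW", "YS1", "OS", "OS", "OS1", "GD")
Dataset <- data.frame(Code, Value, Category)

# TRUE for rows that must win within their Code group (assumed rule)
is_protected <- Dataset$Category %in% c("YS", "YS1")

# Sort by Code first; within each Code, protected rows come first
# (FALSE sorts before TRUE, hence the negation), then highest Value
ord <- order(Dataset$Code, !is_protected, -Dataset$Value)
result <- Dataset[ord, ]

# Keep the first row of each Code group
result <- result[!duplicated(result$Code), ]
result
```

On the sample data this keeps the same five rows as the match version (original rows 1, 3, 4, 6, and 7); the two approaches would only differ when a single Code has both a "YS" and a "YS1" record, or when the final row order matters.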