R: returning the 5 rows with the highest values
Sample data:
mysample <- data.frame(ID = 1:100, kWh = rnorm(100))
I'm trying to automate the process of returning the rows in a data frame that contain the 5 highest values in a certain column. In the sample data, the 5 highest values in the "kWh" column can be found using the code:
(tail(sort(mysample$kWh), 5))
which in my case returns:
[1] 1.477391 1.765312 1.778396 2.686136 2.710494
I would like to create a table containing the rows that have these numbers in column 2. I am attempting to use this code:
mysample[mysample$kWh == (tail(sort(mysample$kWh), 5)),]
This returns:
ID kWh
87 87 1.765312
I would like it to return the 5 rows that contain the figures above in the "kWh" column. I'm sure I've missed something basic, but I can't figure it out.
We can use rank:
mysample$Rank <- rank(-mysample$kWh)
head(mysample[order(mysample$Rank),],5)
If we don't need to create a column, we can use order directly (three alternative methods, as @Jaap mentioned):
#order descending and get the first 5 rows
head(mysample[order(-mysample$kWh),],5)
#order ascending and get the last 5 rows
tail(mysample[order(mysample$kWh),],5)
#or just use a sequence as the row index (the trailing comma selects rows, not columns)
mysample[order(-mysample$kWh),][1:5,]
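As for why the attempt in the question returned only one row: `==` compares two vectors element by element, so the 5 top values are recycled across the 100 rows and a row matches only if its value happens to sit at the right position in that recycled pattern. A minimal sketch of a membership-based alternative using `%in%` follows. The `set.seed` call and the names `top5`, `by_equals` and `by_in` are assumptions added for illustration, not part of the original post.

```r
# Reproducible sample data (the seed is an assumption, not in the question)
set.seed(42)
mysample <- data.frame(ID = 1:100, kWh = rnorm(100))

# The 5 highest kWh values, in ascending order
top5 <- tail(sort(mysample$kWh), 5)

# `==` recycles top5 along the column, so the comparison is position-wise
by_equals <- mysample[mysample$kWh == top5, ]

# `%in%` tests each kWh value for membership in top5 instead
by_in <- mysample[mysample$kWh %in% top5, ]

# The order()-based approach from the answer, for comparison
by_order <- head(mysample[order(-mysample$kWh), ], 5)
```

Note that `by_in` keeps the rows in their original order rather than sorted by kWh, and it can return more than 5 rows if the cut-off value is tied, so the `order` approach is usually the more direct one.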