Find the first, second and third maximum values and their corresponding column names for each row of a data frame in R

Question

I am trying to find the first, second, and third maximum values and the corresponding column names for each row, but I am unable to do that in R. Please help.

Here is what the data frame looks like:

              X1    X2    X3   X4    X5   X6   X7    X8    X9    X10   X11  X12   
      10003   0.0   0.0   0.0  0.0   0.0  0.0  0.0   0.0   0.0   0.0   0.0  0.0       
      10006   0.0   0.0   0.0  0.0   0.0  0.0 16.7   0.0   0.0   0.0   0.0  0.0       
      10007   0.0   0.0   0.0  0.0   0.0  0.0  0.0   0.0   0.0   0.0   0.0  0.0       
      10008   0.0   0.0   0.0  0.0   0.0  0.0  0.0   0.0   0.0   0.0   0.0  0.0       
      10010   0.0   0.0   0.0  0.0   0.0  0.0  0.0   0.0   0.0   0.0   0.0  0.0       
      10014   0.0   0.0   0.0  0.0   0.0  0.0  0.0   0.0   0.0   0.0   0.0  0.0

Answer 1

This is the sample data you posted in your comment:

data <-read.table(text="       x1    x2    x3     x4    x5    x6   x7   x8    x9
                        1003    0  45.7     0   22.9     0  13.7    0    0  23.1 
                        1004 22.2     0  13.2      0   5.4     0  9.7    0     0 
                        1005    0     0     0     12   2.1     0    0  3.2     0  
                        1006  1.2     0   1.2      0  43.9  43.9    0    0  57.6",
                    header=T)

You can use dplyr and the tidyverse to achieve this.

The following code gives you the three largest values across all rows:

library(dplyr)
library(tidyverse)

data %>% 
  rownames_to_column() %>%
  gather(column, value, -rowname) %>%
  group_by(rowname) %>% 
  arrange(desc(value)) %>% 
  head(3)

This will give you the following result:

# A tibble: 3 x 3
# Groups:   rowname [3]
#   rowname column value
#   <chr>   <chr>  <dbl>
# 1 1006    x9      57.6
# 2 1003    x2      45.7
# 3 1006    x5      43.9

If you want the three largest values for each row, you can do it as follows:

result <- data %>% 
  rownames_to_column() %>%
  gather(column, value, -rowname) %>%
  group_by(rowname) %>% 
  mutate(max = rank(-value)) %>%
  filter(max <= 3) %>% 
  arrange(rowname, max)

Which will give you the following result:

# A tibble: 12 x 4
# Groups:   rowname [4]
#    rowname column value   max
#    <chr>   <chr>  <dbl> <dbl>
#  1 1003    x2      45.7   1  
#  2 1003    x9      23.1   2  
#  3 1003    x4      22.9   3  
#  4 1004    x1      22.2   1  
#  5 1004    x3      13.2   2  
#  6 1004    x7       9.7   3  
#  7 1005    x4      12     1  
#  8 1005    x8       3.2   2  
#  9 1005    x5       2.1   3  
# 10 1006    x9      57.6   1  
# 11 1006    x5      43.9   2.5
# 12 1006    x6      43.9   2.5
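
Note that rank() averages tied values by default, which is why x5 and x6 both get 2.5 in the output above. A tie straddling the third position would get rank 3.5 for both values, and both would be filtered out. A minimal sketch, assuming you want exactly three rows per group, passes ties.method = "first" so that ties are broken by column order. The name result_first is made up for this illustration:

```r
library(dplyr)
library(tidyr)
library(tibble)

# Same sample data as above
data <- read.table(text = "       x1    x2    x3     x4    x5    x6   x7   x8    x9
                        1003    0  45.7     0   22.9     0  13.7    0    0  23.1
                        1004 22.2     0  13.2      0   5.4     0  9.7    0     0
                        1005    0     0     0     12   2.1     0    0  3.2     0
                        1006  1.2     0   1.2      0  43.9  43.9    0    0  57.6",
                   header = TRUE)

result_first <- data %>%
  rownames_to_column() %>%
  gather(column, value, -rowname) %>%
  group_by(rowname) %>%
  # ties.method = "first" gives tied values distinct integer ranks,
  # in the order they appear within the group (here: column order)
  mutate(max = rank(-value, ties.method = "first")) %>%
  filter(max <= 3) %>%
  arrange(rowname, max)

print(result_first)
```

With this change the max column holds the integers 1, 2 and 3 for every row, at the cost of an arbitrary (column-order) choice between tied values.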

To summarize the result for each row as a single string, use the following code:

result %>% 
  mutate(result = paste0(column, "=", value, collapse = ", ")) %>% 
  select(result) %>% 
  distinct()

Which will give you the following result:

# A tibble: 4 x 2
# Groups:   rowname [4]
#   rowname result                   
#   <chr>   <chr>                    
# 1 1003    x2=45.7, x9=23.1, x4=22.9
# 2 1004    x1=22.2, x3=13.2, x7=9.7 
# 3 1005    x4=12, x8=3.2, x5=2.1    
# 4 1006    x9=57.6, x5=43.9, x6=43.9

Hope it helps.

Answer 2

Here is my approach:

 # Make up data because yours is pretty unreproducible:
 df <- data.frame(X1 = 1:5, X2 = c(3, 5, 1, 6, 7))

 # Flatten the data frame and keep the three largest values:
 a <- sort(unlist(df), decreasing = TRUE)[1:3]

 # Loop over the values and print their row/column indexes:
 for (i in seq_along(a)) {
   print(which(df == a[i], arr.ind = TRUE))
 }

This prints the row and column indexes of the three largest values in the data frame. Replace print with whatever you want to do (e.g. assign the result to a variable).

Answer 3

You can use

max.names = apply(data, 1, function(x) names(sort(x, decreasing = T)[1:3]))
max.vals = apply(data, 1, function(x) sort(x, decreasing = T)[1:3])
data = cbind(data, t(max.names), t(max.vals))
#        x1   x2   x3   x4   x5   x6  x7  x8   x9  1  2  3    1    2    3
# 1003  0.0 45.7  0.0 22.9  0.0 13.7 0.0 0.0 23.1 x2 x9 x4 45.7 23.1 22.9
# 1004 22.2  0.0 13.2  0.0  5.4  0.0 9.7 0.0  0.0 x1 x3 x7 22.2 13.2  9.7
# 1005  0.0  0.0  0.0 12.0  2.1  0.0 0.0 3.2  0.0 x4 x8 x5 12.0  3.2  2.1
# 1006  1.2  0.0  1.2  0.0 43.9 43.9 0.0 0.0 57.6 x9 x5 x6 57.6 43.9 43.9
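
The appended columns above end up with the duplicated names 1, 2, 3. As a rough sketch of the same idea with explicit column names, the helper below returns the n largest values per row together with their column names. The function name top_n_by_row and the output names name1, value1, etc. are hypothetical, chosen only for this illustration:

```r
# Same sample data as in Answer 1
data <- read.table(text = "       x1    x2    x3     x4    x5    x6   x7   x8    x9
                        1003    0  45.7     0   22.9     0  13.7    0    0  23.1
                        1004 22.2     0  13.2      0   5.4     0  9.7    0     0
                        1005    0     0     0     12   2.1     0    0  3.2     0
                        1006  1.2     0   1.2      0  43.9  43.9    0    0  57.6",
                   header = TRUE)

# Hypothetical helper: n largest values per row plus their column names
top_n_by_row <- function(df, n = 3) {
  m <- as.matrix(df)
  # Column positions of the n largest values in each row (one row of idx per row of m)
  idx <- t(matrix(apply(m, 1, function(x) order(x, decreasing = TRUE)[seq_len(n)]),
                  nrow = n))
  # Look up the values with (row, column) index pairs
  vals <- matrix(m[cbind(as.vector(row(idx)), as.vector(idx))], nrow = nrow(m))
  nms <- matrix(colnames(m)[idx], nrow = nrow(m))
  out <- data.frame(nms, vals, row.names = rownames(m), stringsAsFactors = FALSE)
  names(out) <- c(paste0("name", seq_len(n)), paste0("value", seq_len(n)))
  out
}

top3 <- top_n_by_row(data, n = 3)
print(top3)
```

Rows that contain only zeros, like most rows in the question's data, still return three column names with value 0, so filter on the value columns if those should be excluded.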
