Find first, second, and third maximum of each row and their corresponding column names in a data frame in R

I am trying to find the first, second, and third maximum values and the corresponding column names for each row, but I have not been able to do it in R. Please help.

Here is what the data frame looks like:

              X1    X2    X3   X4    X5   X6   X7    X8    X9    X10   X11  X12   
      10003   0.0   0.0   0.0  0.0   0.0  0.0  0.0   0.0   0.0   0.0   0.0  0.0       
      10006   0.0   0.0   0.0  0.0   0.0  0.0 16.7   0.0   0.0   0.0   0.0  0.0       
      10007   0.0   0.0   0.0  0.0   0.0  0.0  0.0   0.0   0.0   0.0   0.0  0.0       
      10008   0.0   0.0   0.0  0.0   0.0  0.0  0.0   0.0   0.0   0.0   0.0  0.0       
      10010   0.0   0.0   0.0  0.0   0.0  0.0  0.0   0.0   0.0   0.0   0.0  0.0       
      10014   0.0   0.0   0.0  0.0   0.0  0.0  0.0   0.0   0.0   0.0   0.0  0.0   

This is the sample data you posted in your comment:

data <-read.table(text="       x1    x2    x3     x4    x5    x6   x7   x8    x9
                        1003    0  45.7     0   22.9     0  13.7    0    0  23.1 
                        1004 22.2     0  13.2      0   5.4     0  9.7    0     0 
                        1005    0     0     0     12   2.1     0    0  3.2     0  
                        1006  1.2     0   1.2      0  43.9  43.9    0    0  57.6",
                    header=T)

You can use dplyr and the rest of the tidyverse to achieve this.


The following code returns the three largest values across all rows combined, along with the row and column each one came from:

library(dplyr)
library(tidyverse)

data %>% 
  rownames_to_column() %>%
  gather(column, value, -rowname) %>%
  group_by(rowname) %>% 
  arrange(desc(value)) %>% 
  head(3) 

This will give you the following result:

# A tibble: 3 x 3
# Groups:   rowname [3]
#   rowname column value
#   <chr>   <chr>  <dbl>
# 1 1006    x9      57.6
# 2 1003    x2      45.7
# 3 1006    x5      43.9

If you want to get the three largest values for each row, you can do it as follows:

result <- data %>% 
  rownames_to_column() %>%
  gather(column, value, -rowname) %>%
  group_by(rowname) %>% 
  mutate(max = rank(-value)) %>%
  filter(max <= 3) %>% 
  arrange(rowname, max)

Which will give you the following result:

# A tibble: 12 x 4
# Groups:   rowname [4]
#    rowname column value   max
#    <chr>   <chr>  <dbl> <dbl>
#  1 1003    x2      45.7   1  
#  2 1003    x9      23.1   2  
#  3 1003    x4      22.9   3  
#  4 1004    x1      22.2   1  
#  5 1004    x3      13.2   2  
#  6 1004    x7       9.7   3  
#  7 1005    x4      12     1  
#  8 1005    x8       3.2   2  
#  9 1005    x5       2.1   3  
# 10 1006    x9      57.6   1  
# 11 1006    x5      43.9   2.5
# 12 1006    x6      43.9   2.5
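
Note the rank of 2.5 for the tied values in row 1006: `rank()` averages tied ranks by default, so `filter(max <= 3)` can keep fewer or more than three entries when a tie falls on the third position. The following is a minimal base R sketch of that behaviour, using made-up vectors (`tie_at_third` and `three_way` are hypothetical, not from the question). If exactly three entries per row are required, passing `ties.method = "first"` to `rank()` is one option.

```r
# Hypothetical rows chosen to show how ties interact with `rank(-x) <= 3`.
tie_at_third <- c(a = 10, b = 8, c = 5, d = 5)  # 3rd and 4th values tie
three_way    <- c(a = 10, b = 5, c = 5, d = 5)  # 2nd to 4th values tie

# Default ties.method = "average": tied values share the mean of their ranks.
avg_third <- rank(-tie_at_third)  # 1, 2, 3.5, 3.5
avg_three <- rank(-three_way)     # 1, 3, 3, 3

# The `<= 3` filter keeps two entries in one case and four in the other.
kept_third <- names(tie_at_third)[avg_third <= 3]  # "a" "b"
kept_three <- names(three_way)[avg_three <= 3]     # "a" "b" "c" "d"

# ties.method = "first" breaks ties by position, so exactly three survive.
kept_first <- names(tie_at_third)[rank(-tie_at_third, ties.method = "first") <= 3]
print(kept_first)  # "a" "b" "c"
```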

To summarize the result for each row, use the following code:

result %>% 
  mutate(result = paste0(column, "=", value, collapse = ", ")) %>% 
  select(result) %>% 
  distinct()

Which will give you the following result:

# A tibble: 4 x 2
# Groups:   rowname [4]
#   rowname result                   
#   <chr>   <chr>                    
# 1 1003    x2=45.7, x9=23.1, x4=22.9
# 2 1004    x1=22.2, x3=13.2, x7=9.7 
# 3 1005    x4=12, x8=3.2, x5=2.1    
# 4 1006    x9=57.6, x5=43.9, x6=43.9
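
As a possible variation on the per-row pipeline above, here is a sketch using the newer `pivot_longer()` and `slice_max()` verbs, assuming tidyr 1.0 or later and dplyr 1.0 or later are installed. It is not the answer's original code. The name `top3_long` is made up for illustration, and `with_ties = FALSE` is a choice that forces exactly three rows per group.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Sample data from the answer above.
data <- read.table(text = "       x1    x2    x3     x4    x5    x6   x7   x8    x9
1003    0  45.7     0   22.9     0  13.7    0    0  23.1
1004 22.2     0  13.2      0   5.4     0  9.7    0     0
1005    0     0     0     12   2.1     0    0  3.2     0
1006  1.2     0   1.2      0  43.9  43.9    0    0  57.6",
                   header = TRUE)

top3_long <- data %>%
  rownames_to_column("rowname") %>%                 # keep the row IDs as a column
  pivot_longer(-rowname, names_to = "column", values_to = "value") %>%
  group_by(rowname) %>%
  slice_max(value, n = 3, with_ties = FALSE) %>%    # top three per row, ties cut off
  ungroup()

print(top3_long)
```

Unlike `head(3)`, `slice_max()` respects the grouping, so no separate ranking column is needed.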


Hope it helps.

Here is my approach:

 # Make up data, because yours is not reproducible:
 df <- data.frame(X1 = 1:5, X2 = c(3, 5, 1, 6, 7))

 # Flatten the data frame and keep the three largest values
 # (unlist() stands in for the deprecated dplyr::combine()):
 a <- sort(unlist(df, use.names = FALSE), decreasing = TRUE)[1:3]

 # Loop over those values and print their row/column indices:
 for (i in seq_along(a)) {
    print(which(df == a[i], arr.ind = TRUE))
 }

This prints the row and column index of each of the three largest values in the data frame. Replace print with whatever you need to do with them (e.g. assign the result to a variable).

You can use apply() over the rows:

max.names = apply(data, 1, function(x) names(sort(x, decreasing = T)[1:3]))
max.vals = apply(data, 1, function(x) sort(x, decreasing = T)[1:3])
data = cbind(data, t(max.names), t(max.vals))
#        x1   x2   x3   x4   x5   x6  x7  x8   x9  1  2  3    1    2    3
# 1003  0.0 45.7  0.0 22.9  0.0 13.7 0.0 0.0 23.1 x2 x9 x4 45.7 23.1 22.9
# 1004 22.2  0.0 13.2  0.0  5.4  0.0 9.7 0.0  0.0 x1 x3 x7 22.2 13.2  9.7
# 1005  0.0  0.0  0.0 12.0  2.1  0.0 0.0 3.2  0.0 x4 x8 x5 12.0  3.2  2.1
# 1006  1.2  0.0  1.2  0.0 43.9 43.9 0.0 0.0 57.6 x9 x5 x6 57.6 43.9 43.9
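
The `apply()` answer sorts each row twice and leaves duplicate column names (`1 2 3 1 2 3`) after `cbind()`. A possible base R variant, sketched below, calls `order()` once per row and returns a separate data frame with distinct column names. `top_n_by_row()` and the `name*`/`value*` column names are hypothetical, chosen only for this illustration.

```r
# Sample data from the answer above.
data <- read.table(text = "       x1    x2    x3     x4    x5    x6   x7   x8    x9
1003    0  45.7     0   22.9     0  13.7    0    0  23.1
1004 22.2     0  13.2      0   5.4     0  9.7    0     0
1005    0     0     0     12   2.1     0    0  3.2     0
1006  1.2     0   1.2      0  43.9  43.9    0    0  57.6",
                   header = TRUE)

# Hypothetical helper: names and values of the n largest entries in each row.
top_n_by_row <- function(df, n = 3) {
  m <- as.matrix(df)
  # One row of column indices per input row, largest value first.
  ord <- t(apply(m, 1, order, decreasing = TRUE))[, seq_len(n), drop = FALSE]
  # Use the same index matrix to look up both the names and the values.
  top_names <- matrix(colnames(m)[ord], nrow = nrow(m))
  top_vals  <- matrix(m[cbind(as.vector(row(ord)), as.vector(ord))], nrow = nrow(m))
  colnames(top_names) <- paste0("name", seq_len(n))
  colnames(top_vals)  <- paste0("value", seq_len(n))
  data.frame(top_names, top_vals, row.names = rownames(m))
}

top3 <- top_n_by_row(data)
print(top3)
```

Keeping the result separate from `data` means the original numeric columns are not mixed with character columns. It can still be joined back with `cbind(data, top3)` if a single wide table is wanted.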
