Find the max value within a range, then return the column name for that row

Question

I have a data.frame whose column names start with the prefix X followed by a series of numbers. For example,

col<-c("X1.1","X1.2","X1.3","X1.4","X1.5","X2.1","X2.2","X2.3","X2.4","X2.5","X3.1","X3.2","X3.3","X3.4","X3.5")
m<-matrix(sample(1:15),ncol=15,nrow=5)
mf<-data.frame(m)
colnames(mf)<-col

Then I want to find the max value for each row within the columns prefixed X1 (four columns in total), X2 (four columns), X3 (four columns), ... and return the column number (the number that follows the X prefix) of the max value.

So my expected output is

    X1  X2  X3  X4
1    4   2   4  ...
...

Can anyone help me with this? And if there are two max values in a row, I would like to return both column names as well.

I searched and found that `which` should be used, but I'm not sure.

Answer 1

Recreate example data (please provide a reproducible example or use dput in the future):

df = data.frame(matrix(rep(NA,12*3),nrow=3))
colnames(df) = strsplit("X1.1 X1.2 X1.3 X1.4 X2.1 X2.2 X2.3 X2.4 X3.1 X3.2 X3.3 X3.4",split=" ")[[1]]
# fill every column with three random values from 1 to 10
df[] = lapply(df, function(x) sample(1:10,3))

Get the different kinds of colnames:

# the prefix is everything before the first period, e.g. "X1.3" -> "X1"
xTypes = unique(sapply(colnames(df), function(x) { strsplit(x,"\\.")[[1]][1] } ))

Get the max per colname kind (taken over all rows of the matching columns):

result = sapply(xTypes,function(x) { max(df[,grep(paste("^",x,"\\.",sep=""),colnames(df))])  })

> result
X1 X2 X3 
 9 10  9

(The data is random, so the values differ from run to run.)

If you want the index, within each colname kind, of the column that holds that maximum:

result = sapply(xTypes,function(x) { which.max(apply(df[,grep(paste("^",x,"\\.",sep=""),colnames(df))],2,max))  })
names(result) = xTypes

Now the result is:

X1 X2 X3 
 1  2  1
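
The snippets above take one maximum per prefix over all rows. The question asks for the result row by row, with ties reported as well, so here is a minimal base R sketch of that variant. The helper name `row_max_cols` and the `toy` data are made up for illustration and are not part of the original answer; it assumes all columns are numeric and named `<prefix>.<number>`.

```r
# Hypothetical helper: for every row, report which column number(s) inside
# each prefix group hold that row's maximum. Ties are joined with commas.
row_max_cols <- function(d) {
  prefix <- sub("\\.\\d+$", "", colnames(d))  # "X1.3" -> "X1"
  suffix <- sub("^.*\\.", "", colnames(d))    # "X1.3" -> "3"
  out <- lapply(split(seq_along(d), prefix), function(idx) {
    block <- as.matrix(d[, idx, drop = FALSE])
    # for each row, keep the suffix of every column equal to the row max
    apply(block, 1, function(r) paste(suffix[idx][r == max(r)], collapse = ","))
  })
  data.frame(out, stringsAsFactors = FALSE)
}

# Small illustrative data set (assumed, not from the question)
toy <- data.frame(X1.1 = c(1, 7), X1.2 = c(5, 7),
                  X2.1 = c(9, 2), X2.2 = c(3, 4))
res <- row_max_cols(toy)
print(res)
```

For `toy`, the first row gives X1 = "2" and X2 = "1"; the second row has a tie in X1, so it gives X1 = "1,2" and X2 = "2".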

Answer 2

To reshape your data use the following:

library(reshape2)
mf.melted <- melt(data=mf)
mf.melted$group <- unlist(gsub("\\.\\d+$", "", as.character(mf.melted$variable)))
mf.melted

Dissection of this line: `unlist(gsub("\\.\\d+$", "", as.character(mf.melted$variable)))`

## Original column names are now stored as column `'variable'` in `mf.melted`
mf.melted$variable

## Notice it is a `factor` column. So needs to be converted to string. This is done with:
as.character(  __  )

## Next we remove the `.3` (or whatever number) from each.
## the regular expression `\\.\\d+$` looks for
`\\.`  # a literal period
`\\d`  # a digit
`\\d+` # one or more digits
`$`    # the end of the string

## gsub finds every match of the pattern (first argument) and replaces it
## with the second argument, in this case an empty string
gsub("\\.\\d+$", "",  __ )

## We then assign the results back into a new column, namely `'group'`
mf.melted$group <-   __
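
To see the pattern at work outside the data frame, here is a small sketch on a few made-up column names (the vector `nms` is an assumption for illustration). It also shows a second pattern, not used in the answer, that keeps only the number after the period. `sub` is enough here because a pattern anchored with `$` can match at most once per string.

```r
nms <- c("X1.1", "X1.10", "X2.3", "X3.5")

# drop the trailing ".<digits>" to get the group prefix
group <- sub("\\.\\d+$", "", nms)

# drop everything up to and including the last period to get the number
index <- sub("^.*\\.", "", nms)

print(group)  # "X1" "X1" "X2" "X3"
print(index)  # "1" "10" "3" "5"
```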

Now, with your melted data.frame, you can easily search and aggregate by the `group` column.

head(mf.melted)
  variable value group
1     X1.1     3    X1
2     X1.1     4    X1
3     X1.1    12    X1
4     X1.1    14    X1
5     X1.1     7    X1
6     X1.2     6    X1
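
One possible way to carry this through to the per-row result the question asks for is sketched below. It is not part of the original answer: it assumes reshape2 is installed, adds a hypothetical `row` id column before melting so that rows can be told apart, and uses `set.seed` only to make the run repeatable.

```r
library(reshape2)

# Rebuild the question's data (seed added here as an assumption)
set.seed(1)
col <- c(paste0("X1.", 1:5), paste0("X2.", 1:5), paste0("X3.", 1:5))
mf <- data.frame(matrix(sample(1:15), ncol = 15, nrow = 5))
colnames(mf) <- col

mf$row <- seq_len(nrow(mf))            # remember which row each value came from
long <- melt(mf, id.vars = "row")
long$variable <- as.character(long$variable)
long$group <- sub("\\.\\d+$", "", long$variable)  # "X1.3" -> "X1"
long$index <- sub("^.*\\.", "", long$variable)    # "X1.3" -> "3"

# keep the entries equal to the maximum of their (row, group) cell
is_max <- long$value == ave(long$value, long$row, long$group, FUN = max)
winners <- long[is_max, ]

# join tied column numbers with commas, then spread back to one row per row id
res <- aggregate(index ~ row + group, data = winners, FUN = paste, collapse = ",")
wide <- dcast(res, row ~ group, value.var = "index")
print(wide)
```

With the question's data, ties are guaranteed: the 15 sampled values are recycled to fill 75 cells, so X1.1 is identical to X1.4 and X1.2 to X1.5, and the tied numbers show up together in one cell.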

Answer 1
3 Accepted 2014-01-29 03:33:59

Answer 2
2 2014-01-29 06:40:50
