Add a new column to a data frame based on the results of other columns

Question

I'm very new to R, so I hope my question will be interesting. What I want to do is quite straightforward. Here's a sample of my dataset:

> head(belongliness)
   ACTIVITY_X ACTIVITY_Y ACTIVITY_Z   Event  cluster1    cluster2     cluster3    cluster4
1:         40         47         62 Head-up 0.1900989 0.768225365 0.0160654667 0.025610279
2:         60         74         95 Head-up 0.5392218 0.038558310 0.0064671635 0.415752686
3:         62         63         88 Head-up 0.7953673 0.044981152 0.0067121719 0.152939414
4:         60         56         82 Head-up 0.9941016 0.002608879 0.0003007537 0.002988748
5:         66         61         90 Head-up 0.7027407 0.048318016 0.0079239680 0.241017291
6:         60         53         80 Head-up 0.9541378 0.023338896 0.0024442116 0.020079071

I would like to create a new column "winning cluster" to the right of column "cluster4". For each row, "winning cluster" should find the highest value among columns "cluster1" to "cluster4" and display the name of that column.

For row 1 that will be cluster2, for row 2 cluster1, for row 3 cluster1, etc.

Any help is appreciated!

Answer 1

If the dataset is a data.table, specify the columns of interest in .SDcols, get the column index of the highest value in each row with max.col, use that index to select the column name, and assign it (:=) as 'winning_cluster':

library(data.table)
belongliness[, winning_cluster := names(.SD)[max.col(.SD)], 
           .SDcols = cluster1:cluster4]
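
To try this on the rows from the question, here is a minimal reproducible sketch. It makes a few assumptions that are not in the answer above: the values are the first three sample rows rounded to three decimals, the columns are passed as a character vector (cluster_cols, a name chosen for illustration) instead of a range, and ties.method = "first" is added so the result is deterministic.

```r
library(data.table)

# First three rows of the question's sample, rounded for brevity (assumption)
belongliness <- data.table(
  ACTIVITY_X = c(40, 60, 62),
  ACTIVITY_Y = c(47, 74, 63),
  ACTIVITY_Z = c(62, 95, 88),
  Event      = "Head-up",
  cluster1   = c(0.190, 0.539, 0.795),
  cluster2   = c(0.768, 0.039, 0.045),
  cluster3   = c(0.016, 0.006, 0.007),
  cluster4   = c(0.026, 0.416, 0.153)
)

# Columns to compare, given by name (illustrative helper variable)
cluster_cols <- paste0("cluster", 1:4)

# max.col() returns, per row, the position of the largest value within .SD;
# names(.SD) turns that position into the column name
belongliness[, winning_cluster := names(.SD)[max.col(.SD, ties.method = "first")],
             .SDcols = cluster_cols]

print(belongliness$winning_cluster)
```

Because := adds the column by reference, the new column lands to the right of cluster4 without copying the table.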

Answer 2

In base R, this is easily done. To get the highest value in each row:

belongliness$`winning cluster` = apply(belongliness[,5:8], 1, max)

where belongliness[,5:8] corresponds to columns cluster1 through cluster4.

Or, if you want the name of the winning column instead, take its index and paste it onto 'cluster':

belongliness$`winning cluster` = apply(belongliness[,5:8], 1, which.max)
belongliness$`winning cluster` = paste0('cluster', belongliness$`winning cluster`)

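Edit: the right-hand side of the first line above (the which.max call) is essentially max.col. One difference: max.col breaks ties at random by default, whereas which.max returns the first maximum; pass ties.method = "first" to get the same behavior.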

belongliness$`winning cluster` = max.col(belongliness[,5:8])
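
As a check that the two base R routes agree, here is a small self-contained sketch on a plain data.frame. The data are the first three sample rows rounded to three decimals, and df, cluster_cols, idx_maxcol and idx_apply are names introduced here for illustration only. Selecting the columns by name rather than by position 5:8 is a choice made for this sketch so it keeps working if the column order changes.

```r
# First three rows of the question's sample, rounded for brevity (assumption)
df <- data.frame(
  ACTIVITY_X = c(40, 60, 62),
  ACTIVITY_Y = c(47, 74, 63),
  ACTIVITY_Z = c(62, 95, 88),
  Event      = "Head-up",
  cluster1   = c(0.190, 0.539, 0.795),
  cluster2   = c(0.768, 0.039, 0.045),
  cluster3   = c(0.016, 0.006, 0.007),
  cluster4   = c(0.026, 0.416, 0.153)
)

cluster_cols <- paste0("cluster", 1:4)

# Vectorised: position of the row maximum, first column wins on ties
idx_maxcol <- max.col(df[, cluster_cols], ties.method = "first")

# Row-by-row equivalent; unname() drops the row names apply() attaches
idx_apply <- unname(apply(df[, cluster_cols], 1, which.max))

# Map the position back to the column name
df$winning_cluster <- cluster_cols[idx_maxcol]

print(df$winning_cluster)
print(all(idx_maxcol == idx_apply))
```

max.col works on the whole matrix at once, so it is usually the better choice on large tables; the apply version is mainly useful for seeing what happens per row.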

Answer 1
2 Accepted 2019-03-12 11:11:04

Answer 2
1 2019-03-12 11:34:45
