
R - choose the number which appears most in a row

I have a data frame test:

A   B   C
1   1   NA
2   NA  NA
1   2   2

I want to create another column, say test$D, which holds the number that appears most often in that row, excluding NA. My desired df is:

A   B   C   D
1   1   NA  1
2   NA  NA  2
1   2   2   2

I have been looking for a function similar to rowMeans with na.rm=T, but could not find an appropriate one for this situation. I would really appreciate any help.

Another option, using table (which drops NA values by default):

apply(test, 1, function(i) as.numeric(names(sort(-table(i)))[1]))
#[1] 1 2 2
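To see why this works, here is a rough step-by-step sketch of what happens for a single row. The vector x below is a made-up example row, not taken from the question's data, and sort(..., decreasing = TRUE) is used in place of sort(-table(i)), which should give the same ordering.

```r
# x is a hypothetical example row containing one NA
x <- c(1, 2, 2, NA)

# table() counts each distinct value; NA is dropped by default
counts <- table(x)

# put the most frequent value first
sorted <- sort(counts, decreasing = TRUE)

# the names of a table are character strings, so convert back to numeric
most_frequent <- as.numeric(names(sorted)[1])
most_frequent
# [1] 2
```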

We can use apply with MARGIN = 1 to compute the frequency of each number in every row, then pick the most frequent one with which.max:

test$D <- apply(test, 1, FUN = function(x) {
  x1 <- table(factor(x, levels = unique(x)))
  as.numeric(names(x1)[which.max(x1)])
})
test$D
#[1] 1 2 2
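For anyone who wants to run this end to end, below is a minimal self-contained sketch. It rebuilds the example data frame from the question and wraps the logic in a helper called row_mode, which is a hypothetical name and not a base R function. It is also an assumption on my part that an all-NA row should return NA and that ties should go to the value appearing first in the row; the question does not specify either case.

```r
# Rebuild the example data frame from the question
test <- data.frame(A = c(1, 2, 1),
                   B = c(1, NA, 2),
                   C = c(NA, NA, 2))

# row_mode is a hypothetical helper, not part of base R.
# It returns the most frequent non-NA value in x, or NA if x has none.
row_mode <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) return(NA_real_)   # assumption: all-NA row gives NA
  ux <- unique(x)
  # count occurrences of each unique value, in order of first appearance;
  # which.max picks the first maximum, so ties go to the earliest value
  ux[which.max(tabulate(match(x, ux)))]
}

test$D <- apply(test[c("A", "B", "C")], 1, row_mode)
test$D
# [1] 1 2 2
```

Selecting the columns explicitly with test[c("A", "B", "C")] keeps the result stable if the code is re-run after D has already been added.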


 