Replace values that lie far from the row mean with NA
I would like to take the mean of each row of my data and find out how far each value in the row is from that mean, as a percentage of the mean. If the percentage is higher than 50, the value should be replaced with NA.
Here is the data:
structure(list(Name = structure(c(18L, 19L, 5L, 13L, 14L, 31L
), .Label = c("AMC Javelin", "Cadillac Fleetwood", "Camaro Z28",
"Chrysler Imperial", "Datsun 710", "Dodge Challenger", "Duster 360",
"Ferrari Dino", "Fiat 128", "Fiat X1-9", "Ford Pantera L", "Honda Civic",
"Hornet 4 Drive", "Hornet Sportabout", "Lincoln Continental",
"Lotus Europa", "Maserati Bora", "Mazda RX4", "Mazda RX4 Wag",
"Merc 230", "Merc 240D", "Merc 280", "Merc 280C", "Merc 450SE",
"Merc 450SL", "Merc 450SLC", "Pontiac Firebird", "Porsche 914-2",
"Toyota Corolla", "Toyota Corona", "Valiant", "Volvo 142E"), class = "factor"),
mpg_1 = c(125, 133, 143, 141, 134, 238), cyl_1 = c(114, 153,
112, 136, 128, 155), disp_1 = c(113, 143, 144, 131, 431,
331), hp_1 = c(332, 221, 113, 331, 134, 151)), .Names = c("Name",
"mpg_1", "cyl_1", "disp_1", "hp_1"), row.names = c(NA, 6L), class = "data.frame")
And this is the desired output:
Name mpg_1 cyl_1 disp_1 hp_1
1 Mazda RX4 125 114 113 NA
2 Mazda RX4 Wag 133 153 143 221
3 Datsun 710 143 112 144 113
4 Hornet 4 Drive 141 136 131 NA
5 Hornet Sportabout 134 128 NA 134
6 Valiant 238 155 331 151
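To make the rule concrete, the percentage for the first row can be worked out by hand. This is only a sketch of one possible reading of "how far from the mean" (the signed deviation divided by the row mean); the names `row1` and `rel_dev` are made up for illustration.

```r
# Values of row 1 (Mazda RX4), copied from the data above
row1 <- c(mpg_1 = 125, cyl_1 = 114, disp_1 = 113, hp_1 = 332)

# Signed deviation from the row mean, as a fraction of that mean
rel_dev <- (row1 - mean(row1)) / mean(row1)

# Names of the values that lie more than 50% above the row mean
names(rel_dev)[rel_dev > 0.5]
```

The row mean is 171, so hp_1 = 332 lies about 94% above it, while the other three values are below the mean.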
There are two conditions as well: [...] NA. It's hard to believe that with a 50% cutoff there would be two such values in one row, because the mean would change completely, but look at the second condition. Do you have any idea how to do this efficiently? It looks doable with a loop, but maybe there is a more efficient way?
From a statistical point of view, as @Roland mentions in the comments, this is not advised. But if you absolutely have to do it, then:
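# Replace the first value in x that lies more than a fraction n above mean(x)
fun1 <- function(x, n){
  idx <- which((x - mean(x))/mean(x) > n)[1]  # NA if no value exceeds the cutoff
  x[idx] <- NA                                # an NA index leaves x unchanged
  return(x)
}
# Apply row-wise to every column except Name; apply() returns rows as columns, hence t()
df1[-1] <- t(apply(df1[-1], 1, fun1, 0.5))
df1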
# Name mpg_1 cyl_1 disp_1 hp_1
#1 Mazda RX4 125 114 113 NA
#2 Mazda RX4 Wag 133 153 143 221
#3 Datsun 710 143 112 144 113
#4 Hornet 4 Drive 141 136 131 NA
#5 Hornet Sportabout 134 128 NA 134
#6 Valiant 238 155 NA 151
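Since the question asks about avoiding a loop, here is a minimal sketch of a matrix-based variant that uses rowMeans() instead of apply(). It is an assumption-laden alternative, not the method from the answer: the helper name `mask_high` and the matrix `m` are hypothetical, and unlike `fun1` it masks every value above the cutoff in a row rather than only the first one.

```r
# Numeric columns only, with values copied from the question's data
m <- cbind(
  mpg_1  = c(125, 133, 143, 141, 134, 238),
  cyl_1  = c(114, 153, 112, 136, 128, 155),
  disp_1 = c(113, 143, 144, 131, 431, 331),
  hp_1   = c(332, 221, 113, 331, 134, 151)
)

# Hypothetical helper: set to NA every value more than a fraction n above its row mean
mask_high <- function(m, n) {
  mu <- rowMeans(m)
  # mu has one entry per row; recycling runs down the columns,
  # so element [i, j] is compared against mu[i]
  rel_dev <- (m - mu) / mu
  m[rel_dev > n] <- NA
  m
}

m_out <- mask_high(m, 0.5)
m_out
```

With a data frame like `df1`, the result could be written back with something like `df1[-1] <- mask_high(as.matrix(df1[-1]), 0.5)`. Note that row 6 has a mean of 218.75, so 331 lies about 51% above it and is masked here, just as in the output above, whereas the desired output in the question keeps that value.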