Quick manipulation of data frame in R
I have the following example data frame:
> a = data.frame(a=c(1, 2, 3), b=c(10, 11, 12), c=c(1, 1, 0))
> a
  a  b c
1 1 10 1
2 2 11 1
3 3 12 0
I want to apply an operation to every row: if a$c == 1, set a$a = a$b; otherwise a$a keeps its value. The final data frame a should look like this:
> a
   a  b c
1 10 10 1
2 11 11 1
3  3 12 0
What is the fastest way to do this? My real data has hundreds of thousands of rows, so looping over the entire data frame one row at a time is extremely slow.
Thanks!
Easy as 1-2-3:
df = data.frame(a=c(1, 2, 3), b=c(10, 11, 12), c=c(1, 1, 0))
df$a[df$c == 1] <- df$b[df$c == 1]
df
##    a  b c
## 1 10 10 1
## 2 11 11 1
## 3  3 12 0
It reads: replace the elements of a where c == 1 with the elements of b where c == 1.
A benchmark:
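As a small variation on the same idea, the condition can be computed once and reused on both sides of the assignment. This is a minimal sketch rather than part of the original answer; the name idx is introduced here purely for illustration. Wrapping the comparison in which() also drops any NA values in c, which would otherwise end up as NA entries in the logical index.

```r
# Same example data as in the question
df <- data.frame(a = c(1, 2, 3), b = c(10, 11, 12), c = c(1, 1, 0))

# Compute the row positions once; which() returns integer positions
# of TRUE values and silently skips NA
idx <- which(df$c == 1)

# Overwrite a with b only at those positions
df$a[idx] <- df$b[idx]
df
```

With the example data this leaves a equal to 10, 11, 3, the same result as the one-line version.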
df <- data.frame(a=runif(100000), b=runif(100000), c=sample(c(1,0), 100000, replace=TRUE))
library(microbenchmark)
microbenchmark(df$a[df$c == 1] <- df$b[df$c == 1], df$a <- with(df, ifelse(c == 1, b, a)))
## Unit: milliseconds
##                                    expr      min       lq    median       uq       max neval
##      df$a[df$c == 1] <- df$b[df$c == 1] 13.85375 15.13073  16.61701  74.5387  88.47949   100
##  df$a <- with(df, ifelse(c == 1, b, a)) 44.23750 78.85029 103.01894 105.1750 118.09492   100
a$a <- with(a, ifelse(c == 1, b, a))
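To check that the logical-indexing and ifelse() approaches agree, one could run both on copies of the example data and compare the resulting column. This is only a sketch; the names by_index and by_ifelse are hypothetical and do not appear in the answers above.

```r
# Example data from the question
a <- data.frame(a = c(1, 2, 3), b = c(10, 11, 12), c = c(1, 1, 0))

# Approach 1: logical indexing on both sides of the assignment
by_index <- a
by_index$a[by_index$c == 1] <- by_index$b[by_index$c == 1]

# Approach 2: vectorised ifelse() evaluated inside the data frame
by_ifelse <- a
by_ifelse$a <- with(by_ifelse, ifelse(c == 1, b, a))

# TRUE when both approaches produce the same column
same <- isTRUE(all.equal(by_index$a, by_ifelse$a))
same
```

Both are vectorised, so the choice between them is mostly a matter of speed and readability; the benchmark above suggests the indexing form is faster on 100,000 rows.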