R - work on data frame rows based on condition
I'm trying to understand how I can work on the rows of a data frame based on a condition. Given a data frame like this:
> d<-data.frame(x=c(0,1,2,3), y=c(1,1,1,0))
> d
x y
1 0 1
2 1 1
3 2 1
4 3 0
how can I add 1 to every element of each row that contains a zero (note that zeros can be found in any column), so that the result would look like this:
x y
1 1 2
2 1 1
3 2 1
4 4 1
The following code seems to do part of the job, but it only prints the rows where the action was taken, as many times as it was taken (2)...
> for(i in 1:nrow(d)){
+ d[d[i,]==0,]<-d[i,]+1
+ }
> d
x y
1 1 2
2 4 1
3 1 2
4 4 1
I'm sure there is a simple solution for this, maybe an apply function, but I'm not getting there.
Thanks.
Some possibilities:
# 1
idx <- which(d == 0, arr.ind = TRUE)[, 1]
d[idx, ] <- d[idx, ] + 1
# 2
t(apply(d, 1, function(x) x + any(x == 0)))
# 3
d + apply(d == 0, 1, max)
The usage of which for vectors, e.g. which(1:3 > 2), is quite common, whereas it is used less often for matrices: by specifying arr.ind = TRUE we get array indices, i.e. the coordinates of every 0:
which(d == 0, arr.ind = TRUE)
row col
[1,] 1 1
[2,] 4 2
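To make the contrast concrete, here is a small sketch of which with and without arr.ind = TRUE on the same data frame d from the question. The variable names (pos, linear, coords, zero_rows) are mine and are only there for illustration.

```r
# which() on a logical vector returns the positions of the TRUE values
pos <- which(1:3 > 2)

d <- data.frame(x = c(0, 1, 2, 3), y = c(1, 1, 1, 0))

# Without arr.ind, a matrix is treated as one long vector (column by column),
# so the zeros are reported as linear positions
linear <- which(d == 0)

# With arr.ind = TRUE each zero is reported as a (row, col) pair
coords <- which(d == 0, arr.ind = TRUE)

# The first column holds the row numbers; unname() drops any labels
zero_rows <- unname(coords[, "row"])
```

Here pos is 3, linear is 1 and 8 (the first and the eighth cell of a 4 x 2 matrix), and zero_rows is 1 and 4.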
Since we are interested only in the rows where zeros occur, I take the first column of which(d == 0, arr.ind = TRUE) and add 1 to all the elements in these rows with d[idx, ] <- d[idx, ] + 1.
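As a sketch of the first approach on slightly different data: when a row holds more than one zero, its number appears more than once in idx. The data frame d2 below is a made-up example, and wrapping idx in unique() is my own addition rather than part of the answer; it simply keeps each row index once before the update.

```r
# Hypothetical data: row 1 contains two zeros, row 3 contains one
d2 <- data.frame(x = c(0, 1, 2), y = c(0, 1, 0))

# Row numbers of every zero; row 1 is listed twice here
idx <- which(d2 == 0, arr.ind = TRUE)[, 1]

# Keep each affected row once (an optional tidy-up, not in the original answer)
idx <- unique(idx)

# Add 1 to every element of the affected rows
d2[idx, ] <- d2[idx, ] + 1
```

After this, rows 1 and 3 have been incremented exactly once and row 2 is unchanged.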
Regarding the second approach, apply(d, 1, function(x) x) would simply go row by row and return each row without any modification. With any(x == 0) we check whether there are any zeros in a particular row and get TRUE or FALSE. However, by writing x + any(x == 0) we convert TRUE or FALSE to 1 or 0, respectively, as required.
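A short sketch of the second approach, with one detail worth knowing: apply returns a matrix, not a data frame. The conversion back with as.data.frame and the names m and res are my additions for illustration.

```r
d <- data.frame(x = c(0, 1, 2, 3), y = c(1, 1, 1, 0))

# In arithmetic, TRUE counts as 1 and FALSE as 0
one_more <- 5 + TRUE
unchanged <- 5 + FALSE

# apply() over rows returns one column per input row, hence the t()
m <- t(apply(d, 1, function(x) x + any(x == 0)))

# m is a numeric matrix; convert it if a data frame is needed downstream
res <- as.data.frame(m)
```

The column names x and y survive the round trip, so res can be used in place of the updated d.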
Now the third approach. d == 0 is a logical matrix, and we use apply to go over its rows. When max is applied to a particular row, TRUE and FALSE are again converted to 1 and 0, and the largest element is returned. This element is 1 if and only if there is a zero in that row. Hence, apply(d == 0, 1, max) returns a vector of zeros and ones, one per row. The final point is that when we write A + b, where A is a matrix and b is a vector, b is recycled down the columns, so the addition is column-wise. In this way, by writing d + apply(d == 0, 1, max) we add apply(d == 0, 1, max) to every column of d, as needed.
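The recycling step can be checked in isolation. The sketch below first adds a length-4 vector to a 4 x 2 matrix of zeros, then repeats the third approach on d. The names A, b, flag, res3 and flag2 are illustrative, and the rowSums variant at the end is an alternative I am suggesting, not something from the answer.

```r
# A vector added to a matrix is recycled down each column in turn
A <- matrix(0, nrow = 4, ncol = 2)
b <- c(1, 0, 0, 1)
S <- A + b          # rows 1 and 4 become 1 in both columns

d <- data.frame(x = c(0, 1, 2, 3), y = c(1, 1, 1, 0))

# 1 for rows that contain a zero, 0 otherwise
flag <- apply(d == 0, 1, max)

# The same vector is added to every column of d
res3 <- d + flag

# Alternative (my assumption of an equivalent): count zeros per row instead
flag2 <- as.integer(rowSums(d == 0) > 0)
```

Because the length of flag equals the number of rows, element i of flag ends up added to every value in row i.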