Conditional replacement of column values in a dataframe using R
Let's make a dummy dataset:
ll = data.frame(rbind(c(2,3,5), c(3,4,6), c(9,4,9)))
colnames(ll)<-c("b", "c", "a")
> ll
b c a
1 2 3 5
2 3 4 6
3 9 4 9
P = data.frame(cbind(c(3,5), c(4,6), c(8,7)))
colnames(P)<-c("a", "b", "c")
> P
a b c
1 3 4 8
2 5 6 7
I want to create a new dataframe in which each value in a column of ll is set to 0 when it is less than the corresponding value of a, b, or c in the first row of P. In other words, I'd like to see:
> new_ll
b c a
1 0 0 5
2 0 0 6
3 9 0 9
So I tried it this way:
nn=c("a", "b", "c")
new_ll = sapply(nn, function(i)
ll[,paste0(i)][ll[,paste0(i)] < P[,paste0(i)][1]] <- 0)
But it doesn't work for some reason! I must be making a silly mistake in my script. Any ideas? This is what I get instead:
> new_ll
a b c
0 0 0
You can find the values in ll that are smaller than the first row of P with an apply:
t(apply(ll, 1, function(x) x<P[1,][colnames(ll)]))
[,1] [,2] [,3]
[1,] TRUE TRUE FALSE
[2,] TRUE TRUE FALSE
[3,] FALSE TRUE FALSE
Here, the first row of P is reordered to match the column order of ll, then the elements are compared.
Credit to Ananda Mahto for recognizing that apply is not required:
ll < c(P[1, names(ll)])
b c a
[1,] TRUE TRUE FALSE
[2,] TRUE TRUE FALSE
[3,] FALSE TRUE FALSE
The TRUE values show where you want to substitute 0:
ll[ ll < c(P[1, names(ll)]) ] <- 0
ll
b c a
1 0 0 5
2 0 0 6
3 9 0 9
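Note that the assignment above overwrites ll itself. If you would rather keep ll and get the result in a separate new_ll, as the question asks, a minimal sketch along the same lines is to copy first. The name thr is only introduced here for illustration, to show what the reordered first row of P looks like:

```r
# Same toy data as in the question
ll <- data.frame(b = c(2, 3, 9), c = c(3, 4, 4), a = c(5, 6, 9))
P  <- data.frame(a = c(3, 5), b = c(4, 6), c = c(8, 7))

# First row of P, reordered to ll's column order (b, c, a).
# 'thr' is a hypothetical name used only for this illustration.
thr <- unlist(P[1, names(ll)])
thr
# b c a
# 4 8 3

# Work on a copy so that ll itself is left untouched
new_ll <- ll
new_ll[ll < c(P[1, names(ll)])] <- 0
new_ll
```

Indexing P by names(ll) is what lines the thresholds up with the right columns; using P[1, ] directly would compare b against the threshold for a, and so on.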
To fix your code, you want something like this:
do.call(cbind, lapply(names(ll), function(i) {
ll[,i][ll[,i] < P[,i][1]] <- 0
return(ll[i])}))
b c a
1 0 0 5
2 0 0 6
3 9 0 9
What's changed? First, sapply is changed to lapply, and the function now returns the modified column on each iteration (as a one-column data frame) rather than the value of the assignment. Second, the names are presented in the correct order for the expected results. Third, the results are put together with cbind to get the final data frame. As a bonus, the redundant calls to paste0 have been removed.
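To see why the original attempt produced only zeros, here is a small sketch on a plain vector. The function names f_bad and f_good are made up for this illustration. In R, a function returns the value of its last expression, and the value of an assignment is the assigned value; the assignment also only changes the function's local copy.

```r
x <- c(2, 3, 9)

# The last expression is an assignment, so the function returns the
# assigned value (0), not the modified vector.
f_bad <- function(v) {
  v[v < 4] <- 0
}

# Returning the modified object explicitly gives the intended result.
f_good <- function(v) {
  v[v < 4] <- 0
  v
}

res_bad  <- f_bad(x)    # 0
res_good <- f_good(x)   # 0 0 9
x                       # still 2 3 9: only the local copy was changed
```

This is the same pattern as in the question's sapply call, which is why each column came back as a single 0.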
You could also try mapply, which applies a function to the corresponding elements of its arguments. Here, ll and P are both data.frames, so the function is applied column by column, with recycling. I matched the column names of P with those of ll (similar to @Matthew Lundberg) and checked which elements in each column of ll are less than the corresponding value in the first row of P (that single value gets recycled). This returns a logical index. Then the elements that match the logical condition are set to 0.
indx <- mapply(`<`, ll, P[1,][names(ll)])
new_ll <- ll
new_ll[indx] <- 0
new_ll
# b c a
#1 0 0 5
#2 0 0 6
#3 9 0 9
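As a rough check of what mapply is doing here, the sketch below compares it with the same test written out per column name. The names thr, indx_mapply and indx_byname are assumptions made for this illustration only.

```r
# Same toy data as in the question
ll <- data.frame(b = c(2, 3, 9), c = c(3, 4, 4), a = c(5, 6, 9))
P  <- data.frame(a = c(3, 5), b = c(4, 6), c = c(8, 7))

# One-row data frame with columns in ll's order (b, c, a)
thr <- P[1, names(ll)]

# mapply pairs the columns of ll and thr by position; each single
# threshold is recycled against the three values in its column
indx_mapply <- mapply(`<`, ll, thr)

# The same comparison, looked up explicitly by column name
indx_byname <- sapply(names(ll), function(nm) ll[[nm]] < thr[[nm]])

# Both approaches should flag exactly the same cells
all(indx_mapply == indx_byname)
```

Because mapply pairs columns by position rather than by name, reordering P with names(ll) first is what makes the comparison correct.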
If you know that ll and P are numeric, you can also do it like this:
llm <- as.matrix(ll)
pv <- as.numeric(P[1, colnames(llm)])
llm[sweep(llm, 2, pv, `<`)] <- 0
data.frame(llm)
# b c a
# 1 0 0 5
# 2 0 0 6
# 3 9 0 9
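If this is needed more than once, the matrix approach could be wrapped in a small function. The sketch below is only one possible way to do that: zero_below is a hypothetical helper name, not something from the answers above, and it assumes all columns are numeric and that every column of the data has a same-named column in the threshold data frame. It uses a strict `<`, so values equal to the threshold are kept.

```r
# Hypothetical helper (not from the original answers): set values that are
# strictly below the per-column threshold to 0.
zero_below <- function(df, thresholds) {
  # Assumption: all columns are numeric and present in 'thresholds'
  stopifnot(all(vapply(df, is.numeric, logical(1))))
  stopifnot(all(names(df) %in% names(thresholds)))

  # First row of 'thresholds', reordered to df's column order
  thr <- unlist(thresholds[1, names(df), drop = FALSE])

  m <- as.matrix(df)
  m[sweep(m, 2, thr, `<`)] <- 0   # compare each column with its threshold
  as.data.frame(m)
}

# Usage with the toy data from the question
ll <- data.frame(b = c(2, 3, 9), c = c(3, 4, 4), a = c(5, 6, 9))
P  <- data.frame(a = c(3, 5), b = c(4, 6), c = c(8, 7))
new_ll <- zero_below(ll, P)
new_ll
```

Converting to a matrix is convenient for sweep, but it would coerce everything to character if a non-numeric column were present, which is why the sketch checks for numeric columns up front.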