How to mutate a new column in a data frame using a specific function and conditions? Tidyverse/R
First of all, I couldn't find a question related to my issue, so apologies if this has already been answered.
I have a data frame with some columns and I want to calculate a new value using a specific equation. I guess I have to use mutate() from the tidyverse, but I want to exclude rows/samples that contain one or more 0 values. I don't know how to check for a 0 while using mutate(). I also don't know how to apply my specific formula to create the new column.
Here is some code that creates a data frame as an example of my issue.
library(lubridate)  # provides now() and hours()
set.seed(123)
df <- data.frame(
  time = seq(now(), now() + hours(11), by = 'hours'),
  a = sample(0:100, 12),
  b = sample(0:100, 12),
  c = sample((0:20)/1000, 12))
# introduce some invalid (zero) readings
df[1:3,]$a <- 0
df[3:5,]$b <- 0
df[3:4,]$c <- 0
# function: M = a*b*(1-e^(-c/2))
# if any 0 in the row -> M = NA
# else: apply function
The function could be written as:
a*b*(1-exp(-c/2))
The final df should have, for each hour (row), the columns a, b, c and the newly calculated M, but M should be NA whenever a, b or c is 0.
I would be very grateful for any help. Cheers!
EDIT: The real function is more complex than this example, so it will not always be true that if one term (a, b, c, ...) is 0, the resulting M is 0. Sorry, I didn't realise that this holds for the simplified equation. I still want to exclude any 0 value, because the values come from monitoring physiological variables and I know that if one value in a sample is 0, the sample is wrong, so M should be NA.
If any of a, b or c is 0, the formula returns M as 0, which can then be changed to NA with na_if().
library(dplyr)
df %>%
  mutate(M = a*b*(1-exp(-c/2)),
         M = na_if(M, 0))
#                  time  a  b     c         M
#1  2021-10-18 19:41:56  0 90 0.013        NA
#2  2021-10-18 20:41:56  0 56 0.016        NA
#3  2021-10-18 21:41:56  0  0 0.000        NA
#4  2021-10-18 22:41:56 13  0 0.000        NA
#5  2021-10-18 23:41:56 66  0 0.011        NA
#6  2021-10-19 00:41:56 41 71 0.014 20.305847
#7  2021-10-19 01:41:56 49 25 0.009  5.500115
#8  2021-10-19 02:41:56 42  6 0.012  1.507473
#9  2021-10-19 03:41:56 97 41 0.017 33.661237
#10 2021-10-19 04:41:56 24 97 0.008  9.293401
#11 2021-10-19 05:41:56 89 82 0.019 69.002718
#12 2021-10-19 06:41:56 68 35 0.015 17.783230
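The na_if() approach relies on M being 0 whenever an input is 0, which the EDIT in the question says will not hold for the real function. A minimal sketch that tests the inputs directly instead could look like the following. The small data frame and the helper name calc_M are made up for illustration (calc_M just wraps the simplified formula from the question), and if_any() assumes dplyr 1.0.4 or later.

```r
library(dplyr)

# Small illustrative data frame standing in for the question's df
df <- data.frame(
  a = c(0, 13, 41, 49),
  b = c(90, 0, 71, 25),
  c = c(0.013, 0.000, 0.014, 0.009)
)

# Hypothetical helper wrapping the simplified formula from the question;
# the real, more complex function would go here instead
calc_M <- function(a, b, c) a * b * (1 - exp(-c / 2))

result <- df %>%
  mutate(
    # TRUE for rows where at least one input column is exactly 0
    has_zero = if_any(c(a, b, c), ~ .x == 0),
    # NA for flagged rows, the formula otherwise; NA_real_ keeps M numeric
    M = if_else(has_zero, NA_real_, calc_M(a, b, c))
  ) %>%
  select(-has_zero)

print(result)
```

The if_any() call is equivalent to writing a == 0 | b == 0 | c == 0 by hand; it is mainly convenient when there are many input columns. Note that if_else() still evaluates calc_M() on every row and only discards the results for flagged rows, so this assumes the real function does not raise an error on zero inputs.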