Data Transformations based on certain transformation criteria
I want to transform a dataset based on certain conditions. These conditions are given in another dataset. Let me explain with an example.
Suppose I have a dataset in the following format:
Date Var1 Var2
3/1/2016 8 14
3/2/2016 7 8
3/3/2016 7 6
3/4/2016 10 8
3/5/2016 5 10
3/6/2016 9 15
3/7/2016 2 5
3/8/2016 6 14
3/9/2016 8 15
3/10/2016 8 8
The transformation conditions are stored in a second dataset with the following format:
Variable Trans1 Trans2
Var1 1||2 0.5||0.7
Var2 1||2 0.3||0.8
Now I want to extract the first conditions for Var1 from the transformation table (1 and 0.5), add 1 to Var1, and multiply the result by 0.5. I'll do the same for Var2: add 1 and multiply by 0.3. This transformation gives me the new variables Var1_1 and Var2_1. I'll do the same for the second transformation, which gives me Var1_2 and Var2_2. For Var1_2, the transformation is Var1 plus 2, multiplied by 0.7.
After the transformation, the dataset will look like the following:
Date Var1 Var2 Var1_1 Var2_1 Var1_2 Var2_2
3/1/2016 8 14 4.5 4.5 7 11.2
3/2/2016 7 8 4 2.7 6.3 7
3/3/2016 7 6 4 2.1 6.3 5.6
3/4/2016 10 8 5.5 2.7 8.4 7
3/5/2016 5 10 3 3.3 4.9 8.4
3/6/2016 9 15 5 4.8 7.7 11.9
3/7/2016 2 5 1.5 1.8 2.8 4.9
3/8/2016 6 14 3.5 4.5 5.6 11.2
3/9/2016 8 15 4.5 4.8 7 11.9
3/10/2016 8 8 4.5 2.7 7 7
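To make the example reproducible, the two tables above could be rebuilt as data frames along these lines. The names df and cond1 are assumptions (chosen to match the names used in the answer), as is storing the conditions as character strings rather than factors.

```r
# Example data from the question, rebuilt as data frames.
# df and cond1 are assumed names; stringsAsFactors = FALSE is an assumption too.
df <- data.frame(
  Date = c("3/1/2016", "3/2/2016", "3/3/2016", "3/4/2016", "3/5/2016",
           "3/6/2016", "3/7/2016", "3/8/2016", "3/9/2016", "3/10/2016"),
  Var1 = c(8, 7, 7, 10, 5, 9, 2, 6, 8, 8),
  Var2 = c(14, 8, 6, 8, 10, 15, 5, 14, 15, 8),
  stringsAsFactors = FALSE
)

cond1 <- data.frame(
  Variable = c("Var1", "Var2"),
  Trans1 = c("1||2", "1||2"),          # values to add
  Trans2 = c("0.5||0.7", "0.3||0.8"),  # multipliers
  stringsAsFactors = FALSE
)

# First transformation for Var1, written out by hand: add 1, multiply by 0.5.
var1_1 <- (df$Var1 + 1) * 0.5
var1_1
```

The hand-written var1_1 should agree with the Var1_1 column in the expected output above.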
Given that your original data.frame is called df and your conditions table is called cond1, we can create a custom function:
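# First condition for Var1: keep the text before the "||" separator
funV1Cond1 <- function(x){
  # t1 is the value to add (Trans1), t2 the multiplier (Trans2);
  # cond1 is looked up in the calling (global) environment
  t1 <- as.numeric(gsub("[||].*", "", cond1$Trans1[cond1$Variable == "Var1"]))
  t2 <- as.numeric(gsub("[||].*", "", cond1$Trans2[cond1$Variable == "Var1"]))
  result <- (x$Var1 + t1)*t2
  return(result)
}
funV1Cond1(df)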
#[1] 4.5 4.0 4.0 5.5 3.0 5.0 1.5 3.5 4.5 4.5
The second condition works the same way:
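# Second condition for Var1: keep the text after the "||" separator
funV1Cond2 <- function(x){
  # the greedy ".*[||]" strips everything up to and including the last "|"
  t1 <- as.numeric(gsub(".*[||]", "", cond1$Trans1[cond1$Variable == "Var1"]))
  t2 <- as.numeric(gsub(".*[||]", "", cond1$Trans2[cond1$Variable == "Var1"]))
  result <- (x$Var1 + t1)*t2
  return(result)
}
funV1Cond2(df)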
#[1] 7.0 6.3 6.3 8.4 4.9 7.7 2.8 5.6 7.0 7.0
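The two functions above are hard-coded for Var1 and for one condition each. A minimal sketch of a more general version, assuming Trans1 always holds the values to add and Trans2 the multipliers, both separated by "||" and with the same number of entries, might look like this. The helper name apply_conditions and the small two-row df are hypothetical and only serve the illustration.

```r
# Hypothetical helper: apply every (add, multiply) pair to every variable
# listed in the conditions table and append the results as new columns.
apply_conditions <- function(data, cond, sep = "||") {
  for (i in seq_len(nrow(cond))) {
    v <- as.character(cond$Variable[i])
    # fixed = TRUE treats "||" literally instead of as a regular expression
    add  <- as.numeric(strsplit(as.character(cond$Trans1[i]), sep, fixed = TRUE)[[1]])
    mult <- as.numeric(strsplit(as.character(cond$Trans2[i]), sep, fixed = TRUE)[[1]])
    stopifnot(length(add) == length(mult))
    for (k in seq_along(add)) {
      # new column named <variable>_<condition index>, e.g. Var1_2
      data[[paste0(v, "_", k)]] <- (data[[v]] + add[k]) * mult[k]
    }
  }
  data
}

# Small illustration using the first two rows of the question's data.
df <- data.frame(Var1 = c(8, 7), Var2 = c(14, 8))
cond1 <- data.frame(
  Variable = c("Var1", "Var2"),
  Trans1 = c("1||2", "1||2"),
  Trans2 = c("0.5||0.7", "0.3||0.8"),
  stringsAsFactors = FALSE
)
out <- apply_conditions(df, cond1)
out
```

The new columns come out grouped by variable (Var1_1, Var1_2, Var2_1, Var2_2) rather than by condition. Also note that with the conditions table as given, Var2_2 is computed as (Var2 + 2) * 0.8; the Var2_2 column in the question's expected output appears to have been computed with 0.7 instead.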
Assuming that the Trans1 column has three conditions, i.e. 1, 2, 3, you can pick out each one by position with str_split from the stringr package:
library(stringr)
as.numeric(sapply(str_split(cond1$Trans1[cond1$Variable == "Var1"], ','),function(x) x[2]))
#[1] 2
as.numeric(sapply(str_split(cond1$Trans1[cond1$Variable == "Var1"], ','),function(x) x[1]))
#[1] 1
as.numeric(sapply(str_split(cond1$Trans1[cond1$Variable == "Var1"], ','),function(x) x[3]))
#[1] 3
Note that I changed the delimiter to a ','.
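If you would rather not depend on stringr, a base R sketch of the same idea is to split once and index the resulting numeric vector. The string trans1 below is a hypothetical stand-in for a single Trans1 entry that uses ',' as the delimiter.

```r
# Hypothetical Trans1 entry with three comma-separated conditions.
trans1 <- "1,2,3"

# Split once on the literal comma and convert every piece to numeric.
conds <- as.numeric(strsplit(trans1, ",", fixed = TRUE)[[1]])

conds[2]  # second condition
conds[3]  # third condition
```

Splitting once and indexing avoids repeating the split for every condition, which matters little here but keeps the code shorter as the number of conditions grows.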