Data Transformations based on certain transformation criteria

I want to transform a dataset based on conditions that are given in a second dataset. Let me explain with an example.

Suppose I have a dataset in the following format:

Date       Var1 Var2
3/1/2016    8   14
3/2/2016    7   8
3/3/2016    7   6
3/4/2016    10  8
3/5/2016    5   10
3/6/2016    9   15
3/7/2016    2   5
3/8/2016    6   14
3/9/2016    8   15
3/10/2016   8   8

A second dataset holds the transformation conditions, in the following format:

Variable    Trans1  Trans2
  Var1       1||2   0.5||0.7
  Var2       1||2   0.3||0.8
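To make the example reproducible, the two tables can be built as R data frames. This is only a sketch: the names df and cond1 are borrowed from the answer further down, and storing the condition cells as plain character strings is an assumption.

```r
# Input data from the question
df <- data.frame(
  Date = c("3/1/2016", "3/2/2016", "3/3/2016", "3/4/2016", "3/5/2016",
           "3/6/2016", "3/7/2016", "3/8/2016", "3/9/2016", "3/10/2016"),
  Var1 = c(8, 7, 7, 10, 5, 9, 2, 6, 8, 8),
  Var2 = c(14, 8, 6, 8, 10, 15, 5, 14, 15, 8),
  stringsAsFactors = FALSE
)

# Transformation table; each cell holds two values separated by "||"
# (assumed to be stored as character strings, not factors)
cond1 <- data.frame(
  Variable = c("Var1", "Var2"),
  Trans1 = c("1||2", "1||2"),
  Trans2 = c("0.5||0.7", "0.3||0.8"),
  stringsAsFactors = FALSE
)

str(df)
str(cond1)
```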

Now I want to take the first pair of conditions for Var1 from the transformation table, 1 and 0.5, add 1 to Var1, and multiply the result by 0.5. I'll do the same for Var2: add 1 and multiply by 0.3. These transformations give me the new variables Var1_1 and Var2_1. Applying the second pair of conditions in the same way gives me Var1_2 and Var2_2. For Var1_2, for example, the transformation is Var1 plus 2, multiplied by 0.7.

After the transformation, the dataset will look like the following:

  Date     Var1 Var2    Var1_1  Var2_1  Var1_2  Var2_2
3/1/2016    8   14       4.5     4.5      7      11.2
3/2/2016    7   8        4       2.7      6.3     7
3/3/2016    7   6        4       2.1      6.3     5.6
3/4/2016    10  8        5.5     2.7      8.4     7
3/5/2016    5   10       3       3.3      4.9     8.4
3/6/2016    9   15       5       4.8      7.7    11.9
3/7/2016    2   5        1.5     1.8      2.8     4.9
3/8/2016    6   14       3.5     4.5      5.6    11.2
3/9/2016    8   15       4.5     4.8      7      11.9
3/10/2016   8   8        4.5     2.7      7       7

Assuming your original data.frame is called df and your conditions table is called cond1, we can create a custom function:

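# Apply the first condition pair for Var1: (Var1 + 1) * 0.5
funV1Cond1 <- function(x){
  # "[||]" is a character class matching a literal "|"; dropping everything
  # from the first "|" onward keeps the first value of each cell
  t1 <- as.numeric(gsub("[||].*", "", cond1$Trans1[cond1$Variable == "Var1"]))
  t2 <- as.numeric(gsub("[||].*", "", cond1$Trans2[cond1$Variable == "Var1"]))
  result <- (x$Var1 + t1)*t2
  return(result)
}
funV1Cond1(df)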
 #[1] 4.5 4.0 4.0 5.5 3.0 5.0 1.5 3.5 4.5 4.5

The second transformation works the same way:

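# Apply the second condition pair for Var1: (Var1 + 2) * 0.7
funV1Cond2 <- function(x){
  # ".*[||]" greedily drops everything up to the last "|",
  # which keeps the last value of each cell
  t1 <- as.numeric(gsub(".*[||]", "", cond1$Trans1[cond1$Variable == "Var1"]))
  t2 <- as.numeric(gsub(".*[||]", "", cond1$Trans2[cond1$Variable == "Var1"]))
  result <- (x$Var1 + t1)*t2
  return(result)
}
funV1Cond2(df)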
 #[1] 7.0 6.3 6.3 8.4 4.9 7.7 2.8 5.6 7.0 7.0
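The two functions above are hard-coded to Var1 and to one condition each. A more general version could loop over every row of the conditions table and every condition pair. The following is a minimal sketch under stated assumptions, not the answerer's method: apply_transforms is a hypothetical helper name, the sample data is repeated so the block runs on its own, and it uses base R strsplit with fixed = TRUE instead of gsub.

```r
# Sample inputs, repeated here so the sketch is self-contained
df <- data.frame(
  Date = c("3/1/2016", "3/2/2016", "3/3/2016", "3/4/2016", "3/5/2016",
           "3/6/2016", "3/7/2016", "3/8/2016", "3/9/2016", "3/10/2016"),
  Var1 = c(8, 7, 7, 10, 5, 9, 2, 6, 8, 8),
  Var2 = c(14, 8, 6, 8, 10, 15, 5, 14, 15, 8),
  stringsAsFactors = FALSE
)
cond1 <- data.frame(
  Variable = c("Var1", "Var2"),
  Trans1 = c("1||2", "1||2"),
  Trans2 = c("0.5||0.7", "0.3||0.8"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each variable and each condition pair k, add a
# column <variable>_<k> computed as (value + Trans1[k]) * Trans2[k]
apply_transforms <- function(data, cond, sep = "||") {
  for (i in seq_len(nrow(cond))) {
    v <- as.character(cond$Variable[i])
    # fixed = TRUE treats the separator literally rather than as a regex
    add  <- as.numeric(strsplit(as.character(cond$Trans1[i]), sep, fixed = TRUE)[[1]])
    mult <- as.numeric(strsplit(as.character(cond$Trans2[i]), sep, fixed = TRUE)[[1]])
    stopifnot(length(add) == length(mult))
    for (k in seq_along(add)) {
      data[[paste0(v, "_", k)]] <- (data[[v]] + add[k]) * mult[k]
    }
  }
  data
}

out <- apply_transforms(df, cond1)
out[, c("Date", "Var1", "Var2", "Var1_1", "Var2_1", "Var1_2", "Var2_2")]
```

With the conditions exactly as listed in the question (a second multiplier of 0.8 for Var2), Var2_2 for the first row comes out as (14 + 2) * 0.8 = 12.8. The 11.2 shown in the question's expected table corresponds to a multiplier of 0.7 instead.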

If the Trans1 column held three conditions, e.g. 1, 2, 3, each one could be extracted by position with str_split (from the stringr package):

library(stringr)  # provides str_split()

# Split the cell on ',' and pick a condition by its position
as.numeric(sapply(str_split(cond1$Trans1[cond1$Variable == "Var1"], ','),function(x) x[2]))
#[1] 2
as.numeric(sapply(str_split(cond1$Trans1[cond1$Variable == "Var1"], ','),function(x) x[1]))
#[1] 1
as.numeric(sapply(str_split(cond1$Trans1[cond1$Variable == "Var1"], ','),function(x) x[3]))
#[1] 3

Note that I changed the delimiter to ','.
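The same positional lookup can be written in base R without stringr. This is a rough sketch only: nth_condition is a hypothetical helper, and the three-value, comma-delimited cond1 below is made-up sample data matching the assumption above.

```r
# Hypothetical conditions table using "," as the delimiter, three values per cell
cond1 <- data.frame(
  Variable = c("Var1", "Var2"),
  Trans1 = c("1,2,3", "1,2,3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the n-th value stored for one variable
# in the given column of the conditions table
nth_condition <- function(cond, variable, column, n, sep = ",") {
  cell <- as.character(cond[[column]][cond$Variable == variable])
  # fixed = TRUE splits on the literal separator, so "||" would also work
  as.numeric(strsplit(cell, sep, fixed = TRUE)[[1]][n])
}

nth_condition(cond1, "Var1", "Trans1", 3)
# [1] 3
```

Asking for a position that does not exist (for example n = 4 here) returns NA, so callers may want to check for that.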
