Separating positive and negative values in an R data.table (multiple columns)

Question

I was wondering how I could split many columns by their sign in a data.table. To be concrete, suppose that we have:

library(data.table)
DT = data.table(x = c(-1,-2,1,3),
                z = c(-1,-1,-1,-1))

I am looking to create a new data.table called DT_new such that it looks like:


 DT_new
    x  z x_pos x_neg z_pos z_neg
1: -1 -1     0     1     0     1
2: -2 -1     0     2     0     1
3:  1 -1     1     0     0     1
4:  3 -1     3     0     0     1

The reason I am doing this is that I want to separate out the positive and negative parts of each variable in a regression. Doing a few of these manually is easy enough, but I have hundreds of variables that I want to apply this technique to. So I am hoping that there is a ".SDcols" solution.

Thanks!

Answer 1

Perhaps:

library(data.table)
DT = data.table(x = c(-1,-2,1,3),
                z = c(-1,-1,-1,-1))

col_nms <- c('x', 'z')
pos_nms <- paste0(col_nms, '_pos')
neg_nms <- paste0(col_nms, '_neg')

DT[, c(pos_nms) := lapply(.SD, function(.x) fifelse(.x > 0, .x, 0)), .SDcols = c('x', 'z')]
DT[, c(neg_nms) := lapply(.SD, function(.x) fifelse(.x < 0, -.x, 0)), .SDcols = c('x', 'z')]

DT
#>     x  z x_pos z_pos x_neg z_neg
#> 1: -1 -1     0     0     1     1
#> 2: -2 -1     0     0     2     1
#> 3:  1 -1     1     0     0     1
#> 4:  3 -1     3     0     0     1

Created on 2021-11-27 by the reprex package (v2.0.1)
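With hundreds of columns, the names do not have to be typed out. Below is a minimal sketch, assuming that every numeric column of DT should be split; the `num_cols` selection is an illustrative assumption, not part of the answer above, and base R's pmax/pmin are swapped in for fifelse.

```r
library(data.table)

DT <- data.table(x = c(-1, -2, 1, 3),
                 z = c(-1, -1, -1, -1))

# Assumption: every numeric column should be split; adjust the selection as needed
num_cols <- names(DT)[vapply(DT, is.numeric, logical(1))]
pos_nms  <- paste0(num_cols, "_pos")
neg_nms  <- paste0(num_cols, "_neg")

# pmax(v, 0) keeps the positive part; -pmin(v, 0) is the magnitude of the negative part
DT[, (pos_nms) := lapply(.SD, function(v) pmax(v, 0)), .SDcols = num_cols]
DT[, (neg_nms) := lapply(.SD, function(v) -pmin(v, 0)), .SDcols = num_cols]

print(DT)
```

Because `num_cols` is computed before any new columns are added, the second call still operates only on the original columns.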

Answer 2

No need to use .SDcols ;-) Please find a reprex below:

Code

DT[,`:=` (x_pos = fifelse(x>0, x, 0),
          x_neg = fifelse(x<0, abs(x), 0),
          z_pos = fifelse(z>0, z, 0),
          z_neg = fifelse(z<0, abs(z), 0))][]

Output

    x  z x_pos x_neg z_pos z_neg
1: -1 -1     0     1     0     1
2: -2 -1     0     2     0     1
3:  1 -1     1     0     0     1
4:  3 -1     3     0     0     1

Answer 3

We could use across with case_when:

library(dplyr)
DT %>% 
  mutate(across(everything(), ~case_when(
    . < 0 ~ 0,
    TRUE ~ .), .names = "{col}_pos")) %>% 
  mutate(across(-contains("pos"), ~case_when(
    . < 0 ~ abs(.),
    TRUE ~ 0), .names = "{col}_neg"))

    x  z x_pos z_pos x_neg z_neg
1: -1 -1     0     0     1     1
2: -2 -1     0     0     2     1
3:  1 -1     1     0     0     1
4:  3 -1     3     0     0     1

Answer 4

library(data.table)
DT = data.table(x = c(-1,-2,1,3),
                z = c(-1,-1,-1,-1))

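vars <- names(DT)

# For each column, build the positive part and the magnitude of the negative part;
# unlist(recursive = FALSE) flattens the result into a single list of columns
DT <- DT[, unlist(lapply(.SD, function(j){
  list(ifelse(j > 0, j, 0),
       ifelse(j < 0, -j, 0))
}), recursive = FALSE)]

# Columns come out in the order x_pos, x_neg, z_pos, z_neg (the originals are dropped)
setnames(DT, paste(rep(vars, each=2), c("_pos", "_neg"), sep=""))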

Answer 5

Another dplyr option:

library(data.table)
library(dplyr, warn.conflicts = FALSE)
DT <- data.table(x = c(-1, -2, 1, 3),
                 z = c(-1, -1, -1, -1))

DT %>%
  mutate(across(everything(), list(
    pos = ~ if_else(. > 0, ., 0),
    neg = ~ if_else(. < 0, -., 0)
  )))
#>     x  z x_pos x_neg z_pos z_neg
#> 1: -1 -1     0     1     0     1
#> 2: -2 -1     0     2     0     1
#> 3:  1 -1     1     0     0     1
#> 4:  3 -1     3     0     0     1

Created on 2021-11-27 by the reprex package (v2.0.1)

If you want to use this syntax but have data.table operations run under the hood, you can use tidytable.

library(data.table)
library(tidytable, warn.conflicts = FALSE)
DT <- data.table(x = c(-1, -2, 1, 3),
                 z = c(-1, -1, -1, -1))

DT %>%
  mutate.(across.(everything(), list(
    pos = ~ if_else(. > 0, ., 0),
    neg = ~ if_else(. < 0, -., 0)
  )))
#> # A tidytable: 4 × 6
#>       x     z x_pos x_neg z_pos z_neg
#>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1    -1    -1     0     1     0     1
#> 2    -2    -1     0     2     0     1
#> 3     1    -1     1     0     0     1
#> 4     3    -1     3     0     0     1

Created on 2021-11-27 by the reprex package (v2.0.1)

5 answers

Answer 1
2 2021-11-27 22:30:50

Answer 2
2 2021-11-27 22:32:42

Answer 3
2 2021-11-27 22:34:55

Answer 4
0 2021-11-27 22:43:47

Answer 5
0 2021-11-28 01:38:42
