
Applying function to a subset of columns depending on a conditional

I have a large data.table with hundreds of columns and thousands of rows. Most of the columns hold numeric values that are ratios like X/Y or Y/Z etc.

I need to flip some of these ratios so that they are transformed from Y/Z -> Z/Y. The only indicator I have of these columns is the column name, which includes the substring "x/y" or "y/z".

I can get the columns that match "y/z" using grepl, but I am not sure how to use that logical vector with apply / lapply etc. I realize that I can extract the columns (by logical indexing or .SDcols) and transform them, but I don't want to discard or ignore the remaining columns.

Lastly, I have tried something like this:

flipcols <- grepl("Y/Z", names(sites))
sites.new <- sites[, , lapply(.SD, function(x) 1/x), .SDcols = flipcols]

but there is no difference between sites and sites.new: the columns that should have been transformed are not, and the summed difference between corresponding columns is 0.

Suggestions?

EDIT: Following @akrun's suggestion, I tried the := operator, but it leads to other issues, as follows:

# I think this fails because flipcols is a logical vector and not a list of names or indices
> sites.new <- sites[, (flipcols) := lapply(.SD, function(x) 1/x), .SDcols = flipcols]
Error in `[.data.table`(sites, , `:=`((flipcols), lapply(.SD, function(x) 1/x)),  : 
  LHS of := isn't column names ('character') or positions ('integer' or 'numeric')


# and this seems to fail because .SDcols seems to lock the data in read-only mode
> sites.new <- sites[, which(flipcols) := lapply(.SD, function(x) 1/x), .SDcols = flipcols]
Error in assign(ii, SDenv$.SDall[[ii]], SDenv) : 
  cannot change value of locked binding for '.SD'

EDIT2: Here's a minimal example. The goal is to transform the columns that match the "Y/Z" pattern (the second and fourth in this example), while keeping the other columns unchanged and part of the result.

> dt <- data.table(matrix(rnorm(25), 5,5))
> names(dt) <- c("X/Y_1", "Y/Z_1", "X/Y_2", "Y/Z_2", "X/Y_3")
> dt
        X/Y_1       Y/Z_1       X/Y_2      Y/Z_2      X/Y_3
1:  1.5972490 -0.01763484  1.10745607 -0.1416583 -0.4632829
2:  0.6629621 -0.82719204 -1.68214956  0.6145526 -0.8169235
3: -0.7491393 -0.05290791  0.63935066  1.0665537 -1.9107424
4: -0.6804972 -0.40107880 -0.01030063  1.4566075 -0.6866042
5:  0.2505391 -0.29091850 -1.95926987  0.8733446  1.3909565

Following your example,

library(data.table)
dt <- data.table(matrix(rnorm(25), 5,5))
names(dt) <- c("X/Y_1", "Y/Z_1", "X/Y_2", "Y/Z_2", "X/Y_3")
dt
         X/Y_1      Y/Z_1      X/Y_2       Y/Z_2       X/Y_3
1: -0.09845804 -0.6455857  0.2259012  1.26772833  1.14451170
2: -1.22147654  1.7643609  0.5310762 -0.46869816 -0.58761886
3: -0.61469060  1.2323381 -0.4028002  0.99903384  0.01650606
4: -0.80805337  0.2733621 -0.2855663 -0.02166544  0.59398122
5: -0.68398344  0.2891335 -0.5004021  2.12063769  0.40474155

I will first match the target columns by name:

sd.cols <- grep("Y/Z", names(dt), value = TRUE)
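
As a side note on the error in the question: the logical vector returned by grepl can be converted into either of the two forms that the left-hand side of := accepts (names or positions). The sketch below uses only base R on a plain vector of column names; nms, cols_by_name and cols_by_pos are names made up for this illustration, and fixed = TRUE is an optional choice that treats "Y/Z" as a literal string rather than a regular expression.

```r
# Column names from the minimal example in the question
nms <- c("X/Y_1", "Y/Z_1", "X/Y_2", "Y/Z_2", "X/Y_3")

# Logical mask, as built in the question
flipcols <- grepl("Y/Z", nms, fixed = TRUE)

# Two equivalent ways to turn the mask into something usable with :=
cols_by_name <- nms[flipcols]    # character vector of column names
cols_by_pos  <- which(flipcols)  # integer vector of column positions

# grep(..., value = TRUE) skips the mask and returns the names directly
sd.cols <- grep("Y/Z", nms, value = TRUE, fixed = TRUE)
```

Either cols_by_name or sd.cols can then be used for both (cols) := ... and .SDcols.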

Then, just change the columns by reference, using standard data.table notation.

# := updates dt in place and returns it invisibly, so print dt afterwards
dt[ , (sd.cols) := lapply(.SD, function(x){x^-1}), .SDcols = sd.cols ]
dt
         X/Y_1      Y/Z_1      X/Y_2       Y/Z_2       X/Y_3
1: -0.09845804 -1.5489811  0.2259012   0.7888125  1.14451170
2: -1.22147654  0.5667775  0.5310762  -2.1335693 -0.58761886
3: -0.61469060  0.8114656 -0.4028002   1.0009671  0.01650606
4: -0.80805337  3.6581513 -0.2855663 -46.1564513  0.59398122
5: -0.68398344  3.4586094 -0.5004021   0.4715563  0.40474155
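
Because := modifies dt in place, the original values are overwritten. If a separate result is needed, as with sites and sites.new in the question, one option is to work on a deep copy made with data.table's copy(). The following is a minimal sketch under that assumption; the seed and the name dt.new are arbitrary choices for illustration.

```r
library(data.table)

set.seed(1)  # arbitrary seed, only to make the sketch reproducible
dt <- data.table(matrix(rnorm(25), 5, 5))
setnames(dt, c("X/Y_1", "Y/Z_1", "X/Y_2", "Y/Z_2", "X/Y_3"))

sd.cols <- grep("Y/Z", names(dt), value = TRUE)

# copy() makes a deep copy, so := on dt.new leaves dt untouched
dt.new <- copy(dt)
dt.new[, (sd.cols) := lapply(.SD, function(x) 1 / x), .SDcols = sd.cols]

# Check: matched columns are inverted, all other columns are unchanged
flipped_ok   <- isTRUE(all.equal(dt.new[["Y/Z_1"]], 1 / dt[["Y/Z_1"]]))
untouched_ok <- identical(dt.new[["X/Y_1"]], dt[["X/Y_1"]])
```

Both checks should come out TRUE, and dt.new keeps all five columns in their original order.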
