
How to match a character that appears in a column of one df with the row values of another df?
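I am trying to use a second data frame (df2) to look up the numeric value that corresponds to each character in a first data frame (df1). I then want to take those values from df2 and add them to the empty columns of df1 named w0, w1, w2, and wf.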
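My question is: is there a simpler way to get from the inputs (df1 and df2) to desiredoutput? Below is a minimal reproducible example, including the code block I am currently trying and the error message it produces.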
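Input data (the <NA> and * values are intentional: my larger data frame contains multiple <NA> and * values, so I need to make sure the approach handles them):
df1 <- structure(list(startAA = c("A", "F", "G"), intermediateAA1 = c("T",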
"Q", NA), intermediateAA2 = c("S", "*", NA), finalAA = c("S",
"S", "S"), w0 = c(NA, NA, NA), w1 = c(NA, NA, NA), w2 = c(NA,
NA, NA), wf = c(NA, NA, NA)), class = "data.frame", row.names = c(NA,
-3L))
df2 <- structure(list(POSITION = "425", WT = "N", SITE_ENTROPY = "3.57",
PI_A = "0.156", PI_C = "0.017", PI_D = "0.02", PI_E = "0.009",
PI_F = "0.017", PI_G = "0.084", PI_H = "0.063", PI_I = "0.036",
PI_K = "0.167", PI_L = "0.088", PI_M = "0.157", PI_N = "0.005",
PI_P = "0.001", PI_Q = "0.072", PI_R = "0.043", PI_S = "0.019",
PI_T = "0.016", PI_V = "0.001", PI_W = "0.016", PI_Y = "0.005",
`PI_*` = "0.001"), class = "data.frame", row.names = c(NA,
-1L))
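My current attempt, which uses the tidyverse, foreach, and doParallel packages:
library(tidyverse)
library(foreach)
library(doParallel)
registerDoParallel()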
col <- colnames(df2) %>% str_subset("PI") %>% str_sub(-1)
foreach (i = 1:nrow(df1)) %dopar% {
foreach(j = 1:(ncol(df1) - 4)) %do% {
df1[i, j + 4] <- df2[str_which(col,as.character(df1[i,j])) + 3]
}
}
When I run this code block, I get the following error message:
Error in {: task 2 failed - "task 3 failed - "Syntax error in regex pattern. (U_REGEX_RULE_SYNTAX, context=`*`)""
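The error message points at the * value: a lone * is a regex quantifier with nothing to repeat, so a regex-based lookup rejects it as a pattern. A minimal sketch of a literal (non-regex) lookup in base R follows; the vector codes and the variables pos_star, pos_na, and pos_fixed are hypothetical names used only for illustration, and codes is a shortened stand-in for the col vector above.

```r
# Shortened stand-in for the amino-acid codes taken from df2's PI_ column names.
codes <- c("A", "C", "D", "S", "T", "*")

# match() compares strings literally, so "*" is treated as an ordinary character.
pos_star <- match("*", codes)

# A missing input simply yields NA instead of raising an error.
pos_na <- match(NA_character_, codes)

# If a pattern function is preferred, fixed = TRUE disables regex interpretation.
pos_fixed <- which(grepl("*", codes, fixed = TRUE))

print(pos_star)
print(pos_na)
print(pos_fixed)
```

With stringr, wrapping the pattern in fixed() should have the same effect of disabling regex interpretation, though NA inputs would still need separate handling.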
Desired output:
desiredoutput <- structure(list(startAA = c("A", "F", "G"), intermediateAA1 = c("T",
"Q", NA), intermediateAA2 = c("S", "*", NA), finalAA = c("S",
"S", "S"), w0 = c(0.156, 0.017, 0.084), w1 = c(0.016, 0.072,
NA), w2 = c(0.019, 0.001, NA), wf = c(0.019, 0.019, 0.019)), class = "data.frame", row.names = c(NA,
-3L))
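One possible simpler route to this output, sketched in base R without loops over rows or parallel workers: turn the PI_ columns of df2 into a named numeric vector, then index it once per column of df1. This is only a sketch under stated assumptions: df2 has a single row, the four amino-acid columns pair with w0, w1, w2, and wf in order, and the names pi_cols, lookup, aa_cols, and w_cols are hypothetical helpers that do not appear in the code above.

```r
# Input data rebuilt from the example above (same values, more compact form).
df1 <- data.frame(
  startAA         = c("A", "F", "G"),
  intermediateAA1 = c("T", "Q", NA),
  intermediateAA2 = c("S", "*", NA),
  finalAA         = c("S", "S", "S"),
  w0 = NA, w1 = NA, w2 = NA, wf = NA,
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  POSITION = "425", WT = "N", SITE_ENTROPY = "3.57",
  PI_A = "0.156", PI_C = "0.017", PI_D = "0.02", PI_E = "0.009",
  PI_F = "0.017", PI_G = "0.084", PI_H = "0.063", PI_I = "0.036",
  PI_K = "0.167", PI_L = "0.088", PI_M = "0.157", PI_N = "0.005",
  PI_P = "0.001", PI_Q = "0.072", PI_R = "0.043", PI_S = "0.019",
  PI_T = "0.016", PI_V = "0.001", PI_W = "0.016", PI_Y = "0.005",
  `PI_*` = "0.001",
  check.names = FALSE, stringsAsFactors = FALSE
)

# Named numeric lookup vector: names are the codes ("A", ..., "*"),
# values are the PI_ entries of df2's single row converted from character.
pi_cols <- grep("^PI_", names(df2), value = TRUE)
lookup  <- setNames(as.numeric(unlist(df2[1, pi_cols])), sub("^PI_", "", pi_cols))

# Assumed pairing of source columns with target columns, in order.
aa_cols <- c("startAA", "intermediateAA1", "intermediateAA2", "finalAA")
w_cols  <- c("w0", "w1", "w2", "wf")

# match() does literal comparison, so "*" is safe and NA inputs give NA outputs.
for (k in seq_along(aa_cols)) {
  idx <- match(df1[[aa_cols[k]]], names(lookup))
  df1[[w_cols[k]]] <- unname(lookup[idx])
}

print(df1)
```

Because the lookup is vectorised per column, the number of rows in df1 should not change the code, only the four column pairs need to be listed.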
A summary of my R environment from sessionInfo():
R version 4.1.3 (2022-03-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats graphics grDevices utils
[6] datasets methods base
other attached packages:
[1] micropan_2.1 igraph_1.3.2 microseq_2.1.5
[4] rlang_0.4.11 data.table_1.14.0 doParallel_1.0.17
[7] iterators_1.0.14 foreach_1.5.2 combinat_0.0-8
[10] forcats_0.5.1 stringr_1.4.0 dplyr_1.0.7
[13] purrr_0.3.4 readr_2.0.0 tidyr_1.1.3
[16] tibble_3.1.3 ggplot2_3.3.5 tidyverse_1.3.1
loaded via a namespace (and not attached):
[1] tidyselect_1.1.1 haven_2.4.1 colorspace_2.0-2
[4] vctrs_0.3.8 generics_0.1.0 utf8_1.2.2
[7] pillar_1.6.2 glue_1.4.2 withr_2.4.2
[10] DBI_1.1.1 bit64_4.0.5 dbplyr_2.1.1
[13] modelr_0.1.8 readxl_1.3.1 lifecycle_1.0.0
[16] munsell_0.5.0 gtable_0.3.0 cellranger_1.1.0
[19] rvest_1.0.1 codetools_0.2-18 tzdb_0.1.2
[22] fansi_0.5.0 broom_0.7.9 Rcpp_1.0.7
[25] scales_1.1.1 backports_1.2.1 vroom_1.5.3
[28] jsonlite_1.7.2 bit_4.0.4 fs_1.5.0
[31] hms_1.1.0 stringi_1.7.3 grid_4.1.3
[34] cli_3.0.1 tools_4.1.3 magrittr_2.0.1
[37] crayon_1.4.1 pkgconfig_2.0.3 ellipsis_0.3.2
[40] xml2_1.3.2 reprex_2.0.0 lubridate_1.7.10
[43] assertthat_0.2.1 httr_1.4.2 rstudioapi_0.13
[46] R6_2.5.0 compiler_4.1.3
Thank you for your help.