How to match a character that appears in a column of one df with the row values of another df?

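I am trying to look up, in a second data frame (df2), the numeric value that corresponds to each character in a first data frame (df1). I then want to take those values from df2 and put them into the empty columns of df1 named w0, w1, w2, and wf.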

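My questions are:

  • Can anyone see a simpler solution to this problem than the approach I attempt below (i.e., a simpler way to get from my two inputs, df1 and df2, to desiredoutput)?
  • If not, can someone explain what this error means for my current code? I have looked up the error message online but am still confused.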

Here is a minimal reproducible example, including the code block I am currently trying and the error message it produces:

  • Input (note: the <NA> and * values are intentional; my larger data frame contains many <NA> and * values, so the solution needs to handle them):
df1 <- structure(list(startAA = c("A", "F", "G"), intermediateAA1 = c("T", 
"Q", NA), intermediateAA2 = c("S", "*", NA), finalAA = c("S", 
"S", "S"), w0 = c(NA, NA, NA), w1 = c(NA, NA, NA), w2 = c(NA, 
NA, NA), wf = c(NA, NA, NA)), class = "data.frame", row.names = c(NA, 
-3L))

df2 <- structure(list(POSITION = "425", WT = "N", SITE_ENTROPY = "3.57", 
    PI_A = "0.156", PI_C = "0.017", PI_D = "0.02", PI_E = "0.009", 
    PI_F = "0.017", PI_G = "0.084", PI_H = "0.063", PI_I = "0.036", 
    PI_K = "0.167", PI_L = "0.088", PI_M = "0.157", PI_N = "0.005", 
    PI_P = "0.001", PI_Q = "0.072", PI_R = "0.043", PI_S = "0.019", 
    PI_T = "0.016", PI_V = "0.001", PI_W = "0.016", PI_Y = "0.005", 
    `PI_*` = "0.001"), class = "data.frame", row.names = c(NA, 
-1L))
  • My current code for this problem, which uses the following packages: tidyverse, foreach, doParallel
registerDoParallel()
col <- colnames(df2) %>% str_subset("PI") %>% str_sub(-1)
foreach (i = 1:nrow(df1)) %dopar% {
  foreach(j = 1:(ncol(df1) - 4)) %do% {
    df1[i, j + 4] <- df2[str_which(col,as.character(df1[i,j])) + 3]
  }
}

When I run this code block, I get the following error message: Error in {: task 2 failed - "task 3 failed - "Syntax error in regex pattern. (U_REGEX_RULE_SYNTAX, context= * )""
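For readers puzzling over the same message, here is a minimal sketch of where it most likely comes from. str_which() interprets its pattern argument as a regular expression, and a bare "*" is a quantifier with nothing in front of it to repeat. The short `col` vector below is a hypothetical stand-in for the one built in the question, and fixed() and match() are shown as two possible ways to compare the characters literally, not as the only fix.

```r
library(stringr)

# Hypothetical stand-in for the question's `col` (last characters of the PI_ names)
col <- c("A", "Q", "S", "*")

# "*" alone is not a valid regex, so str_which() raises an error here
res <- tryCatch(str_which(col, "*"), error = function(e) "regex error")

# fixed() tells stringr to treat the pattern as a literal string
idx_fixed <- str_which(col, fixed("*"))

# match() compares whole strings without any regex and yields NA for NA input
idx_match <- match(c("*", NA, "Q"), col)
```

With match(), the NA entries in df1 simply map to NA positions instead of needing separate handling.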

  • Desired output:
desiredoutput <- structure(list(startAA = c("A", "F", "G"), intermediateAA1 = c("T", 
"Q", NA), intermediateAA2 = c("S", "*", NA), finalAA = c("S", 
"S", "S"), w0 = c(0.156, 0.017, 0.084), w1 = c(0.016, 0.072, 
NA), w2 = c(0.019, 0.001, NA), wf = c(0.019, 0.019, 0.019)), class = "data.frame", row.names = c(NA, 
-3L))
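One possible simpler route to this output, sketched in base R under stated assumptions: df2 has a single row, every lookup column is named "PI_" followed by one character, and the four amino-acid columns map to w0, w1, w2, wf in that order. The data frames below are cut-down copies of the question's inputs, and the helper names (`pi_cols`, `lookup`, `aa_cols`, `w_cols`) are illustrative only.

```r
# Cut-down copies of the question's inputs (only the columns needed here)
df1 <- data.frame(
  startAA = c("A", "F", "G"),
  intermediateAA1 = c("T", "Q", NA),
  intermediateAA2 = c("S", "*", NA),
  finalAA = c("S", "S", "S"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  PI_A = "0.156", PI_F = "0.017", PI_G = "0.084", PI_Q = "0.072",
  PI_S = "0.019", PI_T = "0.016", `PI_*` = "0.001",
  check.names = FALSE, stringsAsFactors = FALSE
)

# Build a named numeric vector: names are the characters after "PI_"
pi_cols <- grep("^PI_", names(df2), value = TRUE)
lookup <- setNames(as.numeric(unlist(df2[1, pi_cols])), sub("^PI_", "", pi_cols))

# Assumed pairing of source columns to target columns
aa_cols <- c("startAA", "intermediateAA1", "intermediateAA2", "finalAA")
w_cols <- c("w0", "w1", "w2", "wf")

# match() does exact, regex-free comparison, so "*" and NA are handled as-is
for (k in seq_along(aa_cols)) {
  pos <- match(df1[[aa_cols[k]]], names(lookup))
  df1[[w_cols[k]]] <- unname(lookup[pos])
}
```

Because the lookup is vectorized over whole columns, this sketch needs no nested loop over rows and no parallel backend for data of this shape.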
  • As a final piece of information, here is a summary of my R environment from sessionInfo():
R version 4.1.3 (2022-03-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils    
[6] datasets  methods   base     

other attached packages:
 [1] micropan_2.1      igraph_1.3.2      microseq_2.1.5   
 [4] rlang_0.4.11      data.table_1.14.0 doParallel_1.0.17
 [7] iterators_1.0.14  foreach_1.5.2     combinat_0.0-8   
[10] forcats_0.5.1     stringr_1.4.0     dplyr_1.0.7      
[13] purrr_0.3.4       readr_2.0.0       tidyr_1.1.3      
[16] tibble_3.1.3      ggplot2_3.3.5     tidyverse_1.3.1  

loaded via a namespace (and not attached):
 [1] tidyselect_1.1.1 haven_2.4.1      colorspace_2.0-2
 [4] vctrs_0.3.8      generics_0.1.0   utf8_1.2.2      
 [7] pillar_1.6.2     glue_1.4.2       withr_2.4.2     
[10] DBI_1.1.1        bit64_4.0.5      dbplyr_2.1.1    
[13] modelr_0.1.8     readxl_1.3.1     lifecycle_1.0.0 
[16] munsell_0.5.0    gtable_0.3.0     cellranger_1.1.0
[19] rvest_1.0.1      codetools_0.2-18 tzdb_0.1.2      
[22] fansi_0.5.0      broom_0.7.9      Rcpp_1.0.7      
[25] scales_1.1.1     backports_1.2.1  vroom_1.5.3     
[28] jsonlite_1.7.2   bit_4.0.4        fs_1.5.0        
[31] hms_1.1.0        stringi_1.7.3    grid_4.1.3      
[34] cli_3.0.1        tools_4.1.3      magrittr_2.0.1  
[37] crayon_1.4.1     pkgconfig_2.0.3  ellipsis_0.3.2  
[40] xml2_1.3.2       reprex_2.0.0     lubridate_1.7.10
[43] assertthat_0.2.1 httr_1.4.2       rstudioapi_0.13 
[46] R6_2.5.0         compiler_4.1.3  

Thank you for your help.
