Rename row names in a data frame using matches or partial matches from a list

I have a data frame in R with 341 rows. I want to rename the rows using a list of 349 names. All 341 row names are certain to appear in this list, but not all of them will be exact matches. The data looks like this:

rownames(df_RPM1)
[1] "LQNS02059392.1_11686_5p"
[2] "LQNS02277998.1_30984_3p"
[3] "LQNS02277998.1_30984_5p"
[4] "LQNS02277998.1_30988_3p"
[5] "LQNS02277998.1_30988_5p"
[6] "LQNS02277997.1_30943_3p"
[7] "miR-9|LQNS02278070.1_31740_3p"
[8] "miR-9|LQNS02278094.1_36129_3p" 

head(inlist)
[1] "dpu-miR-2-03_LQNS02059392.1_11686_5p"  "dpu-miR-10-P2_LQNS02277998.1_30984_3p"
[3] "dpu-miR-10-P2_LQNS02277998.1_30984_5p" "dpu-miR-10-P3_LQNS02277998.1_30988_3p"
[5] "dpu-miR-10-P3_LQNS02277998.1_30988_5p" "miR-9|LQNS02278070.1_31740_3p" 
[6] "miR-9|LQNS02278094.1_36129_3p" 

The order is not necessarily the same in the two.

Can anyone suggest how to do this in R? Thanks a lot.

It depends a lot on what a "non-perfect hit" looks like. Assuming the row name is a substring of the real name, stringr's str_which() does the job quite well:

library(tidyverse)  # loads stringr (str_which, fixed) and purrr (map_int)
real_names <- c("dpu-miR-2-03_LQNS02059392.1_11686_5p",
                  "dpu-miR-10-P2_LQNS02277998.1_30984_3p",
                  "dpu-miR-10-P2_LQNS02277998.1_30984_5p",
                  "dpu-miR-10-P3_LQNS02277998.1_30988_3p",
                  "dpu-miR-10-P3_LQNS02277998.1_30988_5p",
                  "miR-9|LQNS02278070.1_31740_3p",
                  "miR-9|LQNS02278094.1_36129_3p")

# fixed() matches the pattern literally rather than as a regular expression
str_which(real_names, fixed("LQNS02059392.1_11686_5p"))
#> [1] 1
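One detail worth checking before scaling this up: pattern arguments are treated as regular expressions by default, and these IDs contain `.` and `|`, which are regex metacharacters. The sketch below uses base R's grep() on a few names copied from the post to show the difference; `query`, `regex_hits` and `literal_hits` are illustrative names, not part of the original answer.

```r
# Names copied from the example list in the post.
real_names <- c("dpu-miR-2-03_LQNS02059392.1_11686_5p",
                "miR-9|LQNS02278070.1_31740_3p",
                "miR-9|LQNS02278094.1_36129_3p")

query <- "miR-9|LQNS02278070.1_31740_3p"

# Default: the query is a regular expression, so "|" means "or" and the
# left-hand alternative "miR-9" alone is enough for a hit.
regex_hits <- grep(query, real_names)

# fixed = TRUE: the query is compared as a literal string.
literal_hits <- grep(query, real_names, fixed = TRUE)

print(regex_hits)
print(literal_hits)
```

Here the regex version hits both miR-9 entries, while the literal version hits only the intended one. In stringr, wrapping the pattern in fixed() plays the same role as fixed = TRUE.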

So we can vectorize over all row names (I removed element 6 of the row names, which is not found in the example list):

pos <- map_int(rownames(df_RPM1), ~ str_which(real_names, fixed(.)))
pos
#> [1] 1 2 3 4 5 6 7

And all that's left is to change the row names:

rownames(df_RPM1) <- real_names[pos]
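Note that map_int() stops with an error if some row name has no hit or more than one hit, because each call must return exactly one integer. If that can happen in the full data, a more defensive variant might look like the following base R sketch. The data frame `df`, the helper `match_one` and the `value` column are made up for illustration; the names are copied from the post.

```r
# Toy stand-ins for the objects in the question.
old_names <- c("LQNS02059392.1_11686_5p",
               "LQNS02277998.1_30984_3p",
               "LQNS02277997.1_30943_3p",   # no counterpart in real_names
               "miR-9|LQNS02278070.1_31740_3p")
real_names <- c("dpu-miR-2-03_LQNS02059392.1_11686_5p",
                "dpu-miR-10-P2_LQNS02277998.1_30984_3p",
                "miR-9|LQNS02278070.1_31740_3p")
df <- data.frame(value = c(10, 20, 30, 40), row.names = old_names)

# Index of the single candidate that contains x as a literal substring,
# or NA when there are zero or several hits.
match_one <- function(x, candidates) {
  hits <- grep(x, candidates, fixed = TRUE)
  if (length(hits) == 1L) hits else NA_integer_
}

pos <- vapply(rownames(df), match_one, integer(1),
              candidates = real_names, USE.NAMES = FALSE)

# Rename only where a unique match exists; keep the old name otherwise.
rownames(df) <- ifelse(is.na(pos), rownames(df), real_names[pos])
print(rownames(df))
```

Inspecting `which(is.na(pos))` afterwards shows which rows were left untouched, which is a quick way to find the names that need manual attention.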

Of course, if a non-perfect hit means something more complicated, you may need to build a regex from the row names or something like that.
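As one possible shape for such a regex, the sketch below strips everything before the accession from both vectors and then compares the remainders exactly with match(). It assumes every ID contains an accession starting with "LQNS", which holds for the sample shown but has not been checked against the full list of 349 names. `core_id` is a hypothetical helper.

```r
old_names <- c("LQNS02059392.1_11686_5p",
               "LQNS02277997.1_30943_3p",   # no counterpart in real_names
               "miR-9|LQNS02278070.1_31740_3p")
real_names <- c("dpu-miR-2-03_LQNS02059392.1_11686_5p",
                "miR-9|LQNS02278070.1_31740_3p")

# Drop any prefix in front of the first "LQNS" (assumption: all IDs have one).
core_id <- function(x) sub("^.*?(LQNS)", "\\1", x, perl = TRUE)

# match() is vectorized and returns NA where no exact match exists.
pos <- match(core_id(old_names), core_id(real_names))
print(pos)
```

Because both sides are reduced to the same core before comparison, this also covers the case where the row name carries a prefix that the list entry lacks. match() returns only the first hit, so duplicated cores in the list would need a separate check.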
