I have a data frame in R with 341 rows. I want to rename the rows using a list of 349 names. Each of the 341 row names is guaranteed to have a counterpart in this list, but not all of them will be exact matches. The data looks like this:
rownames(df_RPM1)
[1] "LQNS02059392.1_11686_5p"
[2] "LQNS02277998.1_30984_3p"
[3] "LQNS02277998.1_30984_5p"
[4] "LQNS02277998.1_30988_3p"
[5] "LQNS02277998.1_30988_5p"
[6] "LQNS02277997.1_30943_3p"
[7] "miR-9|LQNS02278070.1_31740_3p"
[8] "miR-9|LQNS02278094.1_36129_3p"
head(inlist)
[1] "dpu-miR-2-03_LQNS02059392.1_11686_5p" "dpu-miR-10-P2_LQNS02277998.1_30984_3p"
[3] "dpu-miR-10-P2_LQNS02277998.1_30984_5p" "dpu-miR-10-P3_LQNS02277998.1_30988_3p"
[5] "dpu-miR-10-P3_LQNS02277998.1_30988_5p" "miR-9|LQNS02278070.1_31740_3p"
[7] "miR-9|LQNS02278094.1_36129_3p"
The order won't necessarily be the same in the two vectors.
Can anyone suggest how to do this in R? Thanks a lot.
It depends a lot on what a "non-perfect hit" looks like. Assuming each row name is a substring of the real name, str_which() does the job quite well:
library(tidyverse)
real_names <- c("dpu-miR-2-03_LQNS02059392.1_11686_5p",
"dpu-miR-10-P2_LQNS02277998.1_30984_3p",
"dpu-miR-10-P2_LQNS02277998.1_30984_5p",
"dpu-miR-10-P3_LQNS02277998.1_30988_3p",
"dpu-miR-10-P3_LQNS02277998.1_30988_5p",
"miR-9|LQNS02278070.1_31740_3p",
"miR-9|LQNS02278094.1_36129_3p")
# fixed() treats the pattern as a literal string rather than a regex
str_which(real_names, fixed("LQNS02059392.1_11686_5p"))
#> [1] 1
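One caveat worth sketching here: a plain string pattern is interpreted as a regular expression, so the | in names such as miR-9|LQNS... acts as alternation and the . matches any character. The snippet below uses base R grep() instead of stringr, purely to stay dependency-free, and a shortened copy of real_names so that it runs on its own.

```r
real_names <- c("dpu-miR-2-03_LQNS02059392.1_11686_5p",
                "miR-9|LQNS02278070.1_31740_3p",
                "miR-9|LQNS02278094.1_36129_3p")
query <- "miR-9|LQNS02278070.1_31740_3p"

# As a regex, "|" means "or": every name containing "miR-9" matches.
regex_hits <- grep(query, real_names)

# As a fixed string, only the literal name matches.
fixed_hits <- grep(query, real_names, fixed = TRUE)
```

Here regex_hits contains two positions while fixed_hits contains one, which is why literal matching is the safer default for these names.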
So we can vectorize over all row names (I removed element 6, which is not found in the example list):
pos <- map_int(rownames(df_RPM1), ~ str_which(real_names, fixed(.)))
pos
#> [1] 1 2 3 4 5 6 7
And all that's left is to change the row names:
rownames(df_RPM1) <- real_names[pos]
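Note that map_int() stops with an error as soon as a row name matches zero or several list entries, which can happen with real data (row 6 in the question has no counterpart in the excerpt shown). A minimal base R sketch of a more forgiving variant follows. The toy data frame, the helper name find_full_name, and the choice to keep the old name when there is no unique match are all assumptions made for illustration.

```r
# Toy stand-ins for the objects in the question (the rpm values are made up).
df_RPM1 <- data.frame(
  rpm = c(10, 20, 30, 40),
  row.names = c("LQNS02059392.1_11686_5p",
                "LQNS02277998.1_30984_3p",
                "LQNS02277997.1_30943_3p",   # no counterpart in inlist
                "miR-9|LQNS02278070.1_31740_3p")
)
inlist <- c("dpu-miR-2-03_LQNS02059392.1_11686_5p",
            "dpu-miR-10-P2_LQNS02277998.1_30984_3p",
            "miR-9|LQNS02278070.1_31740_3p",
            "miR-9|LQNS02278094.1_36129_3p")

# Hypothetical helper: return the single list entry that contains `name`
# as a literal substring, or `name` itself if there is not exactly one.
find_full_name <- function(name, candidates) {
  hits <- candidates[grepl(name, candidates, fixed = TRUE)]
  if (length(hits) == 1) hits else name
}

old_names <- rownames(df_RPM1)
new_names <- vapply(old_names, find_full_name, character(1),
                    candidates = inlist, USE.NAMES = FALSE)

# Names that stayed the same and are not in the list need a manual look.
unmatched <- old_names[new_names == old_names & !old_names %in% inlist]

rownames(df_RPM1) <- new_names
```

Row names must stay unique, so if two rows resolve to the same list entry the final assignment fails; checking anyDuplicated(new_names) beforehand is a cheap safeguard.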
Of course, if a non-perfect hit means something more complicated, you may need to create a regex from the row names or something like that.
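As one illustration of the regex route, the sketch below assumes that every name ends in an identifier of the form LQNS<digits>.<digits>_<digits>_3p or _5p, and that this identifier is what should be compared. That pattern is inferred from the names shown in the question and is not confirmed by the poster; extract_id is a hypothetical helper.

```r
# Hypothetical helper: pull out the trailing "LQNS..._NNNNN_3p/5p" identifier,
# returning NA where the assumed pattern is absent.
extract_id <- function(x) {
  m <- regexpr("LQNS[0-9]+\\.[0-9]+_[0-9]+_[35]p$", x)
  out <- rep(NA_character_, length(x))
  out[m > 0] <- regmatches(x, m)
  out
}

row_names <- c("LQNS02059392.1_11686_5p",
               "miR-9|LQNS02278070.1_31740_3p")
inlist <- c("miR-9|LQNS02278070.1_31740_3p",
            "dpu-miR-2-03_LQNS02059392.1_11686_5p")

# match() pairs identifiers regardless of order; NA marks a missing partner.
# incomparables = NA keeps a failed extraction from matching another NA.
idx <- match(extract_id(row_names), extract_id(inlist), incomparables = NA)

# Keep the old name wherever no partner was found.
renamed <- ifelse(is.na(idx), row_names, inlist[idx])
```

Comparing extracted identifiers with match() avoids substring collisions, at the cost of depending on the assumed identifier format.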