How to remove a substring from specific columns of a data.frame in R?
I have a list, results, of several data.frames (each data.frame has 3 columns). It looks like this:
> tail(results[[1]])
var1 var2 corr
4945 UniRef90_A0A075GGL3 UniRef90_A0A075GGW4 -0.12058932
4946 UniRef90_A0A075GGU1 UniRef90_A0A075GGW4 -0.01740142
4947 UniRef90_A0A075GGU4 UniRef90_A0A075GGW4 0.16400148
4948 UniRef90_A0A075GGV0 UniRef90_A0A075GGW4 -0.09698018
4949 UniRef90_A0A075GGV1 UniRef90_A0A075GGW4 0.22409572
4950 UniRef90_A0A075GGV8 UniRef90_A0A075GGW4 0.43184873
> tail(results[[2]])
var1 var2 corr
4945 UniRef90_A0A075GJW0 UniRef90_A0A075GKB8 -0.1059095
4946 UniRef90_A0A075GJW5 UniRef90_A0A075GKB8 -0.4336370
4947 UniRef90_A0A075GJX5 UniRef90_A0A075GKB8 -0.1875841
4948 UniRef90_A0A075GJY4 UniRef90_A0A075GKB8 0.2658149
4949 UniRef90_A0A075GJY8 UniRef90_A0A075GKB8 -0.2820792
4950 UniRef90_A0A075GJY9 UniRef90_A0A075GKB8 -0.2402827
I will bind these data.frames into a single one, but that will give a huge data.frame. That's why I would like to remove the string UniRef90_ from the columns var1 and var2 before binding, in order to reduce the size.
Any help?
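You can try this on var1 and var2 before binding the data frames (assign the result back to the column so the change is kept):
dataframe$yourvariable <- sub("^UniRef90_", "", dataframe$yourvariable)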
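To apply the same sub() call to every data.frame in the list before binding, something like the following base R sketch may work. The two-row data.frames are a shortened stand-in for the data in the question, and the names cleaned and combined are only illustrative.

```r
# Shortened stand-in for the list in the question (two rows per data.frame).
results <- list(
  data.frame(var1 = c("UniRef90_A0A075GGL3", "UniRef90_A0A075GGU1"),
             var2 = c("UniRef90_A0A075GGW4", "UniRef90_A0A075GGW4"),
             corr = c(-0.12058932, -0.01740142),
             stringsAsFactors = FALSE),
  data.frame(var1 = c("UniRef90_A0A075GJW0", "UniRef90_A0A075GJW5"),
             var2 = c("UniRef90_A0A075GKB8", "UniRef90_A0A075GKB8"),
             corr = c(-0.1059095, -0.4336370),
             stringsAsFactors = FALSE)
)

# Strip the leading prefix from var1 and var2 in every data.frame.
cleaned <- lapply(results, function(df) {
  df[c("var1", "var2")] <- lapply(df[c("var1", "var2")],
                                  function(x) sub("^UniRef90_", "", x))
  df
})

# Bind the cleaned data.frames into one.
combined <- do.call(rbind, cleaned)
```

Anchoring the pattern with ^ means only a prefix at the start of the value is removed, which seems to match the intent of the question.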
We can loop through the list and remove the substring with either substring or str_remove:
library(tidyverse)
map_df(results, ~ .x %>%
    mutate_at(vars(matches('^var\\d+$')),
              list(~ str_remove(., "^UniRef90_"))))
# var1 var2 corr
#1 A0A075GGL3 A0A075GGW4 -0.12058932
#2 A0A075GGU1 A0A075GGW4 -0.01740142
#3 A0A075GGU4 A0A075GGW4 0.16400148
#4 A0A075GGV0 A0A075GGW4 -0.09698018
#5 A0A075GGV1 A0A075GGW4 0.22409572
#6 A0A075GGV8 A0A075GGW4 0.43184873
#7 A0A075GJW0 A0A075GKB8 -0.10590950
#8 A0A075GJW5 A0A075GKB8 -0.43363700
#9 A0A075GJX5 A0A075GKB8 -0.18758410
#10 A0A075GJY4 A0A075GKB8 0.26581490
#11 A0A075GJY8 A0A075GKB8 -0.28207920
#12 A0A075GJY9 A0A075GKB8 -0.24028270
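The substring alternative mentioned above is not shown in the answer. A minimal base R sketch of it, assuming the prefix is always the literal UniRef90_ at the start of the value, could look like this (ids, prefix and stripped are illustrative names):

```r
prefix <- "UniRef90_"
ids <- c("UniRef90_A0A075GGL3", "UniRef90_A0A075GGW4", "A0A075GKB8")

# Drop the first nchar(prefix) characters, but only where the prefix is
# actually present; values without it are left unchanged.
stripped <- ifelse(startsWith(ids, prefix),
                   substring(ids, nchar(prefix) + 1),
                   ids)
```

Because this works by position rather than by pattern, the startsWith() guard is what keeps values without the prefix from being truncated.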
Data:
results <- list(structure(list(var1 = c("UniRef90_A0A075GGL3",
"UniRef90_A0A075GGU1",
"UniRef90_A0A075GGU4", "UniRef90_A0A075GGV0", "UniRef90_A0A075GGV1",
"UniRef90_A0A075GGV8"), var2 = c("UniRef90_A0A075GGW4",
"UniRef90_A0A075GGW4",
"UniRef90_A0A075GGW4", "UniRef90_A0A075GGW4", "UniRef90_A0A075GGW4",
"UniRef90_A0A075GGW4"), corr = c(-0.12058932, -0.01740142, 0.16400148,
-0.09698018, 0.22409572, 0.43184873)), class = "data.frame", row.names = c("4945",
"4946", "4947", "4948", "4949", "4950")),
structure(list(var1 = c("UniRef90_A0A075GJW0",
"UniRef90_A0A075GJW5", "UniRef90_A0A075GJX5", "UniRef90_A0A075GJY4",
"UniRef90_A0A075GJY8", "UniRef90_A0A075GJY9"), var2 = c("UniRef90_A0A075GKB8",
"UniRef90_A0A075GKB8", "UniRef90_A0A075GKB8", "UniRef90_A0A075GKB8",
"UniRef90_A0A075GKB8", "UniRef90_A0A075GKB8"), corr = c(-0.1059095,
-0.433637, -0.1875841, 0.2658149, -0.2820792, -0.2402827)),
class = "data.frame", row.names = c("4945",
"4946", "4947", "4948", "4949", "4950")))