R add column values to dataframe based on rownames values
I need to add a column to a dataframe that reflects information encoded in the row names. A minimal example illustrating the problem follows.
Example:
df <- data.frame(c(1,2,3,4),c(0,1,0,1))
colnames(df) <- c("Value","Ident")
rownames(df) <- c("fish1_101","fish1_102","fish2_103","fish2_104")
df
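The row names encode information about each sample. In this example, the "fish1" prefix denotes salmon and the "fish2" prefix denotes sunfish.
I need to add a new column, "fish_species", that gives the correct species.
Attempt: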
key_df <- data.frame(c("fish1","fish2","fish3"),c("salmon","sunfish","halibut"))
colnames(key_df) <- c("key","species")
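# Copy the row names into a third column so that apply() can see them
# (x[3] below would otherwise be NA; the column name "name" is arbitrary).
df$name <- rownames(df)
df["species"] <- apply(df, 1, function(x){
  NameFound <- x[3]
  # The prefix before the underscore identifies the species
  NameFound_split <- unlist(strsplit(NameFound, "_"))
  if (NameFound_split[1] == "fish1"){
    out <- "salmon"
  } else if (NameFound_split[1] == "fish2") {
    out <- "sunfish"
  } else if (NameFound_split[1] == "fish3") {
    out <- "halibut"
  }
  return(out)
})
df <- df[,c(1,2,4)]  # drop the helper column
df # This is the desired result.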
I am looking for a cleaner, higher-throughput way to do this, one that does not need an if statement for every identity.
Joining against key_df avoids the if statements:
library(tidyverse)
df %>%
  rownames_to_column('key') %>%
  mutate(key = str_remove(key, '_.*')) %>%
  left_join(key_df, by = 'key')
key Value Ident species
1 fish1 1 0 salmon
2 fish1 2 1 salmon
3 fish2 3 0 sunfish
4 fish2 4 1 sunfish
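Note that the join above replaces the original row names with the shortened key. If the full sample names should survive, one option is to keep them in a separate column and restore them afterwards. The following is a minimal sketch under that assumption; `sample_id` and `result` are names introduced here for illustration, not part of the original answer.

```r
library(dplyr)
library(tibble)
library(stringr)

# Same example data as in the question
df <- data.frame(Value = c(1, 2, 3, 4), Ident = c(0, 1, 0, 1),
                 row.names = c("fish1_101", "fish1_102", "fish2_103", "fish2_104"))
key_df <- data.frame(key = c("fish1", "fish2", "fish3"),
                     species = c("salmon", "sunfish", "halibut"),
                     stringsAsFactors = FALSE)

result <- df %>%
  rownames_to_column("sample_id") %>%             # keep the full row name
  mutate(key = str_remove(sample_id, "_.*")) %>%  # prefix before the first underscore
  left_join(key_df, by = "key") %>%               # look up the species
  select(-key) %>%                                # drop the helper column
  column_to_rownames("sample_id")                 # restore the original row names

result
```

Because this is a left join, a prefix that is missing from key_df should leave NA in species rather than drop the row.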
One possible solution:
library(tidyverse)
df <- data.frame(c(1,2,3,4),c(0,1,0,1))
colnames(df) <- c("Value","Ident")
rownames(df) <- c("fish1_101","fish1_102","fish2_103","fish2_104")
df %>%
rownames_to_column("fish_species") %>%
mutate(fish_species = if_else(str_detect(fish_species,"fish1"), "salmon", "sunfish"))
#> fish_species Value Ident
#> 1 salmon 1 0
#> 2 salmon 2 1
#> 3 sunfish 3 0
#> 4 sunfish 4 1
Using match
Try the following base R option:
df$species <- with(
key_df,
species[match(gsub("_.*", "", rownames(df)), key)]
)
You will get
> df
Value Ident species
fish1_101 1 0 salmon
fish1_102 2 1 salmon
fish2_103 3 0 sunfish
fish2_104 4 1 sunfish
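The same kind of lookup can also be written with a named character vector instead of a key data frame. This is a minimal sketch of that variant, not part of the original answers; `lookup` and `prefix` are names introduced here for illustration.

```r
# Same example data as in the question
df <- data.frame(Value = c(1, 2, 3, 4), Ident = c(0, 1, 0, 1),
                 row.names = c("fish1_101", "fish1_102", "fish2_103", "fish2_104"))

# Names are the row-name prefixes, values are the species
lookup <- c(fish1 = "salmon", fish2 = "sunfish", fish3 = "halibut")

# Strip everything from the first underscore onward to get the prefix
prefix <- sub("_.*", "", rownames(df))

# Index the lookup vector by name; unname() drops the prefix labels
df$species <- unname(lookup[prefix])

df
```

A prefix that is not present in `lookup` yields NA for that row, so a new species only requires one more entry in the vector rather than another branch.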