Adding a factor column based on parts of another column

I have some data that looks like this:

SS <- structure(list(rn = 
c("Exp.618.1.7..ABC.TRE854.HS.2...1.Saline...1...A.", 
"Exp.618.1.7..ABC.TRE854.HS.2...4.Res..Reference...1...A.", "Exp.618.1.7..ABC.TRE854.HS.2...8.ABC.TRE854.HS.2..100nM...1...A.", 
"Exp.618.1.7..ABC.TRE854.HS.2...12.ABC.TRE854.HS.2..1.00uM...1...A.", 
"Exp.618.1.7..ABC.TRE854.HS.2...16.ABC.TRE854.HS.2..10.0uM...1...A.", 
"Exp.618.2.5..ABC.TRE854.HS.2...1.Saline...1...A.", "Exp.618.2.5..ABC.TRE854.HS.2...4.Res..Reference...1...A.", 
"Exp.618.2.5..ABC.TRE854.HS.2...8.ABC.TRE854.HS.2..300nM...1...A.", 
"Exp.618.2.5..ABC.TRE854.HS.2...12.ABC.TRE854.HS.2..3.0uM...1...A.", 
"Exp.618.2.5..ABC.TRE854.HS.2...16.ABC.TRE854.HS.2..30uM...1...A.", 
"Exp.622.1.2..ABC.TRE854.HS.2...1.Saline...1...A.", "Exp.622.1.2..ABC.TRE854.HS.2...4.Res..Reference...1...A.", 
"Exp.622.1.2..ABC.TRE854.HS.2...8.ABC.TRE854.HS.2..100nM...1...A.", 
"Exp.622.1.2..ABC.TRE854.HS.2...12.ABC.TRE854.HS.2..1.00uM...1...A.", 
"Exp.622.1.2..ABC.TRE854.HS.2...16.ABC.TRE854.HS.2..10.0uM...1...A.", 
"Exp.622.2.5..ABC.TRE854.HS.2...1.Saline...1...A.", "Exp.622.2.5..ABC.TRE854.HS.2...4.Res..Reference...1...A.", 
"Exp.622.2.5..ABC.TRE854.HS.2...8.ABC.TRE854.HS.2..300nM...1...A.", 
"Exp.622.2.5..ABC.TRE854.HS.2...12.ABC.TRE854.HS.2..3.0uM...1...A.", 
"Exp.622.2.5..ABC.TRE854.HS.2...16.ABC.TRE854.HS.2..30uM...1...A."
), V1 = c(6.08174172247795, -273.068131175906, -38.0098754654436, 
-44.1874819464636, -126.058280657819, 28.7111941404515, -326.124708404277, 
-61.0348906065704, -63.7440680070101, -62.8961106505329, 18.9484530926351, 
-607.977222113268, -212.18247673418, -179.193611578799, -230.372071747453, 
11.6278896202125, -258.129269330527, -26.634614887808, -29.8940173506221, 
-63.2992704853608), Exp = c("Exp.618.1.", "Exp.618.1.", "Exp.618.1.", 
"Exp.618.1.", "Exp.618.1.", "Exp.618.2.", "Exp.618.2.", "Exp.618.2.", 
"Exp.618.2.", "Exp.618.2.", "Exp.622.1.", "Exp.622.1.", "Exp.622.1.", 
"Exp.622.1.", "Exp.622.1.", "Exp.622.2.", "Exp.622.2.", "Exp.622.2.", 
"Exp.622.2.", "Exp.622.2."), Value_norm = c(-0.0222718839298028, 
1, 0.139195574751849, 0.16181852402981, 0.461636735546466, -0.0880374697180561, 
1, 0.187151997483457, 0.195459179768711, 0.192859078228946, -0.0311663865083172, 
1, 0.348997411443565, 0.294737376765432, 0.3789156293499, -0.0450467692035472, 
1, 0.103183242089851, 0.115810258279326, 0.245223142069596)), .Names = c("rn", 
"V1", "Exp", "Value_norm"), row.names = c(NA, 20L), class = "data.frame")

Within the rn column, I need to use some of the names to create a factor so that I can plot the data with ggplot2. The names are:

Saline
Reference
100nM
300nM
1uM
3uM
10uM
30uM

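I would like the final data to look like the example above, but with an extra factor column at the end that holds one of these labels.

Sorry that I only have a picture of my data; I wanted it nicely formatted, and I could not manage that in the dialog box here!

Thanks in advance!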

Well, it would be easier if you matched the terms exactly as they appear in the column. If you can do that, you could use

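# Dots are escaped so they match a literal "." rather than any character
rx <- "\\b(Saline|Reference|100nM|300nM|1\\.00uM|3\\.0uM|10\\.0uM|30uM)\\b"
# regexpr() finds the first match in each string; regmatches() extracts it
SS$type <- regmatches(SS$rn, regexpr(rx, SS$rn))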

The distinct values in SS$type should then be

c("1.00uM", "10.0uM", "100nM", "3.0uM", "300nM", "30uM", "Reference", "Saline")
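One caveat: `regmatches()` combined with `regexpr()` returns only the elements that matched, so the assignment to `SS$type` fails with a length mismatch if any row contains none of the labels. A minimal sketch that keeps one result per row is shown here. It assumes a shortened sample of the `rn` values plus one made-up non-matching row, and `extract_label` is a hypothetical helper name, not part of any package.

```r
# Shortened sample of the rn values from the question, plus one
# made-up row ("Exp.999...") that contains none of the labels.
rn <- c("Exp.618.1.7..ABC.TRE854.HS.2...1.Saline...1...A.",
        "Exp.618.1.7..ABC.TRE854.HS.2...4.Res..Reference...1...A.",
        "Exp.618.1.7..ABC.TRE854.HS.2...12.ABC.TRE854.HS.2..1.00uM...1...A.",
        "Exp.999.1.1..ABC.TRE854.HS.2...2.Vehicle...1...A.")

rx <- "\\b(Saline|Reference|100nM|300nM|1\\.00uM|3\\.0uM|10\\.0uM|30uM)\\b"

# Hypothetical helper: one label per input string, NA where nothing matches.
extract_label <- function(x, pattern) {
  m <- regexpr(pattern, x)            # -1 where there is no match
  out <- rep(NA_character_, length(x))
  out[m != -1] <- regmatches(x, m)    # regmatches() returns matched elements only
  out
}

type <- extract_label(rn, rx)
```

Rows left as `NA` can then be inspected with `rn[is.na(type)]` to see which labels the pattern is missing.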

If you want to rename some of the values, you can do

# Map the labels as they appear in rn to the desired names
remap <- c("1.00uM"="1uM", "3.0uM"="3uM", "10.0uM"="10uM")
idx <- SS$type %in% names(remap)
SS$type[idx] <- remap[SS$type[idx]]
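Since the question asks for a factor, the resulting character column can then be converted with `factor()`, giving the levels explicitly so that they follow dose order rather than alphabetical order. This is a minimal sketch, assuming the eight labels listed in the question are the complete set. The `type` vector is a shortened stand-in for `SS$type`, and `dose_levels` and `type_f` are illustrative names.

```r
# Stand-in for SS$type after extraction (values as they appear in rn).
type <- c("Saline", "Reference", "100nM", "1.00uM", "10.0uM",
          "300nM", "3.0uM", "30uM")

# Rename the labels whose spelling differs from the desired one.
remap <- c("1.00uM" = "1uM", "3.0uM" = "3uM", "10.0uM" = "10uM")
idx <- type %in% names(remap)
type[idx] <- remap[type[idx]]

# Desired level order, taken from the list in the question.
dose_levels <- c("Saline", "Reference", "100nM", "300nM",
                 "1uM", "3uM", "10uM", "30uM")

# Values not listed in dose_levels would become NA, which makes typos visible.
type_f <- factor(type, levels = dose_levels)
```

When a factor like this is mapped to a discrete axis or to `fill` in ggplot2, the categories are drawn in level order, so the doses should appear from Saline through 30uM.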
