How to extract the numeric information from a list within a data frame in R?
I have the following type of entries in the first column of a data frame called dfgModsPepFiltered_subset:
A640-P641 = 456.123x
I am trying to extract the numeric information from this with the following R script:
dfgModsPepFiltered_subset$AA <- regmatches(dfgModsPepFiltered_subset$Peptide,
gregexpr("[[:digit:]]+", dfgModsPepFiltered_subset$Peptide))
This gives me:
c("640", "641", "456", "123")
However, what I really need is a new column for each of "640", "641" and "456.123".
I've tried various combinations of unlist() but can't seem to get the format right.
You could modify the regmatches call so that the character class also matches the decimal point:
as.data.frame(do.call(`rbind`,
    lapply(regmatches(dfgModsPepFiltered_subset$Peptide,
        gregexpr("[[:digit:].]+", dfgModsPepFiltered_subset$Peptide)),
    as.numeric)))
# V1 V2 V3
#1 640 641 456.123
#2 620 625 285.400
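Since the question asks for new columns on the existing data frame, here is a minimal sketch of attaching the result back to it. It assumes every row yields exactly three numbers; the shortened name `df` and the column names `Start`, `End` and `Value` are made up for illustration and are not from the original post.

```r
# Example data, same shape as in the question (df is a shortened, hypothetical name)
df <- data.frame(Peptide = c("A640-P641 = 456.123x", "A620-B625 = 285.400x"),
                 stringsAsFactors = FALSE)

# Pull out every run of digits and dots, giving one character vector per row
m <- regmatches(df$Peptide, gregexpr("[[:digit:].]+", df$Peptide))

# Stack the rows into a numeric matrix (assumes exactly 3 numbers per row)
nums <- do.call(rbind, lapply(m, as.numeric))

# Hypothetical column names, chosen here for illustration only
colnames(nums) <- c("Start", "End", "Value")

# Bind the new numeric columns onto the original data frame
df <- cbind(df, nums)
str(df)
```

If some rows could contain a different number of numeric fields, `rbind` would recycle values silently, so it may be worth checking `lengths(m)` first.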
Or using extract from tidyr:
library(tidyr)
res <- extract(dfgModsPepFiltered_subset, Peptide, c('Col1', 'Col2', 'Col3'),
'[A-Z](\\d+)-[A-Z](\\d+) += +(\\d+\\.\\d+).+', convert=TRUE)
res
# Col1 Col2 Col3
#1 640 641 456.123
#2 620 625 285.400
Or you could use a more general regex that does not depend on the letters around the numbers:
extract(dfgModsPepFiltered_subset, Peptide, c('Col1', 'Col2', 'Col3'),
'[^0-9]+([0-9]+)[^0-9]+([0-9]+)[^0-9]+([0-9.]+)[^0-9]+')
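For readers without tidyr, the same capture-group idea can be sketched in base R with `regexec`. This is an alternative written under assumptions, not how `extract` works internally; `df`, `pattern` and `out` are illustrative names. It also makes the type conversion explicit, since captured groups come back as character strings.

```r
# Example data, same shape as in the question (df is a hypothetical name)
df <- data.frame(Peptide = c("A640-P641 = 456.123x", "A620-B625 = 285.400x"),
                 stringsAsFactors = FALSE)

# Same pattern as above: three capture groups separated by non-digit runs
pattern <- "[^0-9]+([0-9]+)[^0-9]+([0-9]+)[^0-9]+([0-9.]+)[^0-9]+"

# regexec returns the full match followed by each capture group
m <- regmatches(df$Peptide, regexec(pattern, df$Peptide))

# Drop the full match (first element) and stack the groups row by row
groups <- do.call(rbind, lapply(m, function(x) x[-1]))

out <- as.data.frame(groups, stringsAsFactors = FALSE)
names(out) <- c("Col1", "Col2", "Col3")

# The captured groups are character; convert each column to numeric
out[] <- lapply(out, as.numeric)
str(out)
```

A row that does not match the pattern would produce an empty element in `m`, so this sketch assumes every entry follows the format shown in the question.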
Or using cSplit from splitstackshape:
library(splitstackshape)
res1 <- cSplit(dfgModsPepFiltered_subset, 'Peptide', '[^0-9.]', fixed=FALSE)
res1[,names(res1)[!colSums(is.na(res1))], with=FALSE]
# Peptide_2 Peptide_4 Peptide_7
#1: 640 641 456.123
#2: 620 625 285.400
Or using strsplit:
as.data.frame(t(sapply(strsplit(dfgModsPepFiltered_subset$Peptide,
'[^0-9.]'), function(x) na.omit(as.numeric(x)))))
# V1 V2 V3
#1 640 641 456.123
#2 620 625 285.400
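To see why the NA-dropping step is needed in the strsplit approach, here is a small sketch on a single entry. Splitting on every character that is not a digit or a dot leaves empty strings between adjacent separators, and `as.numeric("")` is `NA`. The names `x`, `parts`, `nums` and `clean` are illustrative, and `!is.na()` is used here in place of `na.omit` simply to return a plain vector.

```r
# One entry from the question
x <- "A640-P641 = 456.123x"

# Split on any single character that is not a digit or a dot;
# adjacent separators such as "-P" and " = " leave empty strings behind
parts <- strsplit(x, "[^0-9.]")[[1]]
print(parts)

# Empty strings convert to NA, the numeric pieces convert normally
nums <- as.numeric(parts)

# Keep only the values that converted successfully
clean <- nums[!is.na(nums)]
print(clean)
```

The positions of the non-empty pieces correspond to the Peptide_2, Peptide_4 and Peptide_7 columns that cSplit keeps in the earlier example.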
Data used in the examples above:
dfgModsPepFiltered_subset <- data.frame(Peptide= c('A640-P641 = 456.123x',
    'A620-B625 = 285.400x'), stringsAsFactors=FALSE)