
How to extract a specific part of a character vector in R?

I have file names like these in a character vector:

filenames <- c('1194-1220-1479-891--(133.07).RDS','1194-1221-1421-891--(101.51).RDS')

I don't want the digits in parentheses, and I want the remaining numbers separated by "/". So the desired output is:

filenames_desired <- c('1194/1220/1479/891','1194/1221/1421/891')

I tried gsub but couldn't work out how to remove the digits in parentheses.

Thanks in advance.

Using stringr with a lookahead, (?=-), which means "has to be followed by a dash", and sapply to paste the pieces back together:

filenames <- c('1194-1220-1479-891--(133.07).RDS','1194-1221-1421-891--(101.51).RDS')

sapply(stringr::str_extract_all(filenames, "\\d+(?=-)"), 
       paste0, 
       collapse = "/") 

[1] "1194/1220/1479/891" "1194/1221/1421/891"
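If stringr is not available, the same lookahead idea can be sketched in base R. This is only an illustration, not part of the answer above: it assumes `perl = TRUE` so that the lookahead is supported, and the names `matches` and `result_base` are made up for the example.

```r
filenames <- c("1194-1220-1479-891--(133.07).RDS",
               "1194-1221-1421-891--(101.51).RDS")

# Find every run of digits that is immediately followed by a dash.
# perl = TRUE is needed for the (?=-) lookahead.
matches <- regmatches(filenames,
                      gregexpr("\\d+(?=-)", filenames, perl = TRUE))

# Join the pieces of each file name with "/"
result_base <- sapply(matches, paste0, collapse = "/")
result_base
```

The digits inside the parentheses are skipped because they are followed by "." or ")" rather than by a dash.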

We could use a single sub() call here, capturing the four groups of digits and rebuilding the string with "/" between them:

filenames <- c("1194-1220-1479-891--(133.07).RDS",
               "1194-1221-1421-891--(101.51).RDS")

output <- sub("(\\d+)-(\\d+)-(\\d+)-(\\d+).*", "\\1/\\2/\\3/\\4", filenames)
output

[1] "1194/1220/1479/891" "1194/1221/1421/891"
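One thing to keep in mind is that sub() returns its input unchanged when the pattern does not match, so a name with a different number of groups would slip through silently. A possible guard, sketched here with a hypothetical malformed name and an anchored variant of the pattern (both are assumptions for illustration), is to test with grepl() first:

```r
filenames <- c("1194-1220-1479-891--(133.07).RDS",
               "1194-1221-1421-891--(101.51).RDS",
               "not-a-match.RDS")   # hypothetical malformed name

# Anchored variant of the pattern: exactly four digit groups, then "--"
pattern <- "^(\\d+)-(\\d+)-(\\d+)-(\\d+)--.*$"

# TRUE where the name has the expected shape
matched <- grepl(pattern, filenames)

# Rewrite matching names; mark the others as NA instead of passing them through
output <- ifelse(matched,
                 sub(pattern, "\\1/\\2/\\3/\\4", filenames),
                 NA_character_)
output
```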

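Just use gsub with the pattern \\--.* , which removes -- and everything after it. Note that this only strips the suffix; the single dashes are left in place, so they still have to be replaced with "/" to reach the desired output: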

gsub('\\--.*', '', filenames)

[1] "1194-1220-1479-891" "1194-1221-1421-891"
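The result above still contains dashes rather than slashes. A minimal sketch of how the approach might be completed in two steps follows; the variable names `stem` and `result` are only illustrative.

```r
filenames <- c("1194-1220-1479-891--(133.07).RDS",
               "1194-1221-1421-891--(101.51).RDS")

# Step 1: drop the double dash and everything after it
stem <- sub("--.*$", "", filenames)

# Step 2: turn the remaining single dashes into slashes
# (fixed = TRUE treats "-" as a literal string, not a regex)
result <- gsub("-", "/", stem, fixed = TRUE)
result
```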

As far as I can see, the first 18 characters of your names form the base of your final names, so you can use the following code:

# Initial names
  filenames <- c('1194-1220-1479-891--(133.07).RDS','1194-1221-1421-891--(101.51).RDS')

# Extract the first 18 characters of each element of "filenames"
  nam <- substr(filenames, 1, 18)
# Replace "-" with "/" (str_replace_all comes from the stringr package)
  final.names <- stringr::str_replace_all(nam, "-", "/")
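A fixed width of 18 only works while every prefix has the same length. As a hedged alternative, the cut-off could be computed from the position of the "--" separator instead. The sketch below uses base R only; `cut` and `final_names` are names invented for this example.

```r
filenames <- c("1194-1220-1479-891--(133.07).RDS",
               "1194-1221-1421-891--(101.51).RDS")

# Position of the first "--" in each name (fixed = TRUE: literal match)
cut <- regexpr("--", filenames, fixed = TRUE)

# Keep everything before the separator
nam <- substr(filenames, 1, cut - 1)

# chartr() swaps single characters, here "-" for "/"
final_names <- chartr("-", "/", nam)
final_names
```

This assumes every name actually contains "--"; regexpr() returns -1 when it does not, which would produce an empty string here.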

You could use strsplit to split each name on "--", take the first element of each resulting list entry, and then use gsub to replace the dashes:

gsub('-', '/', sapply(strsplit(filenames, '--'), `[[`, 1))

which will yield

#"1194/1220/1479/891" "1194/1221/1421/891"
