String Pattern in R

I have a list of strings as follows: "/home/ricardo/MultiClass/data//F10/1036.txt"

> library(stringr)
> strsplit(cls[1], split = "/")

This gives me:

#> [[1]]
#> [1] ""           "home"       "ricardo"    "MultiClass" "data"
#> [6] ""           "F10"        "1036.txt"

How can I keep only the element in the 7th position?

#> "F10"
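
One direct way to keep only the 7th element is to index into the list that strsplit returns. The following is a minimal sketch, assuming cls is a character vector of paths shaped like the one in the question; the second path is invented here purely for illustration.

```r
# Hypothetical input: `cls` stands in for the vector from the question.
# The second path is made up for illustration.
cls <- c("/home/ricardo/MultiClass/data//F10/1036.txt",
         "/home/ricardo/MultiClass/data//F11/2001.txt")

# strsplit returns a list with one character vector per input string
parts <- strsplit(cls, split = "/", fixed = TRUE)

# Take the 7th piece of every path (NA if a path has fewer than 7 pieces)
seventh <- vapply(parts, function(p) p[7], character(1))
print(seventh)
```

Indexing by position only works while every path has the same depth, which is why the answers below match on the pattern around the directory name instead.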

If you want to extract one or more characters after // up to the next / or the end of the string, use:

> library(stringr) 
> s <- "/home/ricardo/MultiClass/data//F10/1036.txt"
> str_extract(s, "(?<=//)[^/]+")
[1] "F10"

The (?<=//)[^/]+ regex pattern finds a position that is preceded by two slashes (see (?<=//) ) and then matches one or more characters other than / (see [^/]+ ).
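
The same lookbehind pattern can also be used without stringr. This is a rough base R sketch, not part of the original answer: it relies on regexpr with perl = TRUE (lookbehind is PCRE syntax) and fills in NA where nothing matched. The second path is a hypothetical input with no // in it.

```r
s <- c("/home/ricardo/MultiClass/data//F10/1036.txt",
       "/no/double/slash/here.txt")  # hypothetical path without "//"

# perl = TRUE is required because lookbehind is PCRE syntax
m <- regexpr("(?<=//)[^/]+", s, perl = TRUE)

# regexpr returns -1 where nothing matched; regmatches drops those
# elements, so write the matches back into an NA-filled vector
out <- rep(NA_character_, length(s))
out[m != -1] <- regmatches(s, m)
print(out)
```

Keeping the result the same length as the input makes it safe to store as a new column next to the original paths.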

A base R solution with sub would look like this:

> sub("^.*/([^/]*)/[^/]*$", "\\1", s)
[1] "F10"

Details:

  • ^ - start of string
  • .* - any zero or more characters, as many as possible
  • / - a slash (the last but one in the string, because the preceding .* is greedy)
  • ([^/]*) - capturing group #1, matching any zero or more characters other than /
  • / - the last slash
  • [^/]* - any zero or more characters other than /
  • $ - end of string.
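
One detail worth knowing about the sub approach is that sub returns the input unchanged when the pattern does not match. The sketch below illustrates this on a small vector; the second and third paths are hypothetical, and the grepl guard is one possible way to get NA instead, not something the original answer prescribes.

```r
paths <- c("/home/ricardo/MultiClass/data//F10/1036.txt",
           "data/F11/2001.txt",  # hypothetical relative path
           "1036.txt")           # hypothetical: no slash, so no match

pattern <- "^.*/([^/]*)/[^/]*$"

# Plain sub: the non-matching element comes back as-is
raw <- sub(pattern, "\\1", paths)
print(raw)

# Guarded version: return NA where the pattern does not match
safe <- ifelse(grepl(pattern, paths), sub(pattern, "\\1", paths), NA_character_)
print(safe)
```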

It can be done in base R this way. I have defined the function gret to extract a pattern from a string:

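gret <- function(pattern, text, ignore.case = TRUE) {
    # Return the first substring of text that matches pattern (PCRE syntax)
    regmatches(text, regexpr(pattern, text, perl = TRUE, ignore.case = ignore.case))
}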

Then:

gsub("data|/*", "", gret("(?=data/).*(?<=/)",
                         "/home/ricardo/MultiClass/data//F10/1036.txt"))


#>[1] "F10"

Using the word function from stringr:

library(stringr)
s <- '/home/ricardo/MultiClass/data//F10/1036.txt'
# Drop everything up to and including "//", then take the first "/"-separated word
word(sub('.*//', '', s), 1, sep = '/')
#[1] "F10"
