String Pattern in R
I have a list of strings like the following: "/home/ricardo/MultiClass/data//F10/1036.txt"
> library(stringr)
> strsplit(cls[1], split= "/")
This gives me:
#> [[1]]
#> [1] "" "home" "ricardo" "MultiClass" "data"
#> [6] "" "F10" "1036.txt"
How can I keep only the element at the 7th position?
#> "F10"
If you want to extract one or more characters after // up to the next / or the end of the string, use:
> library(stringr)
> s <- "/home/ricardo/MultiClass/data//F10/1036.txt"
> str_extract(s, "(?<=//)[^/]+")
[1] "F10"
The (?<=//)[^/]+ regex pattern finds a position that is preceded by two slashes (see (?<=//)) and then matches one or more characters other than / (see [^/]+).
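The same lookbehind can also be used without stringr. What follows is a minimal sketch in base R, assuming a character vector named paths (a hypothetical name; its second element is invented for illustration). Note that regmatches silently drops elements that do not match, so the result can be shorter than the input.

```r
# Hypothetical input: only the first path comes from the question.
paths <- c("/home/ricardo/MultiClass/data//F10/1036.txt",
           "/home/ricardo/MultiClass/data//F11/2048.txt")

# perl = TRUE is required because lookbehind is a PCRE feature.
m <- regexpr("(?<=//)[^/]+", paths, perl = TRUE)

# Extract the matched text for every element that matched.
folders <- regmatches(paths, m)
print(folders)
```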
A base R solution with sub looks like this:
> sub("^.*/([^/]*)/[^/]*$", "\\1", s)
[1] "F10"
Details:
^ - start of string
.* - any 0+ chars, as many as possible
/ - a slash (the second-to-last one in the string, because the preceding .* is greedy)
([^/]*) - capturing group #1, matching any 0+ chars other than /
/ - the last slash
[^/]* - any 0+ chars other than /
$ - end of string.
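To see what the capturing group actually holds, the pattern can be inspected with regexec, which reports the full match followed by each group. This is only an illustrative sketch; the variable names pattern, parts, captured and unchanged are assumptions, not part of the original answer. It also shows that sub returns its input untouched when the pattern does not match.

```r
s <- "/home/ricardo/MultiClass/data//F10/1036.txt"
pattern <- "^.*/([^/]*)/[^/]*$"

# regexec() returns the whole match first, then each capturing group.
parts <- regmatches(s, regexec(pattern, s))[[1]]
captured <- parts[2]  # contents of capturing group #1

# A string with fewer than two slashes does not match the pattern,
# so sub() hands it back unchanged instead of returning "".
unchanged <- sub(pattern, "\\1", "1036.txt")

print(captured)
print(unchanged)
```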
It can also be done in base R this way. I have defined the function gret to extract a pattern from a string:
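# Return the first match of pattern in each element of text
gret <- function(pattern, text, ignore.case = TRUE) {
  regmatches(text, regexpr(pattern, text, perl = TRUE, ignore.case = ignore.case))
}
then
# gret() returns "data//F10/"; gsub() then strips "data" and the slashes
gsub("data|/*", "", gret("(?=data/).*(?<=/)", "/home/ricardo/MultiClass/data//F10/1036.txt"))
#> [1] "F10"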
Using the word function from stringr:
library(stringr)
s <- '/home/ricardo/MultiClass/data//F10/1036.txt'
# Drop everything up to and including "//", then take the first "/"-separated word
word(sub('.*//', '', s), 1, sep = '/')
#[1] "F10"
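The strsplit approach from the question can also be finished by indexing into the result. This is a rough sketch that assumes every path has the same depth, so that the wanted piece is always the 7th element; the second path in cls is hypothetical.

```r
s <- "/home/ricardo/MultiClass/data//F10/1036.txt"

# strsplit() returns a list with one character vector per input string.
parts <- strsplit(s, split = "/", fixed = TRUE)[[1]]
seventh <- parts[7]

# For a whole vector, pick the 7th piece of each split result.
# The second path is made up for illustration.
cls <- c(s, "/home/ricardo/MultiClass/data//F11/2048.txt")
folders <- vapply(strsplit(cls, "/", fixed = TRUE),
                  function(p) p[7], character(1))

print(seventh)
print(folders)
```

Unlike the regex solutions, this breaks if the directory depth varies, which is why anchoring on // or on the last two slashes is usually more robust.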