How to use str_split() in R?
I want to split this string into several substrings:
BAA33520.2|/gene="vpf402",/product="Vpf402"|GI:8272373|AB012574|join{7347:7965, 0:591}
The separator is | (ASCII 124).
It works with all other separators, but not with this one.
From ?regex:
Two regular expressions may be joined by the infix operator |; the resulting regular expression matches any string matching either subexpression. For example, abba|cde matches either the string abba or the string cde. Note that alternation does not work inside character classes, where | has its literal meaning.
The fundamental building blocks are the regular expressions that match a single character. Most characters, including all letters and digits, are regular expressions that match themselves. Any metacharacter with special meaning may be quoted by preceding it with a backslash. The metacharacters in extended regular expressions are . \ | ( ) [ { ^ $ * + ?, but note that whether these have a special meaning depends on the context.
Thus, escape the separator (the backslash itself has to be doubled inside an R string):
stringr::str_split('BAA33520.2|/gene="vpf402",/product="Vpf402"|GI:8272373|AB012574|join{7347:7965, 0:591}', "\\|")
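For what it's worth, both points from the ?regex excerpt (escaping with a backslash, and | being literal inside a character class) can be checked in base R without stringr. This is only a sketch; the variable names x, escaped and bracketed are mine, not from the question.

```r
# The string from the question, with "|" as the field separator.
x <- 'BAA33520.2|/gene="vpf402",/product="Vpf402"|GI:8272373|AB012574|join{7347:7965, 0:591}'

# Escape the metacharacter: "\\|" in R source is the regex \|
escaped <- strsplit(x, "\\|")[[1]]

# Inside a character class, | has its literal meaning, so no backslash is needed.
bracketed <- strsplit(x, "[|]")[[1]]

print(escaped)
print(identical(escaped, bracketed))
```

Either form should give the same five pieces; which one you pick is mostly a matter of taste.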
As @Frank noted, you can do this in base::strsplit() by adding fixed=TRUE:
strsplit('BAA33520.2|/gene="vpf402",/product="Vpf402"|GI:8272373|AB012574|join{7347:7965, 0:591}',"|", fixed=TRUE)
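Keep in mind that strsplit() returns a list with one element per input string, which is why the sketch below takes [[1]]. It assumes you want the pieces as a named vector; the field names are hypothetical labels I made up for illustration, not anything defined in the question.

```r
x <- 'BAA33520.2|/gene="vpf402",/product="Vpf402"|GI:8272373|AB012574|join{7347:7965, 0:591}'

# fixed = TRUE treats "|" as a plain character, not as regex alternation.
parts <- strsplit(x, "|", fixed = TRUE)[[1]]

# Hypothetical labels for the five fields (assumed, not from the original post).
names(parts) <- c("protein_id", "annotation", "gi", "accession", "location")

# Strip the "GI:" prefix from the third field.
gi_number <- sub("^GI:", "", parts[["gi"]])
print(gi_number)
```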
However, you can also do this with stringr::str_split() by decorating the regular expression for the separator:
# regex() is namespaced as well, so this runs without library(stringr)
stringr::str_split('BAA33520.2|/gene="vpf402",/product="Vpf402"|GI:8272373|AB012574|join{7347:7965, 0:591}',
                   stringr::regex("|", literal = TRUE))
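If stringr is installed, its fixed() modifier is another way to ask for a literal match. A small sketch, assuming a reasonably recent stringr; simplify = TRUE is optional and just returns a character matrix instead of a list. The names as_list and as_matrix are mine.

```r
x <- 'BAA33520.2|/gene="vpf402",/product="Vpf402"|GI:8272373|AB012574|join{7347:7965, 0:591}'

# fixed("|") compares the pattern literally, so no escaping is needed.
as_list <- stringr::str_split(x, stringr::fixed("|"))

# simplify = TRUE gives a character matrix with one row per input string.
as_matrix <- stringr::str_split(x, stringr::fixed("|"), simplify = TRUE)

print(as_list[[1]])
print(dim(as_matrix))
```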
Incidentally, stringr is at this point pretty much just a slightly friendlier wrapper around stringi functions, and I highly recommend studying the stringi package, as it contains some wonderful gems beyond string splitting.