Getting all characters ahead of first appearance of special character in R
I want to get all characters that come before the first "." if there is one. Otherwise, I want to get back the same string ("8" -> "8").
Example:
v<-c("7.7.4","8","12.6","11.5.2.1")
I want to get something like this:
[1] "7" "8" "12" "11"
My idea was to split each element at "." and then take only the first piece, but I found no solution that worked.
You can use sub:
sub("\\..*", "", v)
#[1] "7" "8" "12" "11"
or a few stringi options:
library(stringi)
stri_replace_first_regex(v, "\\..*", "")
#[1] "7" "8" "12" "11"
# extract vs. replace
stri_extract_first_regex(v, "[^\\.]+")
#[1] "7" "8" "12" "11"
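The "everything before the first dot, otherwise the whole string" rule can also be written without a regular expression. The following is only a sketch in base R; the names pos and before_dot are illustrative and not part of the original answer.

```r
v <- c("7.7.4", "8", "12.6", "11.5.2.1")

# position of the first literal "." in each element; -1 when there is none
# (as.integer() drops the extra attributes that regexpr() attaches)
pos <- as.integer(regexpr(".", v, fixed = TRUE))

# keep the text before the dot, or the whole element when no dot was found
before_dot <- ifelse(pos > 0, substr(v, 1, pos - 1), v)
before_dot
#[1] "7" "8" "12" "11"
```

Because fixed = TRUE treats the pattern as a literal string, the dot needs no escaping here.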
If you want to use a splitting approach, these will work:
unlist(strsplit(v, "\\..*"))
#[1] "7" "8" "12" "11"
# stringi option
unlist(stri_split_regex(v, "\\..*", omit_empty=TRUE))
#[1] "7" "8" "12" "11"
unlist(stri_split_fixed(v, ".", n=1, tokens_only=TRUE))
unlist(stri_split_regex(v, "[^\\w]", n=1, tokens_only=TRUE))
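The idea from the question (split each element at "." and keep only the first piece) can also be sketched in base R alone. The variable names pieces and first_piece are assumptions made for illustration.

```r
v <- c("7.7.4", "8", "12.6", "11.5.2.1")

# fixed = TRUE splits on a literal dot instead of treating "." as a regex wildcard
pieces <- strsplit(v, ".", fixed = TRUE)

# take the first piece of every element; vapply() guarantees a character vector
first_piece <- vapply(pieces, `[`, character(1), 1)
first_piece
#[1] "7" "8" "12" "11"
```

Unlike the unlist() versions, this keeps exactly one result per input element, so the output stays aligned with v even if some element yields no pieces (it becomes NA).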
Other sub variations that use a capture group to target the leading characters specifically:
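# caution: the trailing .+ needs at least one more character, so a dot-free element such as "12" would become "1"
sub("(\\w+).+", "\\1", v) # \w matches [[:alnum:]_] (i.e. alphanumerics and underscores)
sub("([[:alnum:]]+).+", "\\1", v) # exclude underscores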
# variations on a theme
sub("(\\w+)\\..*", "\\1", v)
sub("(\\d+)\\..*", "\\1", v) # narrower: \d for digits specifically
sub("(.+?)\\..*", "\\1", v, perl = TRUE) # broader: "." matches any single character; the lazy +? stops at the first dot (a greedy .+ would run to the last one)
# stringi variation just for fun:
stri_extract_first_regex(v, "\\w+")
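When the capture group uses "." (any character), whether the quantifier is greedy or lazy decides which dot ends the match. A small comparison, offered as a sketch; perl = TRUE is used here only so that the lazy quantifier behaves the same as in Perl-style regex engines.

```r
v <- c("7.7.4", "8", "12.6", "11.5.2.1")

# greedy: (.+) grabs as much as possible, so the match ends at the LAST dot
greedy <- sub("(.+)\\..*", "\\1", v, perl = TRUE)
greedy
#[1] "7.7"    "8"      "12"     "11.5.2"

# lazy: (.+?) grabs as little as possible, so the match ends at the FIRST dot
lazy <- sub("(.+?)\\..*", "\\1", v, perl = TRUE)
lazy
#[1] "7"  "8"  "12" "11"
```

Character classes that cannot match a dot, such as \\w or \\d, avoid the issue because the group can never run past the first dot.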
scan() would actually work well for this. Since we want everything before the first ., we can use that as a comment character, and scan() will remove everything after and including that character for each element in v.
scan(text = v, comment.char = ".")
# [1] 7 8 12 11
The above returns a numeric vector, which might be where you are headed. If you need to stick with characters, add the what argument to denote that we want a character vector returned.
scan(text = v, comment.char = ".", what = "")
# [1] "7" "8" "12" "11"
Data:
v <- c("7.7.4", "8", "12.6", "11.5.2.1")
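If the scan() approach goes into a script, a slightly more defensive form may be useful. This is a sketch under two assumptions that the original answer does not state: that the "Read N items" message is unwanted, and that the result should line up one-to-one with v (scan() tokenizes on whitespace, so elements containing spaces, or elements that leave nothing before the dot, may change the number of items returned). The name res is illustrative.

```r
v <- c("7.7.4", "8", "12.6", "11.5.2.1")

# quiet = TRUE suppresses the "Read 4 items" message
res <- scan(text = v, comment.char = ".", what = "", quiet = TRUE)

# guard: stop early if the result no longer has one item per input element
stopifnot(length(res) == length(v))
res
# [1] "7"  "8"  "12" "11"
```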