
Strange behavior of strsplit() in R?

I would like to split the string x = "a,b," (note the trailing comma) into the vector c("a","b","") using strsplit().

The result is:

> strsplit(x, ',')
[[1]]
[1] "a" "b"

I would like to have the third component (an empty string or NULL).

The function read.csv() can handle this, but I still think strsplit() should behave as I expected. The equivalent split in Python returns all three elements: "a", "b" and "".

Is there an option of strsplit() that I am missing?

This is the intended behavior, and it is documented in help(strsplit):

  Note that this means that if there is a match at the beginning of a (non-empty) string, the first element of the output is "", but if there is a match at the end of the string, the output is the same as with the match removed.
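A small base R check of the documented behavior may make the asymmetry easier to see. The variable names here are only illustrative; the calls use nothing beyond strsplit() itself.

```r
# A leading match produces an empty first element ...
x_leading  <- strsplit(",a,b", ",")[[1]]
# ... but a single trailing match is dropped, as the help page describes.
x_trailing <- strsplit("a,b,", ",")[[1]]
# With two trailing separators only the last one is dropped,
# so one empty string survives.
x_double   <- strsplit("a,b,,", ",")[[1]]

print(x_leading)   # "" "a" "b"
print(x_trailing)  # "a" "b"
print(x_double)    # "a" "b" ""
```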

You might want to use str_split from the stringr package:

> require(stringr)
> str_split("a,b,",",")
[[1]]
[1] "a" "b" "" 

> str_split("a,b",",")
[[1]]
[1] "a" "b"

> str_split(",a,b",",")
[[1]]
[1] ""  "a" "b"

> str_split(",a,b,,,",",")
[[1]]
[1] ""  "a" "b" ""  ""  "" 
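If adding stringr as a dependency is not an option, one possible base R workaround is to append one extra separator before splitting, so that the match strsplit() drops is the added one rather than the original. This is only a sketch and not part of the answer above: split_keep_trailing is a hypothetical helper name, and it assumes a literal (non-regex) separator and no NA values in the input.

```r
# Hypothetical helper: keep trailing empty fields by appending the separator,
# so the match that strsplit() drops at the end is the one we added.
# Assumes `sep` is a literal string (fixed = TRUE) and `x` contains no NA.
split_keep_trailing <- function(x, sep = ",") {
  strsplit(paste0(x, sep), sep, fixed = TRUE)
}

print(split_keep_trailing("a,b,")[[1]])     # "a" "b" ""
print(split_keep_trailing("a,b")[[1]])      # "a" "b"
print(split_keep_trailing(",a,b,,,")[[1]])  # "" "a" "b" "" "" ""
```

For these inputs the results match the str_split output shown above.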
