Inconsistent behavior between str_split and strsplit

The documentation for str_split in the stringr package states that for the pattern argument:

If "" splits into individual characters.

which suggests it behaves the same as strsplit in this regard. However,

library(stringr)
str_split("abcab","")
[[1]]
[1] ""  "a" "b" "c" "a" "b"

with a leading empty string. This compares with,

strsplit("abcab","")
[[1]]
[1] "a" "b" "c" "a" "b"

A leading empty string seems to be normal behavior when splitting on a non-empty pattern,

strsplit("abcab","ab")
[[1]]
[1] ""  "c"

but even then, str_split generates an 'extra' trailing empty string:

str_split("abcab","ab")
[[1]]
[1] ""  "c" "" 

Is this discrepancy a bug, a feature, an error in the documentation, or just a different notion of what's 'expected behavior'?

If you use commas as delimiters, the "expected" (your mileage may vary) result is more obvious:

# expect "" "2" "3" "4" ""

strsplit(",2,3,4,", ",")
# [[1]]
# [1] ""  "2" "3" "4"

str_split(",2,3,4,", ",")
# [[1]]
# [1] ""  "2" "3" "4" "" 

If I have n commas then I expect (n+1) elements to be returned, so I prefer the results from str_split. However, I wouldn't necessarily call this a bug in strsplit, since it performs as advertised:

(from ?strsplit) Note that this means that if there is a match at the beginning of a (non-empty) string, the first element of the output is '""', but if there is a match at the end of the string, the output is the same as with the match removed.
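If you want the (n+1) result while staying in base R, one common workaround is to append one extra delimiter before splitting, so that the trailing match strsplit discards is the added one. This is only a sketch: split_keep_trailing is a hypothetical helper name, not a function from base R or stringr, and it assumes a literal (fixed) delimiter.

```r
# Base R only. split_keep_trailing is a hypothetical helper for illustration.
split_keep_trailing <- function(x, delim) {
  # Appending one extra delimiter means the match that strsplit drops at the
  # end of the string is the added one, so the original trailing field survives.
  strsplit(paste0(x, delim), delim, fixed = TRUE)
}

x <- ",2,3,4,"

# Count the delimiters in x (this is n).
n_commas <- lengths(regmatches(x, gregexpr(",", x, fixed = TRUE)))

plain  <- strsplit(x, ",", fixed = TRUE)[[1]]
padded <- split_keep_trailing(x, ",")[[1]]

n_commas        # 4
length(plain)   # 4: the trailing "" is dropped
length(padded)  # 5: n + 1 fields
```

The same helper leaves strings without a trailing delimiter unchanged in length, e.g. "a,b" still yields two fields.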

"" is trickier, as there is no way to count the number of times "" appears in a string. Therefore treating it as a special case seems justified.

(from ?str_split) If '""' splits into individual characters.
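As a point of reference for what "splits into individual characters" should mean, here is a small base R check (no stringr involved) that compares strsplit on "" with taking one substring per character position. A result with a leading "" would be one element longer than nchar(x), which is an easy way to test any splitter.

```r
x <- "abcab"

# Split on the empty pattern, as in the question.
by_split <- strsplit(x, "")[[1]]

# Independent reference: one substring per character position.
pos <- seq_len(nchar(x))
by_index <- substring(x, pos, pos)

identical(by_split, by_index)   # TRUE
length(by_split) == nchar(x)    # TRUE: no leading or trailing ""
```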

Based on this I suggest you have found a bug and should take hadley's advice and report it!
