I want to know the pattern matching concept behind this code snippet:
    split :: String -> Char -> [String]
    split [] delim = [""]
    split (c:cs) delim
        | c == delim = "" : rest
        | otherwise  = (c : head rest) : tail rest
      where
        rest = split cs delim
I know that head returns the first element of a list and tail returns the rest, but I still cannot understand how this works. The function takes a string and breaks it into a list of strings at a given delimiter character.
Maybe it's clearer in the following form:
    split [] delim = [""]  -- a list containing only an empty String
    split (c:cs) delim = let (firstWord:moreWords) = split cs delim
                         in if c == delim
                              then "" : firstWord : moreWords
                              else (c:firstWord) : moreWords
The function traverses the input string, comparing each character with the delimiter. If the current character is not the delimiter, it is prepended to the first word (which may be empty) of the result of splitting the remainder of the string. If it is the delimiter, an empty string is added to the front of the result of splitting the remainder.
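A few sample calls may make that behaviour concrete. The sketch below repeats the definition from the question so that it loads on its own. The sample inputs and the names `examples` and `allExamplesHold` are made up for illustration.

```haskell
-- The definition from the question, repeated so this block is self-contained.
split :: String -> Char -> [String]
split [] _ = [""]
split (c:cs) delim
    | c == delim = "" : rest
    | otherwise  = (c : head rest) : tail rest
  where
    rest = split cs delim

-- Hypothetical sample inputs, each paired with the result of splitting on ' '.
examples :: [(String, [String])]
examples =
    [ ("abc cde", ["abc", "cde"])
    , ("a  b",    ["a", "", "b"])  -- two adjacent delimiters give an empty word
    , (" ab",     ["", "ab"])      -- a leading delimiter gives a leading ""
    , ("ab ",     ["ab", ""])      -- a trailing delimiter gives a trailing ""
    , ("",        [""])            -- the base case
    ]

-- True when every sample input splits as listed above.
allExamplesHold :: Bool
allExamplesHold = all (\(input, expected) -> split input ' ' == expected) examples
```

Note that the result is never an empty list: the base case returns `[""]` and both guards put something in front of it, which is why `head rest` and `tail rest` are safe here.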
For example, the evaluation of split "abc cde" ' ' proceeds like

    split "abc cde" ' '
    ~> 'a' == ' ' ? No, next guard
    ~> ('a' : something) : somethingElse
where something and somethingElse will be determined later by splitting the remainder "bc cde". After looking at the first character, it's been determined that whatever the final result is, its first entry starts with 'a'. Going on to determine the rest,
    split "bc cde" ' '
    ~> ('b' : something1) : somethingElse1
      where (something1 : somethingElse1) = split "c cde" ' '
So now the first two characters of the first entry of the result are known. From the next step it is determined that something1 starts with 'c'. Finally we reach the delimiter. In that case the first element of the result is the empty string, determined without reference to later recursive calls, which completes the word "abc". Only the remainder of the result remains to be found by the recursion.
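Because Haskell evaluates lazily, the claim that the front of the result is known before the rest has been computed can be checked directly. The sketch below repeats the definition from the question and feeds it inputs whose tail is `undefined`. Those inputs and the names `firstTwo` and `firstWord` are assumptions made for illustration.

```haskell
-- The definition from the question, repeated so this block is self-contained.
split :: String -> Char -> [String]
split [] _ = [""]
split (c:cs) delim
    | c == delim = "" : rest
    | otherwise  = (c : head rest) : tail rest
  where
    rest = split cs delim

-- The first two characters of the first word are available even though
-- everything after "ab" in the input is undefined.
firstTwo :: String
firstTwo = take 2 (head (split ('a' : 'b' : undefined) ' '))  -- "ab"

-- Once the delimiter has been seen, the first word is complete; the
-- undefined remainder of the input is never inspected.
firstWord :: String
firstWord = head (split ("abc " ++ undefined) ' ')  -- "abc"
```

Asking for more than these prefixes, for example the second word of either result, would force the undefined part of the input and fail.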
Another way of formulating the algorithm is (thanks @dave4420 for the suggestion)
    split input delim = foldr combine [""] input
      where
        combine c rest@(~(wd : wds))
            | c == delim = "" : rest
            | otherwise  = (c : wd) : wds
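A note on the `~` in `combine`: it makes the pattern `(wd : wds)` lazy, so `rest` is only taken apart when `wd` or `wds` is actually needed. The match cannot fail, since the seed `[""]` is non-empty and both guards return a non-empty list. As a rough check, the sketch below compares the fold with the explicitly recursive version on a few made-up inputs. The names `splitRec`, `agree` and `lazyPrefix` are hypothetical.

```haskell
-- The fold-based definition from the answer.
split :: String -> Char -> [String]
split input delim = foldr combine [""] input
  where
    combine c rest@(~(wd : wds))
        | c == delim = "" : rest
        | otherwise  = (c : wd) : wds

-- The explicitly recursive definition, renamed here for comparison.
splitRec :: String -> Char -> [String]
splitRec [] _ = [""]
splitRec (c:cs) delim
    | c == delim = "" : rest
    | otherwise  = (c : head rest) : tail rest
  where
    rest = splitRec cs delim

-- True when both definitions give the same result on the sample inputs.
agree :: Bool
agree = all (\s -> split s ' ' == splitRec s ' ')
            ["", "abc cde", "a  b", " ab", "ab "]

-- Thanks to the lazy pattern, a prefix of the first word can be read
-- even from an infinite input that contains no delimiter.
lazyPrefix :: String
lazyPrefix = take 3 (head (split (cycle "abc") ' '))  -- "abc"
```

With a strict pattern `rest@(wd : wds)` the results on finite inputs would be the same, but `combine` would have to evaluate the fold over the remainder of the input before checking its guards, so `lazyPrefix` would not terminate.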