Splitting strings in Python using specific characters

Question

I'm trying to split an inputted document at specific characters. I need to split it at [ and ], but I'm having a difficult time figuring this out.

def main():
    for x in docread:
        words = x.split('[]')
        for word in words:
            doclist.append(word)

This is the part of the code that splits the lines into my list. However, it returns each line of the document unsplit.

For example, I want to convert

['I need to [go out] to lunch', 'and eat [some food].']

to

['I need to', 'go out', 'to lunch and eat', 'some food', '.']

Thanks!

Answer 1

You could try using re.split() instead:

>>> import re
>>> re.split(r"[\[\]]", "I need to [go out] to lunch")
['I need to ', 'go out', ' to lunch']

The odd-looking regular expression [\[\]] is a character class that means "split on either [ or ]". The inner \[ and \] must be backslash-escaped because they are the same characters as the [ and ] that surround the character class.
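To get the exact list from the question, the pieces also need their surrounding whitespace trimmed, and the lines have to be joined before splitting so that "to lunch and eat" stays in one piece. A minimal sketch of that idea follows; the helper name split_on_brackets is hypothetical, and joining the lines with a single space is an assumption about how the document should be stitched together.

```python
import re

def split_on_brackets(lines):
    # Join the lines first so text spanning a line break stays in one piece.
    text = " ".join(lines)
    # Split on either bracket, then trim whitespace and drop empty pieces.
    parts = re.split(r"[\[\]]", text)
    return [part.strip() for part in parts if part.strip()]

lines = ['I need to [go out] to lunch', 'and eat [some food].']
print(split_on_brackets(lines))
# ['I need to', 'go out', 'to lunch and eat', 'some food', '.']
```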

Answer 2

str.split() splits at the exact string you pass to it, not at any of its characters. Passing "[]" would split at occurrences of [], but not at individual brackets. Possible solutions are

splitting twice:

 words = [z for y in x.split("[") for z in y.split("]")]

using re.split().
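A short comparison of the three calls on one sample line may make the difference easier to see. The variable names here are only for illustration.

```python
import re

line = "I need to [go out] to lunch"

# "[]" never occurs as a substring, so the line comes back unsplit.
unsplit = line.split("[]")

# Split on "[" first, then split each piece on "]".
twice = [z for y in line.split("[") for z in y.split("]")]

# The same result with a regular-expression character class.
with_re = re.split(r"[\[\]]", line)

print(unsplit)   # ['I need to [go out] to lunch']
print(twice)     # ['I need to ', 'go out', ' to lunch']
print(with_re)   # ['I need to ', 'go out', ' to lunch']
```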

Answer 3

str.split(s), the one you are using, treats the entire content of s as a single separator. In other words, your input would have had to look like "[]'I need to []go out[] to lunch', 'and eat []some food[].'[]" for it to give you the results you want.

You need to use re.split(pattern, string) from the re module, which treats the pattern as a regular expression:

import re

def main():
    for x in docread:
        # Escape the brackets so the character class matches a literal [ or ]
        words = re.split(r'[\[\]]', x)
        for word in words:
            doclist.append(word)
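Note that brackets are special characters in a regular expression, and a bare '[]' is not a valid pattern on its own, so they have to be escaped before re.split() can treat them as literal separators. As a rough sketch, re.escape() can do the escaping when the set of separator characters varies; split_on_chars is a hypothetical helper name, and it assumes chars is a non-empty string.

```python
import re

def split_on_chars(text, chars):
    # re.escape() backslash-escapes characters that are special in a regex,
    # so "[]" becomes \[\] and can sit safely inside a character class.
    # Assumes chars is non-empty; an empty class would not compile.
    pattern = "[" + re.escape(chars) + "]"
    return re.split(pattern, text)

print(split_on_chars("I need to [go out] to lunch", "[]"))
# ['I need to ', 'go out', ' to lunch']
```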
