
Python line.split to include a whitespace

If I have a string and want to return a word that includes whitespace, how would I do it?

For example, I have:

line = 'This is a group of words that include #this and @that but not ME ME'

response = [word for word in line.split() if word.startswith("#") or word.startswith('@') or word.startswith('ME ')]

print(response)
# ['#this', '@that']

So ME ME does not get printed because of the whitespace.

Thanks

From the Python documentation:

string.split(s[, sep[, maxsplit]]) : Return a list of the words of the string s. If the optional second argument sep is absent or None, the words are separated by arbitrary strings of whitespace characters (space, tab, newline, return, formfeed).
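As a quick illustration of the behaviour described in that passage, here is a minimal sketch comparing `split()` with no argument against `split(' ')`. The sample string `sample` is made up for the demonstration.

```python
# Hypothetical sample with a double space, a tab and a trailing newline.
sample = 'a  b\tc\n'

# No separator: any run of whitespace is one delimiter, and no empty strings appear.
no_sep = sample.split()

# Explicit single-space separator: only ' ' splits, so empty strings and '\t' survive.
space_sep = sample.split(' ')

# Either way, a token produced by split() never contains the separator itself.
me_words = 'ME ME'.split()

print(no_sep)     # ['a', 'b', 'c']
print(space_sep)  # ['a', '', 'b\tc\n']
print(me_words)   # ['ME', 'ME']
```

Because no token returned by `split()` can contain a space, a test such as `word.startswith('ME ')` can never succeed on those tokens.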

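So the problem starts with the call to split(): called without arguments, it breaks the line on every run of whitespace, so 'ME ME' has already become two separate words before startswith is ever checked.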

print(line.split())
# ['This', 'is', 'a', 'group', 'of', 'words', 'that', 'include', '#this', 'and', '@that', 'but', 'not', 'ME', 'ME']

I recommend using the re module to split the string instead: re.split(pattern, string, maxsplit=0, flags=0).
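One way the regex idea might look is sketched below. Note that it uses `re.findall` rather than `re.split`, since pulling out the wanted tokens directly is simpler than splitting around them. The pattern is an assumption about what the asker wants: words starting with `#` or `@`, plus `ME` together with the word that follows it.

```python
import re

line = 'This is a group of words that include #this and @that but not ME ME'

# (?<!\S)  : the match must start at the beginning of a word
# [#@]\S+  : a word that starts with '#' or '@'
# ME \S+   : the word 'ME', one space, and the following word
PATTERN = re.compile(r'(?<!\S)(?:[#@]\S+|ME \S+)')

result = PATTERN.findall(line)
print(result)  # ['#this', '@that', 'ME ME']
```

The lookbehind `(?<!\S)` keeps the pattern from matching in the middle of a word, for example the `ME` at the end of a word like `SOME`.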

You could just keep it simple:

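line = 'This is a group of words that include #this and @that but not ME ME'
words = line.split()
result = []

pos = 0
try:
    while True:
        if words[pos].startswith(('#', '@')):
            # Tagged word: keep it as it is.
            result.append(words[pos])
            pos += 1
        elif words[pos] == 'ME':
            # 'ME' takes the next word with it; raises IndexError if 'ME' is the last word.
            result.append('ME ' + words[pos + 1])
            pos += 2
        else:
            pos += 1
except IndexError:
    # Ran past the end of the word list, so we are done.
    pass

print(result)
# ['#this', '@that', 'ME ME']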

Think about speed only if it proves to be too slow in practice.
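If you would rather not rely on IndexError to end the loop, the same logic can be sketched with an iterator. The function name `extract_tokens` and its parameters `prefixes` and `joiner` are made up for this example. Like the loop above, it drops a trailing 'ME' that has no word after it.

```python
def extract_tokens(line, prefixes=('#', '@'), joiner='ME'):
    """Collect prefixed words, plus `joiner` glued to the word that follows it."""
    words = iter(line.split())
    result = []
    for word in words:
        if word.startswith(prefixes):
            result.append(word)
        elif word == joiner:
            # Consume the following word from the same iterator, if there is one.
            following = next(words, None)
            if following is not None:
                result.append(joiner + ' ' + following)
    return result


line = 'This is a group of words that include #this and @that but not ME ME'
print(extract_tokens(line))  # ['#this', '@that', 'ME ME']
```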
