I have a list of words, for example:
words = ['one','two','three four','five','six seven']
And I am trying to create a new list where each item is just one word, so I would have:
words = ['one','two','three','four','five','six','seven']
Would the best approach be to join the entire list into a string and then tokenize that string? Something like this:
word_string = ' '.join(words)
tokenize_list = nltk.word_tokenize(word_string)
Or is there a better option?
words = ['one','two','three four','five','six seven']
With a loop:
words_result = []
for item in words:
    for word in item.split():
        words_result.append(word)
or as a comprehension:
words = [word for item in words for word in item.split()]
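The same flattening can also be done with `itertools.chain.from_iterable` from the standard library (an alternative to the comprehension, not something the original answer mentions):

```python
from itertools import chain

words = ['one', 'two', 'three four', 'five', 'six seven']

# Split each item into words, then chain all the resulting lists together
flat = list(chain.from_iterable(item.split() for item in words))
print(flat)  # ['one', 'two', 'three', 'four', 'five', 'six', 'seven']
```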
You can join using a space separator and then split again:
In [22]:
words = ['one','two','three four','five','six seven']
' '.join(words).split()
Out[22]:
['one', 'two', 'three', 'four', 'five', 'six', 'seven']
Here's a solution using a regular expression. It stringifies the whole list and pulls out every run of letters, so note it only works when the words are purely alphabetic (words containing digits, hyphens, or apostrophes would be split or dropped):
import re
words = ['one','two','three four','five','six seven']
result = re.findall(r'[a-zA-Z]+', str(words))