python string to list - list comprehension

Question

The input is a string and the output should be a list in which each element is one word from the string. A word is defined as a sequence of letters and/or digits. For example, Ilove is a word and 45tgfd is a word, but 54fss. is not a word because it contains a period.

Let us assume that commas come only after a word.

For example - 'Donald John Trump, born June 14, 1946, is the 45th' should become ['Donald', 'John', 'Trump', 'born', 'June', '14', '1946', 'is', 'the', '45th']

I tried [x.rstrip(',') for x in line.split() if x.rstrip(',').isalpha() or x.rstrip(',').isdigit()], where line is the original string. However, it became messy and gave the wrong result: it fails to pick up '45th', because '45th' satisfies neither isdigit nor isalpha.

Any ideas?

Answer 1

You are looking for str.isalnum:

>>> [x for x in (s.rstrip(',') for s in line.split()) if x.isalnum()]
['Donald', 'John', 'Trump', 'born', 'June', '14', '1946', 'is', 'the', '45th']
>>>

Note, too, that by using a generator expression inside the comprehension I avoid calling rstrip redundantly; this also lets me make only a single pass over line.split().
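For reuse, the same idea can be wrapped in a small function. This is only a sketch: the name extract_words is made up for illustration, and it keeps the question's assumption that a comma is the only punctuation that may follow a valid word.

```python
def extract_words(line):
    """Return the whitespace-separated tokens of `line` that are purely
    alphanumeric once trailing commas are removed.

    `extract_words` is a hypothetical helper name, not part of any library.
    """
    # Strip trailing commas once per token (lazily), then filter.
    stripped = (token.rstrip(',') for token in line.split())
    return [token for token in stripped if token.isalnum()]


line = 'Donald John Trump, born June 14, 1946, is the 45th'
print(extract_words(line))
# ['Donald', 'John', 'Trump', 'born', 'June', '14', '1946', 'is', 'the', '45th']

# '54fss.' still ends in a period after stripping commas, so it is dropped.
print(extract_words('Ilove 45tgfd 54fss. ok,'))
# ['Ilove', '45tgfd', 'ok']
```

Keep in mind that str.isalnum also accepts non-ASCII letters and digits, which may or may not match the intended definition of a word.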

Answer 2

>>> import re

>>> s = 'Donald John Trump, born June 14, 1946, is the 45th'
>>> [i.strip(',') for i in re.split(r'\s+',s) if not re.search(r'^[\.]|\w+\.\w+|[\.]$',i)]
['Donald', 'John', 'Trump', 'born', 'June', '14', '1946', 'is', 'the', '45th']

>>> s2 = 'tes.t .test test. another word'
>>> [i.strip(',') for i in re.split(r'\s+',s2) if not re.search(r'^[\.]|\w+\.\w+|[\.]$',i)]
['another', 'word']
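Note that the pattern above only rejects tokens containing a period, so a token such as 'hello!' would still pass. If the stricter definition from the question is wanted (letters and digits only), one possible variant is to match each whole token against a character class. This is a sketch, not the answerer's method; the names WORD and extract_words_re are made up, and restricting the class to ASCII is an assumption.

```python
import re

# Assumption: a word consists of ASCII letters and digits only.
WORD = re.compile(r'[A-Za-z0-9]+')


def extract_words_re(text):
    """Hypothetical helper: keep tokens that are entirely letters/digits
    after trailing commas have been stripped."""
    tokens = (token.rstrip(',') for token in text.split())
    # fullmatch succeeds only if the entire token fits the pattern.
    return [token for token in tokens if WORD.fullmatch(token)]


print(extract_words_re('tes.t .test test. another word'))
# ['another', 'word']

print(extract_words_re('hello! world, 45th'))
# ['world', '45th']
```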
