Removing strings that do not contain letters from a list of strings in Python

Question

I am making a text analyzer in Python. I am trying to remove any string that does not contain any letters or digits from a list of strings. I am stuck and do not know how to do so. Currently, when I count the length of my list, it includes the string '-', and I do not want that because I don't want to count it as a word. However, I'd rather not use string.remove('-') because I want it to work for other inputs.

Thanks in advance.

Answer 1

I think you mean that you want to filter out strings with no alphanumeric characters from a list of strings, so ['a','b','*'] => ['a','b'].

Not too hard:

In [39]: l = ['adsfg', 'sdfgb', 'gdc', '56hjfg1', '&#$%^', "asfgd3$#$%^"]
In [40]: l = list(filter(lambda s: any(c.isalnum() for c in s), l))  # keep strings with at least one letter or digit
In [41]: l
Out[41]: ['adsfg', 'sdfgb', 'gdc', '56hjfg1', 'asfgd3$#$%^']
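
For the word-counting case in the question, the same idea can be wrapped in a small function. This is only a sketch: the name keep_strings_with_alnum and the sample tokens are made up for illustration and are not from the original post.

```python
def keep_strings_with_alnum(strings):
    """Return only the strings that contain at least one letter or digit."""
    return [s for s in strings if any(c.isalnum() for c in s)]


# Hypothetical tokens from a text analyzer, including stray punctuation.
tokens = ["hello", "-", "world", "...", "x2", ""]
words = keep_strings_with_alnum(tokens)

print(words)       # ['hello', 'world', 'x2']
print(len(words))  # 3
```

Because the check looks at each character rather than at one specific string, '-' is dropped without having to name it, and so is any other punctuation-only token.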

Answer 2

If you want to keep every string that contains at least one alphanumeric char, even if it also contains non-alphanumeric chars:

import re

strings = ["string", "&*()£", "$^TY?", "12345", "2wE4T", "@#~\\!", "^(*4"]

# [^\W_] matches a single letter or digit (\w on its own would also match '_')
strings = [s for s in strings if re.search(r'[^\W_]', s)]

print(strings)
# ['string', '$^TY?', '12345', '2wE4T', '^(*4']  now we can work with these wanted strings
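
One caveat with \w is that it also matches the underscore, so a string such as '___' would survive a \w filter. The sketch below compares the two patterns, assuming the underscore should not count as a letter or digit; the names HAS_WORD_CHAR, HAS_ALNUM, and samples are illustrative only.

```python
import re

# Assumption: the underscore should NOT count as a letter or digit.
HAS_WORD_CHAR = re.compile(r"\w")   # letters, digits, and "_"
HAS_ALNUM = re.compile(r"[^\W_]")   # a word character that is not "_"

samples = ["___", "a_b", "--", "42"]

with_word_char = [s for s in samples if HAS_WORD_CHAR.search(s)]
with_alnum = [s for s in samples if HAS_ALNUM.search(s)]

print(with_word_char)  # ['___', 'a_b', '42']
print(with_alnum)      # ['a_b', '42']
```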

Otherwise, to keep only the strings that consist entirely of alphanumeric chars:

str.isalnum() is your man:

strings = [s for s in strings if s.isalnum()]
print(strings)
# ['string', '12345', '2wE4T']
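
To see the difference between the two filters, it can help to run both against the same untouched input. A minimal sketch, where contains_alnum and only_alnum are names chosen here for illustration:

```python
strings = ["string", "&*()£", "$^TY?", "12345", "2wE4T", "@#~\\!", "^(*4"]

# At least one letter or digit somewhere in the string.
contains_alnum = [s for s in strings if any(c.isalnum() for c in s)]

# Every character is a letter or digit (empty strings are rejected too).
only_alnum = [s for s in strings if s.isalnum()]

print(contains_alnum)  # ['string', '$^TY?', '12345', '2wE4T', '^(*4']
print(only_alnum)      # ['string', '12345', '2wE4T']
```

For a word count that should ignore '-' but still count tokens like "don't", the first filter is probably the one to use.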

More on the re module:

https://docs.python.org/2/howto/regex.html

http://www.regular-expressions.info/tutorial.html

2 answers

solution1
2 2014-10-31 01:56:28

solution2
0 2014-10-31 02:12:17
