简体   繁体   中英

Regex only searching for words

I have this regex that works fine in http://regexpal.com/ :

[^-:1234567890/.,\s]*

I am trying to find in a paragraph full of ( , . # "" \\n \\s ...etc) just the words

but in my code i cannot see the result i am specting:

def words(lines):
    words_pattern = re.compile(r'[^-:1234567890/.,\s]*')
    li = []
    for m in lines:
        e = words_pattern.search(m)
        if e:
            match = e.group()
            li.append(match)
    return li

li = [u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'']

Any advice on this? maybe i am not traspassing the regex in the right way from one place to another

Thanks in advance

EDIT

To be more precise i do want: ñ á é í ó and ú

thanks

If you just want the letters you could use string.ascii_letters

>>> from string import ascii_letters
>>> import re
>>> s = 'this is 123 some text! that has someñ \n other stuff.'
>>> re.findall('[{}]+'.format(ascii_letters), s)
['this', 'is', 'some', 'text', 'that', 'has', 'some', 'other', 'stuff']

You can also get the same behavior from [A-Za-z] (which is essentially the same thing as string.ascii_letters )

>>> re.findall('[A-Za-z]+', s)
['this', 'is', 'some', 'text', 'that', 'has', 'some', 'other', 'stuff']

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM