
Remove the last special characters of strings in a list

I have an array with words, some ending with special characters. I would like all the special characters at the end of the words to be deleted. Is there an elegant way to do it?

aArray=["palabra...","algo,.", "si ...", "onomatopeña", "asi;","www.google.com"]

Expected output:

aArray=["palabra","algo", "si", "onomatopeña", "asi","www.google.com"]

I was trying this:

import re

rxx = re.compile(r'(.*)([.,]{2,})')  # Extend [.,] as needed; {2,} means >= 2
aArray = ["encontarla....", "esta,.", "sr.", "texto", 'www.google.com', 'encontrarla.']
aArray = [rxx.sub(lambda m: m.group(1), word) for word in aArray]

I don't think I fully understood how this works. For example, the string www.google.com is a URL, so its dots should not be eliminated.

You can use a regular expression to do that. Your question does not define 'special characters' precisely, but here is sample code that produces the output you posted:

import re

aArray=["palabra...","algo,.", "si ...", "onomatopeña", "asi;", "www.google.com"]

for i in range(len(aArray)):
    aArray[i] = re.sub(r'[.,;]+$', '', aArray[i]).strip()

Output:

['palabra', 'algo', 'si', 'onomatopeña', 'asi', 'www.google.com']

If by 'special character' you mean any non-word character (anything other than a letter, a digit, or an underscore), then you can use this:

import re

aArray=["palabra...","algo,.", "si ...", "onomatopeña", "asi;", "www.google.com"]

for i in range(len(aArray)):
    aArray[i] = re.sub(r'[^\w]+$', '', aArray[i]).strip()

Output:

['palabra', 'algo', 'si', 'onomatopeña', 'asi', 'www.google.com']

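Also note the strip() call: it removes the whitespace that is left at the end once the trailing punctuation is gone, as in "si ...". It also removes leading whitespace; use rstrip() if only the end should be touched.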
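To make the role of strip() concrete, here is a small sketch comparing the punctuation-only pattern with a variant that adds \s to the character class. The variant is an assumption about what you may want, not part of the code above; it removes trailing whitespace in the same pass, so no separate strip() is needed.

```python
import re

word = "si ..."

# Punctuation-only class: the space before the dots survives the substitution.
without_strip = re.sub(r'[.,;]+$', '', word)

# Adding \s to the class removes trailing whitespace in the same pass.
with_whitespace = re.sub(r'[.,;\s]+$', '', word)

print(repr(without_strip))    # 'si '
print(repr(with_whitespace))  # 'si'
```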

UPDATE

The $ at the end of the regular expression anchors the pattern to the end of the string: the match must end there, with nothing after it. The dots in www.google.com are in the middle of the string, so they are not matched and the URL is left intact.
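A quick way to see the effect of the anchor is to run the same character class with and without $. This is only an illustrative sketch; the variable names are made up for the example.

```python
import re

anchored = re.compile(r'[.,;]+$')

# Dots inside the URL are not at the end, so they are left alone.
url_result = anchored.sub('', 'www.google.com')

# A dot that really is at the end gets removed.
trailing_result = anchored.sub('', 'www.google.com.')

# Without the anchor, every run of these characters is removed.
unanchored_result = re.sub(r'[.,;]+', '', 'www.google.com')

print(url_result)         # www.google.com
print(trailing_result)    # www.google.com
print(unanchored_result)  # wwwgooglecom
```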

To strip all non-word characters only from the end of the strings:

import re

aArray = ["palabra...", "algo,.", "si ...", "onomatopeña", "asi;", "www.google.com"]

aArray = [re.sub(r'\W+$', '', s) for s in aArray]

Result:

['palabra', 'algo', 'si', 'onomatopeña', 'asi', 'www.google.com']

Explanation:

\W+ matches one or more non-word characters, and $ anchors the match to the end of the string.
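It may help to see what \W does and does not treat as a "special character". In Python 3, str patterns are Unicode-aware by default, so accented letters count as word characters, and so does the underscore. The sample strings below are made up for illustration.

```python
import re

# "caf\u00e9!" is "café!" written with an escape to keep the é a single code point.
samples = ["caf\u00e9!", "snake_case__", "si ...", "100%"]

# Default (Unicode) matching: é and _ are word characters and are kept.
unicode_result = [re.sub(r'\W+$', '', s) for s in samples]

# With re.ASCII, only [a-zA-Z0-9_] are word characters, so the trailing é is removed too.
ascii_result = [re.sub(r'\W+$', '', s, flags=re.ASCII) for s in samples]

print(unicode_result)  # ['café', 'snake_case__', 'si', '100']
print(ascii_result)    # ['caf', 'snake_case__', 'si', '100']
```

If trailing underscores should also be removed, \W alone is not enough; a class such as [\W_]+$ would be one option.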

This could be done using a list comprehension and str.rstrip, without needing to use regex:

>>> aArray=["palabra...","algo,.", "si ...", "onomatopeña", "asi;","www.google.com"]
>>> [s.rstrip('.;, ') for s in aArray]
['palabra', 'algo', 'si', 'onomatopeña', 'asi', 'www.google.com']

Note that I'm assuming '.;, ' are all the "special characters" you're referring to. rstrip treats its argument as a set of characters to remove from the end, not as a suffix.
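If the set of characters is broader than '.;, ', one possible variation is to build it from the standard library's string.punctuation and string.whitespace. This is a sketch under the assumption that "special" means ASCII punctuation and whitespace; non-ASCII punctuation such as '¿' or '…' would need to be added by hand.

```python
import string

aArray = ["palabra...", "algo,.", "si ...", "onomatopeña", "asi;", "www.google.com"]

# All ASCII punctuation plus whitespace, used as the set of characters to strip.
trailing = string.punctuation + string.whitespace

cleaned = [s.rstrip(trailing) for s in aArray]
print(cleaned)
# ['palabra', 'algo', 'si', 'onomatopeña', 'asi', 'www.google.com']
```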
