python, deleting characters from a line, but leaving numbers and special symbols

I need to delete all characters from a string except numbers and special symbols. For example, "asdasd 289(222):310" should result in "289(222):310". How do I do this?

You could delete the letters:

>>> import re
>>> s = "asdasd 289(222):310"
>>> m = re.sub(r'[A-Za-z]+', r'', s)
>>> m
' 289(222):310'

If you also want to delete spaces, try the code below:

>>> m = re.sub(r'[A-Za-z ]+', r'', s)
>>> m
'289(222):310'

You could check each character to see if it is an alphabetic character.

>>> s = "asdasd 289(222):310"
>>> "".join(i for i in s if not i.isalpha())
' 289(222):310'

If you'd like to remove the leading and trailing whitespace, tack on a .strip():

>>> "".join(i for i in s if not i.isalpha()).strip()
'289(222):310'
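
Note that .strip() only trims the ends, so spaces between the remaining characters survive. Here is a minimal sketch of a variant that can also drop interior whitespace. The helper name drop_letters and its drop_whitespace flag are made up for illustration and are not part of the original answer:

```python
def drop_letters(text, drop_whitespace=False):
    """Remove alphabetic characters; optionally remove all whitespace too."""
    kept = (
        ch for ch in text
        if not ch.isalpha() and not (drop_whitespace and ch.isspace())
    )
    return "".join(kept)


s = "abc 12 def 34"
print(repr(drop_letters(s).strip()))                # '12  34' (interior spaces remain)
print(repr(drop_letters(s, drop_whitespace=True)))  # '1234'
```

For the string in the question both calls give the same result, because its only space is the leading one left behind by the removed letters.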

The string class has methods isalpha() and isdigit() which are useful for things like this.

>>> '2'.isdigit()
True
>>> '2'.isalpha()
False
>>> 'a'.isdigit()
False
>>> 'a'.isalpha()
True
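
Instead of removing letters, these methods can also be used the other way round, to keep only what is wanted. The following is a rough sketch that assumes "special symbols" means ASCII punctuation as listed in string.punctuation. The function name keep_digits_and_symbols is hypothetical:

```python
import string


def keep_digits_and_symbols(text):
    """Keep digits and ASCII punctuation; drop letters, spaces and the rest."""
    # Assumption: "special symbols" == string.punctuation (ASCII only).
    return "".join(
        ch for ch in text
        if ch.isdigit() or ch in string.punctuation
    )


print(keep_digits_and_symbols("asdasd 289(222):310"))  # 289(222):310
```

Keep in mind that isdigit() and isalpha() are Unicode-aware in Python 3, so they also return True for digits and letters outside the ASCII range.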

If, for whatever reason, speed is of the essence, the following code might help:

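from string import maketrans  # Python 2 only; Python 3 has str.maketrans instead
# Map every ASCII letter to a space, then remove all spaces.
trans1 = maketrans("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ", " "*52)
s = "asdasd 289(222):310"
m = s.translate(trans1).replace(" ", "")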
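
The snippet above targets Python 2. On Python 3, string.maketrans no longer exists, but the same idea can probably be expressed with str.maketrans, whose third argument lists characters to delete outright. This is a sketch of the equivalent, not the code that was timed in this answer:

```python
import string

# Three-argument form: every character in the third string is deleted.
delete_table = str.maketrans("", "", string.ascii_letters + " ")

s = "asdasd 289(222):310"
m = s.translate(delete_table)
print(m)  # 289(222):310
```

Deleting directly through the table avoids the intermediate step of turning letters into spaces and then calling .replace().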

Timings with IPython's %timeit show 1.2 usec for this approach, 3.3 usec for the regex posted by Avinash Raj, and 8 usec for Cyber's method (64-bit Python 2.7.8 on 64-bit Windows 8.1).

Using .strip() instead of .replace() is faster (~900 ns) but won't remove the spaces in between.

Of course, timings do depend on the kind of data that will be processed.
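
To check this on your own data without IPython, something like the following sketch with the standard timeit module should work. It is written for Python 3, so the numbers will not match the ones quoted above, and the labels in the candidates dict are only illustrative:

```python
import re
import string
import timeit

s = "asdasd 289(222):310"  # replace with a sample of your real data
letters_and_space = re.compile(r"[A-Za-z ]+")
delete_table = str.maketrans("", "", string.ascii_letters + " ")

candidates = {
    "translate": lambda: s.translate(delete_table),
    "regex": lambda: letters_and_space.sub("", s),
    "join": lambda: "".join(c for c in s if not c.isalpha()).strip(),
}

# Sanity check first: collect each approach's output for comparison.
results = {name: fn() for name, fn in candidates.items()}

runs = 20000
for name, fn in candidates.items():
    seconds = timeit.timeit(fn, number=runs)
    print(f"{name:10s} {seconds / runs * 1e6:.2f} usec per call")
```

Comparing the entries in results before looking at the timings makes sure the approaches being raced actually produce the same output for your input.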
