
Removing letters from a list of both numbers and letters

In a function I'm trying to write, the user enters a string of digits, e.g. "648392". I turn this string into a list like this: ['6', '4', '8', '3', '9', '2'].

I wanted to be able to do sums with these numbers, so I was converting the elements of the list from strings to integers. This all worked fine. However, I also wanted the user to be able to enter letters, which I would then just remove from the list, and this is where I'm stuck. For example, a user might enter "6483A2".

I can't check to see if an element is a digit with isDigit because the elements apparently have to be integers first, and I can't convert the elements in the list to integers because some of the elements are letters... I'm sure there's a simple solution, but I am pretty terrible at Python, so any help would be much appreciated!

In Python 2, you can use str.translate to filter out letters:

>>> from string import letters
>>> strs = "6483A2"
>>> strs.translate(None, letters)
'64832'
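Note that `string.letters` and the `translate(None, ...)` call form exist only in Python 2. A minimal sketch of what I would assume is the closest Python 3 equivalent, using the three-argument form of `str.maketrans` (the names `table` and `digits_only` are just illustrative):

```python
import string

strs = "6483A2"

# The third argument to str.maketrans lists the characters to delete.
table = str.maketrans("", "", string.ascii_letters)

digits_only = strs.translate(table)
print(digits_only)  # 64832
```

Unlike Python 2's `string.letters`, `string.ascii_letters` does not depend on the locale, so only A-Z and a-z are removed here.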

There's no need to convert the string to a list; you can iterate over the string itself.

Using str.join, str.isdigit and a list comprehension:

>>> ''.join([c for c in strs if c.isdigit()])
'64832'

Or, since you want the sum of the digits:

sum(int(c) for c in strs if c.isdigit())

Timing comparisons:

Tiny string:

>>> strs = "6483A2"
>>> %timeit sum(int(c) for c in strs.translate(None, letters))
100000 loops, best of 3: 9.19 us per loop
>>> %timeit sum(int(c) for c in strs if c.isdigit())
100000 loops, best of 3: 10.1 us per loop

Large string:

>>> strs = "6483A2"*1000
>>> %timeit sum(int(c) for c in strs.translate(None, letters))
100 loops, best of 3: 5.47 ms per loop
>>> %timeit sum(int(c) for c in strs if c.isdigit())
100 loops, best of 3: 8.54 ms per loop

Worst case, all letters:

>>> strs = "A"*100
>>> %timeit sum(int(c) for c in strs.translate(None, letters))
100000 loops, best of 3: 2.53 us per loop
>>> %timeit sum(int(c) for c in strs if c.isdigit())
10000 loops, best of 3: 24.8 us per loop
>>> strs = "A"*1000
>>> %timeit sum(int(c) for c in strs.translate(None, letters))
100000 loops, best of 3: 7.34 us per loop
>>> %timeit sum(int(c) for c in strs if c.isdigit())
1000 loops, best of 3: 210 us per loop
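The `%timeit` lines above are IPython magics. Outside IPython, a rough sketch of the same comparison with the standard `timeit` module might look like the following. It is written for Python 3, so `str.maketrans` stands in for `string.letters`, and the function names are my own. Timings vary by machine and Python version, so no numbers are shown.

```python
import string
import timeit

strs = "6483A2" * 1000
table = str.maketrans("", "", string.ascii_letters)

def with_translate():
    # Strip the letters first, then convert every remaining character.
    return sum(int(c) for c in strs.translate(table))

def with_isdigit():
    # Test each character individually while iterating.
    return sum(int(c) for c in strs if c.isdigit())

t_translate = timeit.timeit(with_translate, number=100)
t_isdigit = timeit.timeit(with_isdigit, number=100)
print(f"translate: {t_translate:.4f} s, isdigit: {t_isdigit:.4f} s")
```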

You can filter things out of any iterable (including a string) with the filter function, or a comprehension. For example, either of these:

digits = filter(str.isdigit, input_string)
digits = (character for character in input_string if character.isdigit())

… will give you an iterable full of digits. If you want to convert each one to a number, either of these will do it:

numbers = map(int, filter(str.isdigit, input_string))
numbers = (int(character) for character in input_string if character.isdigit())

So, to get the sum of all the digits, skipping the letters, just pass either of those to the sum function:

total = sum(map(int, filter(str.isdigit, input_string)))
total = sum(int(character) for character in input_string if character.isdigit())

From your last paragraph:

I can't check to see if an element is a digit with isDigit because the elements apparently have to be integers first, and I can't convert the elements in the list to integers

First, it's isdigit, not isDigit. Second, isdigit is a method on strings, not integers, so you're wrong in thinking that you can't call it on the strings. In fact, you must call it on the strings before converting them to integers.
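A quick check of that point, as a small sketch (the variable names are only for illustration):

```python
elements = list("6483A2")

# str.isdigit works directly on each one-character string.
flags = [e.isdigit() for e in elements]
print(flags)  # [True, True, True, True, False, True]

# Integers have no isdigit method at all.
print(hasattr(6, "isdigit"))  # False

# So: test the string first, convert to int second.
numbers = [int(e) for e in elements if e.isdigit()]
print(numbers)  # [6, 4, 8, 3, 2]
```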

But this does bring up another option. In Python, it's often Easier to Ask Forgiveness than Permission. Instead of figuring out whether we can convert each character to an int and then doing it, we can just try to convert it and deal with the possible failure. For example:

def get_numbers(input_string):
    for character in input_string:
        try:
            yield int(character)
        except ValueError:
            # int() raises ValueError for a non-numeric string such as 'A'
            pass

Now, it's just:

total = sum(get_numbers(input_string))
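When using this pattern, it helps to know which exception `int` raises for which kind of bad input. A small sketch to check this (`describe_failure` is a hypothetical helper, not part of the answer above):

```python
def describe_failure(value):
    """Return the name of the exception int() raises, or None on success."""
    try:
        int(value)
    except (ValueError, TypeError) as exc:
        return type(exc).__name__
    return None

print(describe_failure("A"))   # ValueError: a string that is not a number
print(describe_failure(None))  # TypeError: not a string or number at all
print(describe_failure("7"))   # None: the conversion succeeded
```

Since every element here is a one-character string, `ValueError` is the exception that matters for skipping letters.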

You can do this with a comprehension:

>>> s = "6483A2"
>>> [int(c) for c in s if c.isdigit()]
[6, 4, 8, 3, 2]
>>> sum(int(c) for c in s if c.isdigit())
23

This approach is good if you want to go straight from the mixed string to a list of only the integers, which is presumably your goal.

You can use a generator expression and pass it to sum.

>>> import string
>>> s
'6483A2'
>>> sum(int(x) for x in list(s) if x in string.digits)
23

If you don't want to import another module, use isdigit:

sum(int(x) for x in list(s) if x.isdigit())

>>> a = "hello123987io"
>>> b = "khj7djksh787"
>>> sum([int(letter) for letter in b if letter.isdigit()])
29
>>> sum([int(letter) for letter in a if letter.isdigit()])
30

>>> def getInputAndSum(userInput):
...     """ returns a tuple with the input string and its sum """
...     return userInput, sum([int(letter) for letter in userInput if letter.isdigit()])
... 
>>> getInputAndSum("Th1s1550mesum")
('Th1s1550mesum', 12)

Just to contribute a little: if you want the aggregated sum, you can do it all like this:

>>> x = "6483A2"
>>> sum(map(int, filter(str.isdigit, x)))
23

If you need the list of integers for other purposes, or for some other kind of sum, just stop at map (in Python 3, wrap it in list() to see the values, since map returns an iterator):

>>> list(map(int, filter(str.isdigit, x)))
[6, 4, 8, 3, 2]

Note on the string.letters approach: letters is locale-dependent, so this:

import locale, string
locale.setlocale(locale.LC_ALL, 'es_ES') # or 'esp_esp' if you're on Windows
string.letters

gives more than the ASCII letters:

"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzŠŚŽšśžźŞµşŔÁÂĂÄĹĆÇČÉĘËĚÍÎĎĐŃŇÓÔŐÖŘŮÚŰÜÝŢßŕáâăäĺćçčéęëěíîďđńňóôőöřůúűüýţ˙"

Although I would recommend regex for this case as suggested above :)

Nice to collaborate :D
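For the regex suggestion mentioned above, a minimal sketch of how it might look with the standard `re` module, assuming only the ASCII digits 0-9 matter (the variable names are illustrative):

```python
import re

x = "6483A2"

# findall returns every single-character match of the pattern, in order.
digits = re.findall(r"[0-9]", x)
print(digits)  # ['6', '4', '8', '3', '2']

total = sum(int(d) for d in digits)
print(total)  # 23
```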

While it's certainly true that you can filter out letters in various ways, it's probably more Pythonic to let the interpreter decide what can be interpreted as a digit and what cannot. So even though it's not a one-liner, you may prefer this approach:

aninput = "648392A0&sle4"
def discard_non_ints(itbl, rdx=10):
  for d in itbl:
    try:
      yield(int(d, rdx))
    except ValueError:
      pass

sum(discard_non_ints(aninput))
36

What's particularly nice about this approach is that it gives you the flexibility to include non-decimal digits. Want to sum all the hexadecimal digits?

sum(discard_non_ints('deadbeforenoon', 16))
104
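One concrete case where the two styles disagree: `str.isdigit` accepts some characters that `int` rejects, such as the superscript two. A small sketch of the difference (the sample string is made up, and `ints_only` is just a renamed copy of the try/except generator idea):

```python
sample = "12²"

# The superscript two counts as a digit for str.isdigit ...
print("²".isdigit())  # True

# ... but int() cannot convert it, so the isdigit filter blows up.
try:
    sum(int(c) for c in sample if c.isdigit())
except ValueError:
    filter_failed = True
else:
    filter_failed = False
print(filter_failed)  # True

def ints_only(chars, radix=10):
    # Let int() decide: skip anything it refuses to convert.
    for ch in chars:
        try:
            yield int(ch, radix)
        except ValueError:
            pass

print(sum(ints_only(sample)))  # 3
```

For plain ASCII input such as "6483A2" both approaches give the same result, so this only matters if the input may contain unusual Unicode characters.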

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint this content, please cite the site URL or the original address.
