Removing numbers from strings

Question

I am working with a text file and apply the following operations to each string:

    def string_operations(string):
        """
        1) lowercase
        2) remove integers from string
        3) remove symbols
        4) stemming
        """

After this, I am still left with strings like:

  durham 28x23

I see the flaw in my approach, but I would like to know whether there is a good, fast way to identify a term that has a numeric value attached to it.

So in the above example, I want the output to be

  durham

Another example:

  21st amendment

Should give:

  amendment

How should I handle cases like these?

Answer 1

If your requirement is "remove any terms that start with a digit", you could do something like this:

    def removeNumerics(s):
        # Keep only the whitespace-separated terms that do not start with a digit.
        return ' '.join(term for term in s.split() if not term[0].isdigit())

This splits the string on whitespace, drops every term whose first character is a digit, and joins the remaining terms with a single space.

And it works like this:

    >>> removeNumerics('21st amendment')
    'amendment'
    >>> removeNumerics('durham 28x23')
    'durham'
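
If the requirement turns out to be broader, for example dropping any term that contains a digit anywhere (so something like 'x28' would also go), a minimal sketch could look like the one below. The name `remove_terms_with_digits` is made up for illustration, and the "contains any digit" rule is an assumption about what the question wants, not something it states.

```python
def remove_terms_with_digits(s):
    # Hypothetical variant: drop every whitespace-separated term that
    # contains at least one digit, not only terms that start with one.
    return ' '.join(
        term for term in s.split()
        if not any(ch.isdigit() for ch in term)
    )


print(remove_terms_with_digits('durham 28x23'))     # durham
print(remove_terms_with_digits('21st amendment'))   # amendment
print(remove_terms_with_digits('route66 x28 map'))  # map
```

The only difference from the version above is the test applied to each term: `any(...)` scans every character instead of checking just `term[0]`.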

If this isn't what you're looking for, please add some explicit examples to your question, showing both the initial string and the desired result.

solution1
5 ACCEPTED 2012-05-04 19:19:58
