Extracting number from unicode string with regex

Question

I have the following dictionary which contains some product data:

dictionary = {'price': [u'3\xa0590 EUR'],
              'name': [u'Product name with unicode chars']}

All values are unicode strings. As you can see, I'm using lists as dictionary values because I sometimes need to concatenate information from several different sources.

I'm looking for a way to extract the digits from the price value with a regex, dropping the non-breaking space (\xa0) and the currency at the end (EUR).

In this case I would like to see the following as a result:

3590

Can you please suggest a solution?

[SOLUTION]

Adding the solution here because the comments field wrapped my code unexpectedly:

I used the .sub() method from Python's re module, which replaces every match of a pattern with a given string. Here is the final code that gives me the expected result:

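import re
# Match the non-breaking space or the trailing " EUR"; the u prefix keeps the pattern unicode.
p = re.compile(u'(\xa0| EUR)')
result = p.sub(u'', dictionary['price'][0])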
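
Since the price values are stored in lists, the same substitution can be applied to every entry at once. The following is only a sketch under that assumption: prices_as_int and PRICE_NOISE are hypothetical names, not part of the original code, and the conversion to int assumes each price is a whole number. The u prefix is also accepted by Python 3.3 and later, so the snippet should run on either major version.

```python
import re

# Hypothetical pattern name: matches a non-breaking space or the trailing " EUR".
PRICE_NOISE = re.compile(u'\xa0| EUR')

def prices_as_int(values):
    """Strip the separator and currency from each price string and return ints."""
    return [int(PRICE_NOISE.sub(u'', value)) for value in values]

dictionary = {'price': [u'3\xa0590 EUR']}
print(prices_as_int(dictionary['price']))  # [3590]
```

A price with a decimal part or a different currency code would make int() raise ValueError here, which at least makes unexpected input visible instead of silently producing a wrong number.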

Answer 1

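Not sure about Python, but here's a regex (JavaScript syntax) that strips every non-digit character:

var p = /\D/g;
var digits = s.replace(p, '');  // s is the price string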
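
For reference, the same \D idea could be written in Python roughly as follows. This is a sketch rather than the answerer's code; the variable names price and digits are chosen here for illustration.

```python
import re

price = u'3\xa0590 EUR'
# \D matches any character that is not a digit, so the non-breaking
# space, the regular space, and the currency code are all removed.
digits = re.sub(r'\D', u'', price)
print(digits)  # 3590
```

This needs no knowledge of the separator or the currency code, but it also drops decimal separators, so a value such as u'3\xa0590,50 EUR' would come out as 359050.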

solution1
2 ACCEPTED 2014-01-01 22:28:18
