How to extract certain length of numbers from a string in python? [duplicate]

Question

I have a dataframe which looks like this:

description     
1906 RES 330 ML
1906 RES 330ML
RES 335 c/6
RES 332 c/12

I want to extract the run of three consecutive digits from each description and save it in a new column 'volume'. My code is:

df['volume'] = df['description'].str.extract('([([\d]*[\d]){3,3}?])')

The expected result is:

volume
330
330
335
332

However, it gives this instead:

volume
1906
1906
335
332

Can anyone help me fix this code? Thanks so much!!!

Answer 1

Might be overkill, but if you want to make sure you don't capture numbers that are part of 4 digit numbers, you might use this:

df['volume'] = df.description.str.extract(r'(?<!\d)(\d{3})(?!\d)', expand=False)    
print(df)

       description volume
0  1906 RES 330 ML    330
1   1906 RES 330ML    330
2      RES 335 c/6    335
3     RES 332 c/12    332

Specify expand=False so that the matches are returned as a single pd.Series rather than a DataFrame.
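To see what expand changes, here is a small sketch that rebuilds the question's data by hand (the DataFrame construction is an assumption, since the question only shows the printed column) and compares the two return types:

```python
import pandas as pd

# Sample data copied from the question; the construction itself is assumed.
df = pd.DataFrame({"description": [
    "1906 RES 330 ML",
    "1906 RES 330ML",
    "RES 335 c/6",
    "RES 332 c/12",
]})

pattern = r'(?<!\d)(\d{3})(?!\d)'

# One capture group with expand=False gives a Series.
as_series = df["description"].str.extract(pattern, expand=False)
# The same call with expand=True gives a one-column DataFrame.
as_frame = df["description"].str.extract(pattern, expand=True)

print(type(as_series).__name__)
print(type(as_frame).__name__)

df["volume"] = as_series
print(df)
```

The extracted values are strings; if numeric volumes are needed, something like pd.to_numeric(df["volume"]) would be a reasonable next step.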

The regex:

(?<!\d) - negative lookbehind: the three digits must not be preceded by a digit
(\d{3}) - captures exactly 3 digits
(?!\d) - negative lookahead: the three digits must not be followed by a digit
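The three pieces can be tried outside pandas with the standard re module. This is only an illustrative sketch; the helper name first_three_digit_run is made up for the example:

```python
import re

# Exactly three digits that do not touch another digit on either side.
PATTERN = re.compile(r'(?<!\d)(\d{3})(?!\d)')

descriptions = [
    "1906 RES 330 ML",
    "1906 RES 330ML",
    "RES 335 c/6",
    "RES 332 c/12",
]

def first_three_digit_run(text):
    """Return the first standalone run of three digits, or None."""
    match = PATTERN.search(text)
    return match.group(1) if match else None

volumes = [first_three_digit_run(d) for d in descriptions]
print(volumes)  # ['330', '330', '335', '332']
```

The lookarounds are what stop "190" or "906" from being picked out of "1906": each of those runs has a digit directly before or after it.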

Answer 2

You need to

not match "any number of digits" three times, so delete the [\d]*
not match 3 digits inside anything that looks like a "word", especially not inside other digits, so use the word boundary \b
not add the ? after the quantifier
not overdo the character set thing []

You do not need to:

use two capture groups ()

This regex finds exactly three digits delimited by word boundaries:

\b(\d{3})\b
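One thing worth checking against the question's data: \b only marks a boundary between a word character and a non-word character, and both digits and letters count as word characters. A quick comparison with the lookaround pattern from Answer 1 (the variable names are just for this sketch):

```python
import re

boundary = re.compile(r'\b(\d{3})\b')
lookaround = re.compile(r'(?<!\d)(\d{3})(?!\d)')

def first_group(pattern, text):
    """Return group 1 of the first match, or None if nothing matches."""
    match = pattern.search(text)
    return match.group(1) if match else None

spaced = "1906 RES 330 ML"
glued = "1906 RES 330ML"

print(first_group(boundary, spaced))    # 330
print(first_group(boundary, glued))     # None: no boundary between '0' and 'M'
print(first_group(lookaround, glued))   # 330
```

So the word-boundary version works when the volume is separated from the unit by a space, but it misses the "330ML" row, where the lookaround version still matches.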

Answer 3

The regex you are looking for is \b[\d]{3}\b

For more information on \b, see the docs.

solution1
5 ACCEPTED 2017-08-28 18:22:37

solution2
2 2017-08-28 18:32:18

solution3
0 2017-08-28 20:16:18
