Regex pattern within text

Question

I have a long string of data which looks like:

dstgfsda12345.123gsrsvrvsdfcsd23456.234tsrsd

Notice that '12345.123' and '23456.234' follow the same pattern: five digits, a dot, three digits. I want to split the string on that pattern using Python (something like s.split(<regex>)).

What would be the appropriate regex?

'[0-9]{5}.[0-9]{3}'

does not work; I presume it expects whitespace around it(?).

Answer 1

Just escape the . and you are done:

\d{5}\.\d{3}

You can use the regex token \d as a shorthand for [0-9].

Example:

>>> import re
>>> re.split(r'\d{5}\.\d{3}', 'dstgfsda12345.123gsrsvrvsdfcsd23456.234tsrsd')
['dstgfsda', 'gsrsvrvsdfcsd', 'tsrsd']
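
One thing worth checking, since the question mentions s.split(<regex>): Python's str.split treats its argument as literal text, not as a regular expression, so the pattern has to go through re.split. The following is a minimal sketch of the difference; the variable names are only illustrative.

```python
import re

s = 'dstgfsda12345.123gsrsvrvsdfcsd23456.234tsrsd'
pattern = r'\d{5}\.\d{3}'

# str.split searches for the pattern text literally; it never occurs in s,
# so the whole string comes back as a single element.
literal_parts = s.split(pattern)

# re.split interprets the same text as a regular expression.
regex_parts = re.split(pattern, s)

print(literal_parts)
print(regex_parts)
```

If str.split was what the question actually ran, that alone would explain why the original pattern appeared not to work.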

Answer 2

I don't understand exactly what you need, but it seems you want your regex to isolate each occurrence of 5 digits, a dot, and 3 digits.

So instead of '[0-9]{5}.[0-9]{3}' you must use r'[0-9]{5}\.[0-9]{3}', because . matches any character (except a newline by default), while \. matches only a literal dot.
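
To make the difference concrete, here is a small sketch using a made-up input (not from the question) in which an 'x' sits where the dot would normally be. The unescaped pattern still splits on it, while the escaped one does not.

```python
import re

# Hypothetical input: 'x' instead of '.' between the digit groups.
sample = 'abc12345x123def'

# Unescaped dot: matches any character, so '12345x123' is treated as a separator.
unescaped = re.split(r'[0-9]{5}.[0-9]{3}', sample)

# Escaped dot: requires a literal '.', so nothing matches and nothing is split.
escaped = re.split(r'[0-9]{5}\.[0-9]{3}', sample)

print(unescaped)
print(escaped)
```

On the string from the question both patterns happen to give the same result, because the separator there really does contain a dot; the escaped form is simply the stricter and safer one.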

Answer 3

Your regex should be r'\d{5}\.\d{3}'.

Note the use of \. instead of . here. In the default mode, . matches any character except a newline (see the documentation for the re module), whereas \. matches a literal dot in your string.

For example:

import re
my_string = 'dstgfsda12345.123gsrsvrvsdfcsd23456.234tsrsd'
my_regex = r'\d{5}\.\d{3}'  # raw string, so the backslashes reach the regex engine unchanged
re.split(my_regex, my_string)
# returns: ['dstgfsda', 'gsrsvrvsdfcsd', 'tsrsd']

Explanation of how r'\d{5}\.\d{3}' works:

\d matches any single digit (0-9 for ASCII input). \d{5} matches exactly 5 consecutive digits. \. matches a single literal dot following those digits. Finally, \d{3} matches exactly 3 digits after the dot.
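
The same breakdown can be written into the pattern itself using re.VERBOSE, which ignores whitespace in the pattern and allows comments. This is only a sketch; the names number and text are illustrative, and findall is added just to show what the pattern matches.

```python
import re

# Same pattern as above, spelled out piece by piece.
number = re.compile(r"""
    \d{5}   # exactly five digits
    \.      # a literal dot
    \d{3}   # exactly three digits
""", re.VERBOSE)

text = 'dstgfsda12345.123gsrsvrvsdfcsd23456.234tsrsd'

parts = number.split(text)       # the text between the matches
matches = number.findall(text)   # the matches themselves

print(parts)
print(matches)
```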

3 answers

solution1
4 ACCEPTED 2016-10-02 13:10:12

solution2
1 2016-10-02 13:12:17

solution3
1 2016-10-02 13:15:02
