Using regular expressions to match a portion of a string? (Python)

Question

What regular expression can I use to match each gene in the gene list string below?

GENE_LIST: F59A7.7; T25D3.3; F13B12.4; cysl-1; cysl-2; cysl-3; cysl-4; F01D4.8

I tried: GENE_List:((( \w+).(\w+)); )+* but it only captures the last gene.

Answer 1

Given:

>>> s="GENE_LIST: F59A7.7; T25D3.3; F13B12.4; cysl-1; cysl-2; cysl-3; cysl-4; F01D4.8"

You can use Python string methods:

>>> s.split(': ')[1].split('; ')
['F59A7.7', 'T25D3.3', 'F13B12.4', 'cysl-1', 'cysl-2', 'cysl-3', 'cysl-4', 'F01D4.8']
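
If the spacing around the separators might vary, a more tolerant variant of the split approach could look like the sketch below. The helper name parse_gene_list is hypothetical, and the sketch assumes the label ends at the first colon.

```python
def parse_gene_list(line):
    """Split 'LABEL: a; b; c' into ['a', 'b', 'c'] (hypothetical helper)."""
    # Assumption: everything after the first colon is the gene list.
    _, _, genes = line.partition(":")
    # Strip whitespace around each item and drop empty items.
    return [g.strip() for g in genes.split(";") if g.strip()]


# Deliberately irregular spacing to show the tolerance.
messy = "GENE_LIST: F59A7.7 ; T25D3.3;cysl-1 ;  F01D4.8"
print(parse_gene_list(messy))
```

Unlike s.split('; '), this does not depend on exactly one space following each semicolon.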

For a regex:

(?<=[:;]\s)([^\s;]+)


Or, in Python:

>>> import re
>>> re.findall(r'(?<=[:;]\s)([^\s;]+)', s)
['F59A7.7', 'T25D3.3', 'F13B12.4', 'cysl-1', 'cysl-2', 'cysl-3', 'cysl-4', 'F01D4.8']
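
As for why a repeated group returns only the last gene: in Python's re module, a capturing group inside a repetition is overwritten on every iteration, so only the final iteration survives. re.findall avoids this by returning every non-overlapping match. A minimal sketch, where the first pattern is a simplified stand-in (an assumption, not the exact regex from the question):

```python
import re

s = "GENE_LIST: F59A7.7; T25D3.3; cysl-1; F01D4.8"

# Simplified stand-in for a "repeat the group" attempt: the capturing
# group sits inside a repeated non-capturing group.
repeated = re.match(r"GENE_LIST:(?: ([^\s;]+);?)+", s)
last_only = repeated.group(1)  # only the final iteration is kept

# findall yields one result per match, so every gene is returned.
all_genes = re.findall(r"(?<=[:;]\s)([^\s;]+)", s)

print(last_only)
print(all_genes)
```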

Answer 2

You can use the following:

\s([^;\s]+)


The captured group, ([^;\s]+), contains each desired substring. Every match must be preceded by a whitespace character (\s), which is why the GENE_LIST: label is skipped.

>>> import re
>>> s = 'GENE_LIST: F59A7.7; T25D3.3; F13B12.4; cysl-1; cysl-2; cysl-3; cysl-4; F01D4.8'
>>> re.findall(r'\s([^;\s]+)', s)
['F59A7.7', 'T25D3.3', 'F13B12.4', 'cysl-1', 'cysl-2', 'cysl-3', 'cysl-4', 'F01D4.8']

Answer 3

UPDATE

It's in fact much simpler:

[^\s;]+

However, first take a substring containing only the part you need (the genes, without the GENE_LIST: label); otherwise the label itself is matched as the first item.
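
A minimal sketch of that two-step approach in Python, assuming the label is everything up to the first colon (the variable names are illustrative):

```python
import re

s = "GENE_LIST: F59A7.7; T25D3.3; F13B12.4; cysl-1; cysl-2; cysl-3; cysl-4; F01D4.8"

# Step 1: keep only the text after the first colon
# (assumption: the label itself contains no colon).
gene_part = s[s.index(":") + 1:]

# Step 2: every run of characters that is neither whitespace nor ';' is a gene.
genes = re.findall(r"[^\s;]+", gene_part)
print(genes)

# Without step 1, the label is picked up as the first "gene".
unfiltered = re.findall(r"[^\s;]+", s)
print(unfiltered[0])
```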


Answer 4

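import re

string = "GENE_LIST: F59A7.7; T25D3.3; F13B12.4; cysl-1; cysl-2; cysl-3; cysl-4; F01D4.8"
# Each gene must be immediately followed by ';' or the end of the string,
# which excludes the "GENE_LIST:" label (it is followed by a space).
re.findall(r"([^;\s]+)(?:;|$)", string)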

The output is:

['F59A7.7',
'T25D3.3',
'F13B12.4',
'cysl-1',
'cysl-2',
'cysl-3',
'cysl-4',
'F01D4.8']
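
One caveat with this pattern is that each gene must be followed immediately by ; or the end of the string, so input with whitespace before the semicolons would lose most items. A hedged variant that allows optional whitespace there is sketched below; the \s* addition and the variable names are my assumptions, not part of the answer above.

```python
import re

# Input with a space before each semicolon.
spaced = "GENE_LIST: F59A7.7 ; T25D3.3 ; F01D4.8"

# Original pattern: only the last gene is directly followed by ';' or the end.
strict = re.findall(r"([^;\s]+)(?:;|$)", spaced)

# Variant (assumption): tolerate whitespace between a gene and its terminator.
relaxed = re.findall(r"([^;\s]+)\s*(?:;|$)", spaced)

print(strict)
print(relaxed)
```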

4 answers

solution1
1 ACCEPTED 2016-08-11 18:20:34

solution2
1 2016-08-11 18:21:28

solution3
0 2016-08-11 18:17:42

solution4
0 2016-08-13 03:03:05
