How to extract the numbers in a txt file in Python

Question

I have a txt file like this:

ASP62-Main-N     LYS59-Main-O    100.00%
THR64-Side-OG1   VAL60-Main-O    100.00%
ALA66-Main-N     LEU61-Main-O    100.00%
LYS33-Main-N     SER30-Main-O    100.00%

I want to get the number before "-Main" or "-Side" on each line, for example 64 and 60 for the second line.

I wrote some code, but the result only shows the numbers before "-Main".

import re

f1 = open(filename1)
for line in f1.readlines():
    N = re.compile(r'(\d+)-Main|-Side')
    n = N.findall(line)
    print(n)

The result is shown below:

['62', '59']
['', '60']
['66', '61']
['33', '30']

Could someone please give me some advice?

Answer 1

Here it is as full code:

import re

with open('filename.txt', 'r') as f:
    for line in f:
        # Grab every two-digit run, then drop the last two, which come from "100.00%".
        print(' '.join(re.findall(r'\d{2}', line)[:-2]))

Output:

62 59
64 60
66 61
33 30

Answer 2

As @JosephSible has mentioned, you should group the patterns in your alternation since alternation has a low precedence, but in this case you should use a non-capturing group for -Main and -Side since you don't actually want them in your output:

N=re.compile(r'(\d+)(?:-Main|-Side)')

Alternatively, you can use a lookahead pattern so you don't need any capturing group:

N=re.compile(r'\d+(?=-Main|-Side)')
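
To see that both patterns give the same numbers, here is a minimal sketch that runs them over the sample lines from the question. The `sample` string and the variable names are assumptions made for illustration; in practice the lines would come from the file.

```python
import re

# Sample lines copied from the question; stands in for the file contents.
sample = """\
ASP62-Main-N     LYS59-Main-O    100.00%
THR64-Side-OG1   VAL60-Main-O    100.00%
ALA66-Main-N     LEU61-Main-O    100.00%
LYS33-Main-N     SER30-Main-O    100.00%
"""

# Compile each pattern once, outside the loop.
grouped = re.compile(r'(\d+)(?:-Main|-Side)')   # one capturing group
lookahead = re.compile(r'\d+(?=-Main|-Side)')   # no capturing group at all

grouped_results = [grouped.findall(line) for line in sample.splitlines()]
lookahead_results = [lookahead.findall(line) for line in sample.splitlines()]

for numbers in grouped_results:
    print(numbers)
```

With a single capturing group, findall returns just the captured digits; with no group, it returns the whole match, which here is also just the digits because the lookahead consumes nothing.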

Answer 3

It's a precedence issue. Alternation binds so loosely that your regex was being parsed as "numbers followed by -Main" or "-Side". Use this regex instead: (\d+)(-Main|-Side)
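
The precedence effect can be checked on a single line. This is a small sketch, with `line`, `ungrouped`, `pairs`, and `numbers` as illustrative names; note that a pattern with two capturing groups makes findall return tuples, so the number has to be picked out of each one.

```python
import re

line = "THR64-Side-OG1   VAL60-Main-O    100.00%"

# Ungrouped: parsed as '(\d+)-Main' OR '-Side', so a '-Side' match
# leaves the capturing group empty.
ungrouped = re.findall(r'(\d+)-Main|-Side', line)

# Grouped: two capturing groups, so findall returns (number, suffix) tuples.
pairs = re.findall(r'(\d+)(-Main|-Side)', line)

# Keep only the number from each tuple.
numbers = [number for number, _suffix in pairs]

print(ungrouped)
print(pairs)
print(numbers)
```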

3 answers

solution1
2 2018-10-08 03:14:23

solution2
2 ACCEPTED 2018-10-08 03:17:06

solution3
1 2018-10-08 03:10:16
