Correct Python regular expression to create a double dict

Question

I have a list of files whose names look like either name_x01_y01_000.h5 or name_y01_x01_000.h5.

What is the correct regular expression (or other method) to create a list of entries, each holding file, x_ind and y_ind?

So far I have this code:

import glob
import re

import numpy as np

name = 'S3_FullBrain_Mosaic_'
type = '.h5'

wildc = name + '*' + type
files = glob.glob(wildc)
files = np.asarray(files)

wildre = 'r\"' +name+'x(?P<x_ind>\d+)_y(?P<y_ind>\d+).+\"'
m = re.match(wildre,files)

Answer 1

Since the glob already guarantees the correct filename prefix and extension, the regex only needs to match the indices. re.search matches anywhere in the string, so the pattern does not have to cover the whole name. Calling .groupdict() on the match returns a dictionary keyed by the named groups. The file key can then be added manually.

>>> import re
>>> file = 'S3_FullBrain_Mosaic_x02_y05_abcd.h5'
>>> result = re.search(r'x(?P<x_ind>\d+)_y(?P<y_ind>\d+)', file).groupdict()
>>> result
{'y_ind': '05', 'x_ind': '02'}
>>> result['file'] = file
>>> result
{'y_ind': '05', 'file': 'S3_FullBrain_Mosaic_x02_y05_abcd.h5', 'x_ind': '02'}

You can iterate over the files to produce the list of dicts. There's no need to create a NumPy array for this: the re functions work on one string at a time, and I doubt you're going to do any heavy numerical calculations on the list of files.
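A minimal sketch of that loop, assuming every name uses the x-then-y order. The hard-coded files list is only a stand-in for the result of glob.glob, and the names pattern and records are made up for illustration:

```python
import re

# Hypothetical file names standing in for the output of glob.glob(...)
files = [
    'S3_FullBrain_Mosaic_x01_y01_000.h5',
    'S3_FullBrain_Mosaic_x02_y05_000.h5',
]

# Same pattern as in the session above: x index first, then y index
pattern = re.compile(r'x(?P<x_ind>\d+)_y(?P<y_ind>\d+)')

records = []
for fname in files:
    match = pattern.search(fname)
    if match is None:
        continue  # skip names that do not follow the x.._y.. layout
    record = match.groupdict()  # indices are kept as strings, e.g. '02'
    record['file'] = fname
    records.append(record)
```

The indices stay as strings here; wrap them in int() if you need numbers.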

To handle both possible formats you will need to call re.search with two regexes, one per ordering. For any given file, one call will return None and the other will return a match on which you can use groupdict.
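One way this could look, as a sketch rather than a definitive implementation. The names XY_PATTERN, YX_PATTERN and parse_indices are hypothetical, chosen only for this example:

```python
import re

# One pattern per ordering; both define the same named groups
XY_PATTERN = re.compile(r'x(?P<x_ind>\d+)_y(?P<y_ind>\d+)')
YX_PATTERN = re.compile(r'y(?P<y_ind>\d+)_x(?P<x_ind>\d+)')


def parse_indices(fname):
    """Return a dict with file, x_ind and y_ind, or None if neither layout matches."""
    # A failed search returns None, so `or` falls through to the second pattern
    match = XY_PATTERN.search(fname) or YX_PATTERN.search(fname)
    if match is None:
        return None
    record = match.groupdict()
    record['file'] = fname
    return record


records = [parse_indices(f) for f in ['name_x01_y01_000.h5', 'name_y01_x02_000.h5']]
```

Because both patterns use the same group names, the resulting dicts have the same keys whichever ordering the file name uses.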

Answer 2

You could use re.findall, which returns every (axis, index) pair regardless of the order in which x and y appear:

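import re

names = ['name_x01_y01_000.h5', 'name_y01_x01_000.h5']
results = []
for name in names:
    # Each match is an (axis, digits) tuple, e.g. ('x', '01')
    matches = re.findall(r'_([xy])(\d+)(?=_)', name)
    d = {k: int(v) for k, v in matches}  # e.g. {'x': 1, 'y': 1}
    d['name'] = name
    results.append(d)  # collect one dict per file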

2 answers

solution1
1 2016-04-29 17:17:28

solution2
1 2016-04-29 17:50:39