Extract list of words from filenames

Question

I need to extract a list of words from a set of filenames. Here are the files:

sub-Dzh_task-FmriPictures_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii
sub-Dzh_task-FmriVernike_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii
sub-Dzh_task-FmriWgWords_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii
sub-Dzh_task-RestingState_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii

I need the part that comes after task- and before the next underscore (task-<>_), so my list should look like this:

['FmriPictures','FmriVernike','FmriWgWords','RestingState']

How can I implement this in Python 3?

Answer 1

Here's a Python solution that uses a regular expression.

>>> import re
>>> test_str = 'sub-Dzh_task-FmriPictures_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii'
>>> re.search('task-(.*?)_', test_str).group(1)
'FmriPictures'

You can apply the same search to every string in your list.
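As a minimal sketch of applying that search to all four filenames from the question: the names `TASK_PATTERN` and `extract_task` are made up for illustration, and returning `None` for a name without a task label is an assumption about what you would want in that case.

```python
import re

filenames = [
    "sub-Dzh_task-FmriPictures_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
    "sub-Dzh_task-FmriVernike_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
    "sub-Dzh_task-FmriWgWords_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
    "sub-Dzh_task-RestingState_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
]

# Compile once; the non-greedy group stops at the first underscore after "task-".
TASK_PATTERN = re.compile(r"task-(.*?)_")


def extract_task(name):
    """Return the task label in `name`, or None if there is none (hypothetical helper)."""
    match = TASK_PATTERN.search(name)
    return match.group(1) if match else None


tasks = [extract_task(name) for name in filenames]
print(tasks)
```

Calling `.group(1)` directly on the result of `re.search` raises `AttributeError` when nothing matches, which is why the helper checks the match first.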

Answer 2

filenames = ["sub-Dzh_task-FmriPictures_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
"sub-Dzh_task-FmriVernike_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
"sub-Dzh_task-FmriWgWords_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
"sub-Dzh_task-RestingState_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii"]

tasks = []
for name in filenames:
    # The third '-'-separated piece is e.g. 'FmriPictures_space'; drop the '_space' tail.
    tasks.append(name.split('-')[2].replace("_space", ""))
print(tasks)

That's just one approach.
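The split above relies on the task label being the third '-'-separated piece and on the next key being "space". A slightly more position-independent sketch, assuming the names are built from underscore-separated key-value chunks (as the four examples appear to be), could parse every chunk into a dictionary and look up "task". The helper name `parse_entities` is hypothetical.

```python
def parse_entities(filename):
    """Map each 'key-value' chunk of an underscore-separated name to key -> value."""
    entities = {}
    for chunk in filename.split("_"):
        key, sep, value = chunk.partition("-")
        if sep:  # skip chunks without a '-', such as 'bold' or '.nii'
            entities[key] = value
    return entities


filenames = [
    "sub-Dzh_task-FmriPictures_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
    "sub-Dzh_task-FmriVernike_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
    "sub-Dzh_task-FmriWgWords_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
    "sub-Dzh_task-RestingState_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
]

# .get() yields None instead of raising when a name has no 'task' chunk.
tasks = [parse_entities(name).get("task") for name in filenames]
print(tasks)
```

This keeps working if the order of the chunks changes, but it would break if a value itself contained an underscore.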

Answer 3

You can loop over your list and use a regex to pull the name out of each string, as in this example:

import re

a = ['sub-Dzh_task-FmriPictures_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii',
 'sub-Dzh_task-FmriVernike_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii',
 'sub-Dzh_task-FmriWgWords_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii',
 'sub-Dzh_task-RestingState_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii']

out = []
for elm in a:
    match = re.search(r'_task-(.*?)_', elm)
    if match:  # re.search returns None when the pattern is absent
        out.append(match.group(1))

print(out)

Output:

['FmriPictures', 'FmriVernike', 'FmriWgWords', 'RestingState']
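If the names come from files on disk rather than a hard-coded list, the same loop can run on the final path component only, so that directory names cannot produce a match. The sketch below is an assumption about that setup: `task_names` is a hypothetical helper, and the `data` directory in the usage comment is a placeholder that does not appear in the question.

```python
import re
from pathlib import Path

TASK_PATTERN = re.compile(r"_task-(.*?)_")


def task_names(paths):
    """Return the task label of every path whose file name contains one."""
    names = []
    for path in paths:
        # Path(...).name drops any leading directories before searching.
        match = TASK_PATTERN.search(Path(path).name)
        if match:
            names.append(match.group(1))
    return names


# Hypothetical usage; "data" is a placeholder directory:
# print(task_names(sorted(Path("data").glob("*.nii"))))
```

Sorting the glob result gives a stable order, since `Path.glob` does not guarantee one.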

Answer 4

I would simply replace

sub-Dzh_task-

and

_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii

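with an empty string. Once both parts are stripped from each name, only the task names remain. This works only because the prefix and suffix are identical in all four filenames.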
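In Python, that replacement could be sketched with `str.replace`. The constant names `PREFIX` and `SUFFIX` and the helper `strip_fixed_parts` are made up for illustration; the two strings themselves are the ones quoted above.

```python
PREFIX = "sub-Dzh_task-"
SUFFIX = "_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii"


def strip_fixed_parts(name):
    """Remove the fixed prefix and suffix, leaving whatever sits between them."""
    return name.replace(PREFIX, "").replace(SUFFIX, "")


filenames = [
    "sub-Dzh_task-FmriPictures_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
    "sub-Dzh_task-FmriVernike_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
    "sub-Dzh_task-FmriWgWords_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
    "sub-Dzh_task-RestingState_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
]

tasks = [strip_fixed_parts(name) for name in filenames]
print(tasks)
```

A name with a different subject, space, or mask would pass through partly or wholly unchanged, so this approach fits only a batch of files that share every other field.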

4 answers

solution1
2 2019-06-22 17:05:00

solution2
0 2019-06-22 16:54:42

solution3
0 2019-06-22 17:09:51

solution4
-1 2019-06-22 16:44:16
