python parsing a string

Question

I have a list with strings.

list_of_strings

They look like this:

'/folder1/folder2/folder3/folder4/folder5/exp-*/exp-*/otherfolder/file'

I want to cut each string down to /folder1/folder2/folder3/folder4/folder5/exp-* and put the result into a new list.

I thought of doing something like this, but I am missing the right expression to get what I want:

list_of_stringparts = []

for string in sorted(list_of_strings):
    part= string.split('/')[7]  # or whatever returns the first part of my string
    list_of_stringparts.append(part)

Does anyone have an idea? Do I need a regex?

Answer 1

You are using subscripting, which extracts a single element (the eighth, at index 7). To get the first seven elements, you need a slice of the form [N:M:S], like this:

>>> l = '/folder1/folder2/folder3/folder4/folder5/exp-*/exp-*/otherfolder/file'
>>> l.split('/')[:7]
['', 'folder1', 'folder2', 'folder3', 'folder4', 'folder5', 'exp-*']

Here N is omitted (it defaults to 0) and S, the step, is also omitted (it defaults to 1), so you get elements 0 through 6 of the result of split. The first element is an empty string because the path starts with '/'.

To construct your string back, use join() :

>>> '/'.join(l.split('/')[:7])
'/folder1/folder2/folder3/folder4/folder5/exp-*'
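
Applied to the whole list from the question, the slice-and-join approach might look like the sketch below. The helper name first_components and its count parameter are made up for illustration; 7 fits the example path only because the leading slash produces an empty first element, so a different folder depth would need a different count.

```python
def first_components(path, count=7):
    """Return the first `count` '/'-separated pieces of `path`, rejoined.

    `first_components` and `count` are illustrative names, not from the question.
    """
    return '/'.join(path.split('/')[:count])


list_of_strings = [
    '/folder1/folder2/folder3/folder4/folder5/exp-*/exp-*/otherfolder/file',
]

# Same loop as in the question, written as a list comprehension.
list_of_stringparts = [first_components(s) for s in sorted(list_of_strings)]
print(list_of_stringparts)
# ['/folder1/folder2/folder3/folder4/folder5/exp-*']
```

A path with fewer than seven pieces comes back unchanged, because slicing past the end of a list is not an error.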

Answer 2

I would do it like this:

>>> s = '/folder1/folder2/folder3/folder4/folder5/exp-*/exp-*/otherfolder/file'
>>> s.split('/')[:7]
['', 'folder1', 'folder2', 'folder3', 'folder4', 'folder5', 'exp-*']
>>> '/'.join(s.split('/')[:7])
'/folder1/folder2/folder3/folder4/folder5/exp-*'

Using re.match:

>>> import re
>>> s = '/folder1/folder2/folder3/folder4/folder5/exp-*/exp-*/otherfolder/file'
>>> re.match(r'.*?\*', s).group()
'/folder1/folder2/folder3/folder4/folder5/exp-*'
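
One thing to watch with the regex version: re.match returns None when the string contains no *, so calling .group() directly would raise an AttributeError. A minimal sketch that guards against this follows; the names PREFIX_UP_TO_STAR and prefix_up_to_star are my own, chosen for illustration.

```python
import re

# Non-greedy match from the start of the string up to and including the first '*'.
# The pattern name and the helper name below are illustrative.
PREFIX_UP_TO_STAR = re.compile(r'.*?\*')


def prefix_up_to_star(path):
    """Return the part of `path` up to the first '*', or None if there is no '*'."""
    match = PREFIX_UP_TO_STAR.match(path)
    return match.group() if match else None


s = '/folder1/folder2/folder3/folder4/folder5/exp-*/exp-*/otherfolder/file'
print(prefix_up_to_star(s))
# /folder1/folder2/folder3/folder4/folder5/exp-*
print(prefix_up_to_star('/folder/blah/pow'))
# None
```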

Answer 3

Your example suggests that you want to partition the strings at the first * character. This can be done with str.partition() :

list_of_stringparts = []

list_of_strings = ['/folder1/folder2/folder3/folder4/folder5/exp-*/exp-*/otherfolder/file', '/folder1/exp-*/folder2/folder3/folder4/folder5/exp-*/exp-*/otherfolder/file', '/folder/blah/pow']
for s in sorted(list_of_strings):
    head, sep, tail = s.partition('*')
    list_of_stringparts.append(head + sep)

>>> list_of_stringparts
['/folder/blah/pow', '/folder1/exp-*', '/folder1/folder2/folder3/folder4/folder5/exp-*']

Or this equivalent list comprehension:

list_of_stringparts = [''.join(s.partition('*')[:2]) for s in sorted(list_of_strings)]

This keeps, unchanged, any string that does not contain a *; I am not sure from your question whether that is desired.
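
If strings without a * should be dropped instead, one option is to check the separator that str.partition() returns, since it is an empty string when the separator was not found. A minimal sketch, with parts_up_to_star as a made-up helper name:

```python
def parts_up_to_star(paths):
    """Cut each path after its first '*'; skip paths that contain no '*'.

    `parts_up_to_star` is an illustrative name, not part of the question.
    """
    result = []
    for s in sorted(paths):
        head, sep, _tail = s.partition('*')
        if sep:  # sep is '' when '*' does not occur in s
            result.append(head + sep)
    return result


list_of_strings = [
    '/folder1/folder2/folder3/folder4/folder5/exp-*/exp-*/otherfolder/file',
    '/folder1/exp-*/folder2/folder3/folder4/folder5/exp-*/exp-*/otherfolder/file',
    '/folder/blah/pow',
]
print(parts_up_to_star(list_of_strings))
# ['/folder1/exp-*', '/folder1/folder2/folder3/folder4/folder5/exp-*']
```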

3 answers

solution1
3 ACCEPTED 2015-05-22 12:42:16

solution2
1 2015-05-22 12:41:04

solution3
0 2015-05-22 13:15:30
