Given a path, I need to extract the part of the path that precedes a specifically named sub-directory, if it exists. I'll call that sub-directory the stopper in this question.
Note that the path may begin or end with the stopper.
Some sample input/output pairs:
path = 'some/path/to/my/file.ext'
# ends with stopper
stopper = 'my'
result = 'some/path/to'
# begins with stopper
stopper = 'some'
result = ''
# stopper in middle
stopper = 'to'
result = 'some/path'
# special case - should stop at first stopper location
path = 'path/to/to/my/file.ext'
stopper = 'to'
result = 'path'
I've devised two such methods of obtaining an answer:
import re

# p = path; s = stopper
def regex_method(p, s):
    regex = r"(?:(?!(?:^|(?<=/))" + s + r").)+(?=/)"
    m = re.match(regex, p)
    if m:
        return m.group()
    return ''
This works but is prone to failure based on the passed stopper value - not ideal for use in production.
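One way the regex version fails is when the stopper contains regex metacharacters, or when it is only a prefix of a directory name. The sketch below is an assumed variant, not part of the original attempt: the name `regex_method_escaped` is hypothetical, `re.escape` neutralizes metacharacters, and the added `(?:/|$)` requires the stopper to be a whole path component.

```python
import re

# p = path; s = stopper (hypothetical hardened variant of regex_method)
def regex_method_escaped(p, s):
    # re.escape makes characters such as '.' or '(' match literally.
    # (?:/|$) ensures the stopper is a full component, so 'to' does not stop at 'tools'.
    pattern = r"(?:(?!(?:^|(?<=/))" + re.escape(s) + r"(?:/|$)).)+(?=/)"
    m = re.match(pattern, p)
    return m.group() if m else ''

print(regex_method_escaped('some/path/to/my/file.ext', 'my'))  # some/path/to
print(regex_method_escaped('x/aXb/a.b/f.txt', 'a.b'))          # x/aXb
print(regex_method_escaped('some/tools/to/file.ext', 'to'))    # some/tools
```

This still depends on `/` as the separator, like the original.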
import os

# p = path; s = stopper
def os_method(p, s):
    parts = os.path.dirname(p).split('/')
    return '/'.join(parts[:parts.index(s)])
This works and seems more concise than the regex counterpart, but it seems odd to me that I need to split the string, then the list based on a value's index, then join it together. I feel like this could be simplified or improved.
Another seemingly more efficient and much simpler method is to use itertools.takewhile, which (from the docs) makes an iterator that returns elements from the iterable as long as the predicate is true:
from itertools import takewhile

def it_method(p, s):
    # Keep path components until the first one equal to the stopper
    return '/'.join(takewhile(lambda d: d != s, p.split('/')))
Test:
print(it_method('some/path/to/my/file.ext', 'my'))
print(it_method('some/path/to/my/file.ext', 'to'))
print(it_method('some/path/to/my/file.ext', 'some'))
print(it_method('some/path/to/to/my/file.ext', 'to'))
Output (the third line is empty, because that path begins with the stopper):
some/path/to
some/path

some/path
So in this case it keeps generating directory names until the stopper is encountered.
The predicate could also be shortened to s.__ne__ instead of using a lambda function:
def it_method(p, s):
    return '/'.join(takewhile(s.__ne__, p.split('/')))
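As a quick check, the takewhile approach can be run against the sample pairs from the question. This is a minimal sketch; the `samples` list is just those pairs restated, and the last call illustrates one caveat: when the stopper is absent, nothing stops the iteration, so the file name is included in the result.

```python
from itertools import takewhile

def it_method(p, s):
    return '/'.join(takewhile(s.__ne__, p.split('/')))

# (path, stopper, expected result) taken from the question's samples
samples = [
    ('some/path/to/my/file.ext', 'my', 'some/path/to'),
    ('some/path/to/my/file.ext', 'some', ''),
    ('some/path/to/my/file.ext', 'to', 'some/path'),
    ('path/to/to/my/file.ext', 'to', 'path'),
]
for path, stopper, expected in samples:
    assert it_method(path, stopper) == expected

# Caveat: with no stopper in the path, the whole path comes back
print(it_method('some/path/to/my/file.ext', 'missing'))  # some/path/to/my/file.ext
```

If only the directory part is wanted in that case, passing `os.path.dirname(p)` instead of `p` would be one option.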
I would suggest using pathlib:
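from pathlib import Path

def split_path(path, stopper):
    parts = path.parts
    # Index of the first component equal to the stopper
    # (next() raises StopIteration if the stopper is not in the path)
    idx = next((idx for idx, part in enumerate(parts) if part == stopper))
    result = Path(*parts[:idx])
    return result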
Using your example:
path = Path('some/path/to/my/file.ext')
stopper = 'my'
split_path(path, stopper)
Output: PosixPath('some/path/to')
stopper = 'some'
split_path(path, stopper)
Output: PosixPath('.')
stopper = 'to'
split_path(path, stopper)
Output: PosixPath('some/path')
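The question says the stopper may not exist, and in that case a bare `next(...)` raises `StopIteration`. The sketch below shows one way to cover that by passing a default to `next`. The name `split_path_safe`, the choice to fall back to the parent directory, and the use of `PurePosixPath` (so the result is the same on every platform) are all assumptions made for illustration.

```python
from pathlib import PurePosixPath

def split_path_safe(path, stopper):
    parts = PurePosixPath(path).parts
    # Search only the directory components (everything but the last part).
    # If the stopper is absent, fall back to the index of the file name,
    # which makes the result the parent directory.
    fallback = max(len(parts) - 1, 0)
    idx = next((i for i, part in enumerate(parts[:-1]) if part == stopper), fallback)
    return PurePosixPath(*parts[:idx])

print(split_path_safe('some/path/to/my/file.ext', 'my'))       # some/path/to
print(split_path_safe('some/path/to/my/file.ext', 'some'))     # .
print(split_path_safe('some/path/to/my/file.ext', 'missing'))  # some/path/to/my
```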
You can use the pathlib module and next on a generator like so:
from pathlib import Path

# p = path; s = stopper
def get_path(p, s):
    # First parent whose string form contains no occurrence of the stopper
    return next(
        (parent for parent in Path(p).parents
         if not any(x in str(parent) for x in (f'/{s}/', f'{s}/', f'/{s}'))
         and str(parent) != s),
        '')

path = 'some/path/to/my/file.ext'
# stopper in middle
stopper = 'to'
print(get_path(path, stopper))
# some/path
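Because the check above matches substrings of the parent's string form, a stopper of 'to' also matches a directory named 'tools'. A possible alternative, sketched here under the assumption that whole-component matching is what is wanted, compares against `parent.parts` instead. The name `get_path_by_parts` is hypothetical, and `PurePosixPath` is used so the behaviour does not depend on the platform.

```python
from pathlib import PurePosixPath

# p = path; s = stopper
def get_path_by_parts(p, s):
    # Walk up the parents and return the first one that does not contain
    # the stopper as a complete path component.
    return next(
        (parent for parent in PurePosixPath(p).parents if s not in parent.parts),
        PurePosixPath('.'),
    )

print(get_path_by_parts('some/path/to/my/file.ext', 'to'))  # some/path
print(get_path_by_parts('some/tools/x/file.ext', 'to'))     # some/tools/x
```

As with the pathlib answer above, a path that begins with the stopper yields `.` rather than an empty string.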