
How to get the path up to a specific sub-directory?

What I'm trying to achieve

Given a path, I need to extract the part that precedes a specifically named sub-directory, if it exists. I'll call that sub-directory the stopper in this question.

Note that the path may begin or end with the stopper.

Some sample pairs for input/output:

path = 'some/path/to/my/file.ext'

# ends with stopper
stopper = 'my'
result = 'some/path/to'

# begins with stopper
stopper = 'some'
result = ''

# stopper in middle
stopper = 'to'
result = 'some/path'

# special case - should stop at first stopper location
path = 'path/to/to/my/file.ext'
stopper = 'to'
result = 'path'

What I have so far

I've devised two methods of obtaining an answer:

Regex

import re

# p = path; s = stopper
def regex_method(p,s):
  regex = r"(?:(?!(?:^|(?<=/))" + s + r").)+(?=/)"
  m = re.match(regex, p)
  if m:
    return m.group()
  return ''

This works, but it can fail depending on the stopper value passed in (the stopper is inserted into the pattern unescaped), so it is not ideal for use in production.
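One way to reduce that fragility, as a rough sketch rather than a vetted production pattern, is to escape the stopper with re.escape and require it to be a whole path component. The name regex_method_escaped and the extra (?:/|$) boundary are my additions, not part of the original function.

```python
import re

# p = path; s = stopper
def regex_method_escaped(p, s):
    # re.escape makes metacharacters in the stopper (".", "+", ...) match literally.
    # The trailing (?:/|$) requires a full component, so "to" does not stop at "tools".
    pattern = r"(?:(?!(?:^|(?<=/))" + re.escape(s) + r"(?:/|$)).)+(?=/)"
    m = re.match(pattern, p)
    return m.group() if m else ''

print(regex_method_escaped('some/path/to/my/file.ext', 'my'))  # some/path/to
print(regex_method_escaped('x/aXb/a.b/file.ext', 'a.b'))       # x/aXb
```

Without the escaping, the "." in 'a.b' would act as a wildcard and the second call would stop at 'aXb' instead.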

OS

import os

# p = path; s = stopper
def os_method(p,s):
  parts = os.path.dirname(p).split('/')
  return '/'.join(parts[:parts.index(s)])

This works and seems more concise than the regex version, but it seems odd that I need to split the string, slice the list at a value's index, then join it back together. I feel this could be simplified or improved.
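Since the stopper is only meant to apply "if it exists", it may be worth noting that list.index raises ValueError when the value is missing. A minimal sketch of a guarded variant follows; the name os_method_safe and the choice to return the whole directory part when there is no stopper are assumptions on my part.

```python
import os

# p = path; s = stopper
def os_method_safe(p, s):
    parts = os.path.dirname(p).split('/')
    try:
        # Keep everything before the first occurrence of the stopper.
        return '/'.join(parts[:parts.index(s)])
    except ValueError:
        # Assumption: with no stopper present, return the full directory part.
        return '/'.join(parts)

print(os_method_safe('some/path/to/my/file.ext', 'to'))       # some/path
print(os_method_safe('some/path/to/my/file.ext', 'missing'))  # some/path/to/my
```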


My questions

  1. Is there a more idiomatic way of splitting a path on a specific directory name?
  2. Is there a simple way of achieving this using list comprehensions?

Another, seemingly more efficient and much simpler, method is to use itertools.takewhile, which (from the docs) makes an iterator that returns elements from the iterable as long as the predicate is true:

from itertools import takewhile

# p = path; s = stopper
def it_method(p, s):
  # Yield path components until the first one equal to the stopper.
  return '/'.join(takewhile(lambda d: d != s, p.split('/')))

Test:

print(it_method('some/path/to/my/file.ext', 'my'))
print(it_method('some/path/to/my/file.ext', 'to'))
print(it_method('some/path/to/my/file.ext', 'some'))
print(it_method('some/path/to/to/my/file.ext', 'to'))

Output:

some/path/to
some/path

some/path

So in this case it keeps yielding directory names until the stopper is encountered.

The predicate could also be shortened to s.__ne__ instead of using a lambda function:

def it_method(p,s):
  return '/'.join(takewhile(s.__ne__, p.split('/')))
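One caveat: because it_method splits the whole path, a path with no stopper comes back complete, file name included. If that is not wanted, a possible variation (a sketch only; the name it_method_dirs is hypothetical) applies takewhile to the directory part instead:

```python
import os
from itertools import takewhile

# p = path; s = stopper
def it_method_dirs(p, s):
    # Drop the file name first so only directory names are examined.
    dirs = os.path.dirname(p).split('/')
    return '/'.join(takewhile(lambda d: d != s, dirs))

print(it_method_dirs('some/path/to/my/file.ext', 'to'))       # some/path
print(it_method_dirs('some/path/to/my/file.ext', 'missing'))  # some/path/to/my
```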

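I would suggest using pathlib:

from pathlib import Path

def split_path(path, stopper):
    parts = path.parts
    # Index of the first component equal to the stopper
    # (next() raises StopIteration if the stopper is absent).
    idx = next(idx for idx, part in enumerate(parts) if part == stopper)
    result = Path(*parts[:idx])
    return result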

Using your example:

path = Path('some/path/to/my/file.ext')

stopper = 'my'
split_path(path, stopper)

Output: PosixPath('some/path/to')

stopper = 'some'
split_path(path, stopper)

Output: PosixPath('.')

stopper = 'to'
split_path(path, stopper)

Output: PosixPath('some/path')
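If the stopper might be missing, next() accepts a default that avoids the StopIteration. The following is a sketch under my own assumptions: the name split_path_safe is hypothetical, only the parent directory's components are searched, and PurePosixPath is used so that the result does not depend on the operating system.

```python
from pathlib import PurePosixPath

def split_path_safe(path, stopper):
    # Look only at the directory components, not the file name.
    parts = PurePosixPath(path).parent.parts
    # Fall back to len(parts), i.e. keep every directory, if the stopper is absent.
    idx = next((i for i, part in enumerate(parts) if part == stopper), len(parts))
    return PurePosixPath(*parts[:idx])

print(split_path_safe('some/path/to/my/file.ext', 'my'))       # some/path/to
print(split_path_safe('some/path/to/my/file.ext', 'missing'))  # some/path/to/my
```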

You can use the pathlib module and next() on a generator, like so:

from pathlib import Path

# p = path; s = stopper
def get_path(p, s):
    # Walk up the parents and return the first one that has no component equal to
    # the stopper (comparing whole components, so 'to' does not match 'tools').
    return next((parent for parent in Path(p).parents if s not in parent.parts), '')

path = 'some/path/to/my/file.ext'
# stopper in middle
stopper = 'to'

print(get_path(path, stopper))
# some/path
