
Cross-platform splitting of path in Python

I'd like something that has the same effect as this:

>>> path = "/foo/bar/baz/file"
>>> path_split = path.rsplit('/')[1:]
>>> path_split
['foo', 'bar', 'baz', 'file']

But that will work with Windows paths too. I know that there is an os.path.split(), but that doesn't do what I want, and I didn't see anything that does.

Python 3.4 introduced a new module, pathlib. pathlib.Path provides file-system-related methods, while pathlib.PurePath operates completely independently of the file system:

>>> from pathlib import PurePath
>>> path = "/foo/bar/baz/file"
>>> path_split = PurePath(path).parts
>>> path_split
('\\', 'foo', 'bar', 'baz', 'file')

PurePath picks the flavour of the platform it runs on (the output above is from Windows). You can use PurePosixPath and PureWindowsPath explicitly when desired:

>>> from pathlib import PureWindowsPath, PurePosixPath
>>> PureWindowsPath(path).parts
('\\', 'foo', 'bar', 'baz', 'file')
>>> PurePosixPath(path).parts
('/', 'foo', 'bar', 'baz', 'file')

And of course, it works with Windows paths as well:

>>> wpath = r"C:\foo\bar\baz\file"
>>> PurePath(wpath).parts
('C:\\', 'foo', 'bar', 'baz', 'file')
>>> PureWindowsPath(wpath).parts
('C:\\', 'foo', 'bar', 'baz', 'file')
>>> PurePosixPath(wpath).parts
('C:\\foo\\bar\\baz\\file',)
>>>
>>> wpath = r"C:\foo/bar/baz/file"
>>> PurePath(wpath).parts
('C:\\', 'foo', 'bar', 'baz', 'file')
>>> PureWindowsPath(wpath).parts
('C:\\', 'foo', 'bar', 'baz', 'file')
>>> PurePosixPath(wpath).parts
('C:\\foo', 'bar', 'baz', 'file')
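To get the exact list from the question (no leading root or drive entry), one option is to drop the anchor from .parts. This is only a minimal sketch and not part of the original answer: the helper name path_components and its windows flag are invented for illustration. It relies on PureWindowsPath accepting both slashes and backslashes.

```python
from pathlib import PurePosixPath, PureWindowsPath


def path_components(path, windows=True):
    """Return the components of path as a list, without drive or root.

    Hypothetical helper: windows=True parses with Windows rules (both
    separators, drive letters); windows=False parses with POSIX rules.
    """
    flavour = PureWindowsPath if windows else PurePosixPath
    pure = flavour(path)
    parts = pure.parts
    # When the path has an anchor (drive and/or root), it is parts[0].
    return list(parts[1:]) if pure.anchor else list(parts)


print(path_components("/foo/bar/baz/file"))
print(path_components(r"C:\foo/bar\baz/file"))
print(path_components("/foo/bar/baz/file", windows=False))
```

Parsing with Windows rules by default is a design choice: it tolerates both kinds of separator, at the cost of treating a backslash in a POSIX file name as a separator.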

Huzzah for Python devs constantly improving the language!

The OP specified "will work with Windows paths too". There are a few wrinkles with Windows paths.

Firstly, Windows has the concept of multiple drives, each with its own current working directory, and 'c:foo' and 'c:\\foo' are often not the same. Consequently it is a very good idea to separate out any drive designator first, using os.path.splitdrive(). Reassembling the path (if required) can then be done correctly with drive + os.path.join(*other_pieces).
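As a small illustration of the drive handling described here, the sketch below uses ntpath (the Windows flavour of os.path, importable on any platform) so that it behaves the same everywhere. The sample paths and the pieces list are assumptions made up for the example.

```python
import ntpath  # Windows rules for os.path, available on every platform

# A drive-relative path and an absolute path on the same drive.
for p in ("c:foo", "c:\\foo"):
    drive, rest = ntpath.splitdrive(p)
    print("%r -> drive=%r rest=%r" % (p, drive, rest))

# Reassemble: join only the non-drive pieces, then prepend the drive.
drive, rest = ntpath.splitdrive("c:\\foo\\bar")
pieces = ["\\", "foo", "bar"]  # hypothetical result of splitting `rest`
rebuilt = drive + ntpath.join(*pieces)
print(rebuilt)
```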

Secondly, Windows paths can contain slashes or backslashes or a mixture. Consequently, using os.sep when parsing an unnormalised path is not useful.
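To make the separator problem concrete, here is a short sketch (the sample path is made up) comparing a naive split on the separator with a split after normalisation. ntpath stands in for os.path on Windows so the result does not depend on where it runs.

```python
import ntpath

mixed = "foo/bar\\baz/file"

# Splitting on the platform separator only sees the backslash.
naive = mixed.split(ntpath.sep)

# normpath converts every separator to a backslash first.
normalised = ntpath.normpath(mixed)
clean = normalised.split(ntpath.sep)

print(naive)
print(clean)
```

Note that normpath also collapses '.' and '..' segments, which may or may not be wanted.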

More generally:

The results produced for 'foo' and 'foo/' should not be identical.

The loop termination condition seems to be best expressed as "os.path.split() treated its input as unsplittable".
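Both points can be seen directly in what split() returns. The sketch below uses posixpath (the POSIX flavour of os.path) so the results are the same on any platform; the sample inputs are chosen for illustration.

```python
import posixpath

# 'foo' and 'foo/' split differently, so a splitter can tell them apart.
# '/' and '' come back with head == input: split() cannot reduce them further.
for p in ("foo", "foo/", "/", ""):
    head, tail = posixpath.split(p)
    print("%r -> head=%r tail=%r unsplittable=%r" % (p, head, tail, head == p))
```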

Here's a suggested solution, with tests, including a comparison with @Spacedman's solution:

import os.path

def os_path_split_asunder(path, debug=False):
    parts = []
    while True:
        newpath, tail = os.path.split(path)
        if debug: print("%r %r" % (path, (newpath, tail)))
        if newpath == path:
            assert not tail
            if path: parts.append(path)
            break
        parts.append(tail)
        path = newpath
    parts.reverse()
    return parts

def spacedman_parts(path):
    components = [] 
    while True:
        (path,tail) = os.path.split(path)
        if not tail:
            return components
        components.insert(0,tail)

if __name__ == "__main__":
    tests = [
        '',
        'foo',
        'foo/',
        'foo\\',
        '/foo',
        '\\foo',
        'foo/bar',
        '/',
        'c:',
        'c:/',
        'c:foo',
        'c:/foo',
        'c:/users/john/foo.txt',
        '/users/john/foo.txt',
        'foo/bar/baz/loop',
        'foo/bar/baz/',
        '//hostname/foo/bar.txt',
        ]
    for i, test in enumerate(tests):
        # Single-argument print(...) gives the same output on Python 2 and 3
        print("\nTest %d: %r" % (i, test))
        drive, path = os.path.splitdrive(test)
        print("drive, path %r %r" % (drive, path))
        a = os_path_split_asunder(path)
        b = spacedman_parts(path)
        print("a ... %r" % a)
        print("b ... %r" % b)
        print(a == b)

And here's the output (Python 2.7.1, Windows 7 Pro):

Test 0: ''
drive, path '' ''
a ... []
b ... []
True

Test 1: 'foo'
drive, path '' 'foo'
a ... ['foo']
b ... ['foo']
True

Test 2: 'foo/'
drive, path '' 'foo/'
a ... ['foo', '']
b ... []
False

Test 3: 'foo\\'
drive, path '' 'foo\\'
a ... ['foo', '']
b ... []
False

Test 4: '/foo'
drive, path '' '/foo'
a ... ['/', 'foo']
b ... ['foo']
False

Test 5: '\\foo'
drive, path '' '\\foo'
a ... ['\\', 'foo']
b ... ['foo']
False

Test 6: 'foo/bar'
drive, path '' 'foo/bar'
a ... ['foo', 'bar']
b ... ['foo', 'bar']
True

Test 7: '/'
drive, path '' '/'
a ... ['/']
b ... []
False

Test 8: 'c:'
drive, path 'c:' ''
a ... []
b ... []
True

Test 9: 'c:/'
drive, path 'c:' '/'
a ... ['/']
b ... []
False

Test 10: 'c:foo'
drive, path 'c:' 'foo'
a ... ['foo']
b ... ['foo']
True

Test 11: 'c:/foo'
drive, path 'c:' '/foo'
a ... ['/', 'foo']
b ... ['foo']
False

Test 12: 'c:/users/john/foo.txt'
drive, path 'c:' '/users/john/foo.txt'
a ... ['/', 'users', 'john', 'foo.txt']
b ... ['users', 'john', 'foo.txt']
False

Test 13: '/users/john/foo.txt'
drive, path '' '/users/john/foo.txt'
a ... ['/', 'users', 'john', 'foo.txt']
b ... ['users', 'john', 'foo.txt']
False

Test 14: 'foo/bar/baz/loop'
drive, path '' 'foo/bar/baz/loop'
a ... ['foo', 'bar', 'baz', 'loop']
b ... ['foo', 'bar', 'baz', 'loop']
True

Test 15: 'foo/bar/baz/'
drive, path '' 'foo/bar/baz/'
a ... ['foo', 'bar', 'baz', '']
b ... []
False

Test 16: '//hostname/foo/bar.txt'
drive, path '' '//hostname/foo/bar.txt'
a ... ['//', 'hostname', 'foo', 'bar.txt']
b ... ['hostname', 'foo', 'bar.txt']
False
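For readers on Python 3, the same idea (separate the drive first, then call split() until the input is unsplittable) might be packaged as below. This is a sketch under assumptions rather than the answer's own code: the name split_asunder and the pathmod parameter are hypothetical, and ntpath is used as the default so Windows rules apply on any platform.

```python
import ntpath
import posixpath


def split_asunder(path, pathmod=ntpath):
    """Return (drive, parts) for path, using the rules of pathmod.

    Hypothetical helper; pathmod is any os.path-like module
    (ntpath, posixpath or os.path itself).
    """
    drive, rest = pathmod.splitdrive(path)
    parts = []
    while True:
        head, tail = pathmod.split(rest)
        if head == rest:
            # split() left its input unchanged: a root or the empty string.
            if rest:
                parts.append(rest)
            break
        parts.append(tail)
        rest = head
    parts.reverse()
    return drive, parts


print(split_asunder("c:/users/john/foo.txt"))
print(split_asunder("foo/"))
print(split_asunder("/users/john/foo.txt", pathmod=posixpath))
```

UNC paths such as '//hostname/foo/bar.txt' are handled by splitdrive() differently across Python versions, so they are left out of this sketch.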

Someone said "use os.path.split". This got deleted unfortunately, but it is the right answer.

os.path.split(path)

Split the pathname path into a pair, (head, tail) where tail is the last pathname component and head is everything leading up to that. The tail part will never contain a slash; if path ends in a slash, tail will be empty. If there is no slash in path, head will be empty. If path is empty, both head and tail are empty. Trailing slashes are stripped from head unless it is the root (one or more slashes only). In all cases, join(head, tail) returns a path to the same location as path (but the strings may differ).

So it's not just splitting the dirname and filename. You can apply it several times to split the full path in a portable and correct way. Code sample:

from os.path import split

dirname = path  # the path to split
path_split = []
while True:
    dirname, leaf = split(dirname)
    if leaf:
        path_split = [leaf] + path_split  # adds one element at the beginning of the list
    else:
        # Uncomment the following line to also include the drive, in the format "Z:\"
        # path_split = [dirname] + path_split
        break

Please credit the original author if that answer gets undeleted.

Use the functionality provided in os.path, e.g.

os.path.split(path)

As written elsewhere, you can call it multiple times to split longer paths.

Here's an explicit implementation of the approach that just iteratively uses os.path.split; it uses a slightly different loop termination condition than the accepted answer.

import os.path

def splitpath(path):
    parts = []
    (path, tail) = os.path.split(path)
    while path and tail:
        parts.append(tail)
        (path, tail) = os.path.split(path)
    parts.append(os.path.join(path, tail))
    # A list comprehension replaces map(...)[::-1], which only works on Python 2
    return [os.path.normpath(p) for p in reversed(parts)]

This should satisfy: os.path.join(*splitpath(path)) is equivalent to path, in the sense that they both indicate the same file/directory.
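One way to spot-check that property is sketched below. It repeats the function with posixpath pinned in place of os.path so the check is reproducible on any platform, and compares normalised strings as a stand-in for "same location"; the helper name same_location and the sample paths are assumptions for illustration.

```python
import posixpath


def splitpath(path):
    # Same algorithm as above, pinned to POSIX rules for reproducibility.
    parts = []
    path, tail = posixpath.split(path)
    while path and tail:
        parts.append(tail)
        path, tail = posixpath.split(path)
    parts.append(posixpath.join(path, tail))
    return [posixpath.normpath(p) for p in reversed(parts)]


def same_location(path):
    # Hypothetical check: rejoin the pieces and compare normalised forms.
    rejoined = posixpath.join(*splitpath(path))
    return posixpath.normpath(rejoined) == posixpath.normpath(path)


for sample in ("/home/dave/src/python", "home/dave/src/python", "foo/", "/"):
    print(repr(sample), splitpath(sample), same_location(sample))
```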

Tested on Linux:

In [51]: current='/home/dave/src/python'

In [52]: splitpath(current)
Out[52]: ['/', 'home', 'dave', 'src', 'python'] 

In [53]: splitpath(current[1:])
Out[53]: ['home', 'dave', 'src', 'python']

In [54]: splitpath( os.path.join(current, 'module.py'))
Out[54]: ['/', 'home', 'dave', 'src', 'python', 'module.py']

In [55]: splitpath( os.path.join(current[1:], 'module.py'))
Out[55]: ['home', 'dave', 'src', 'python', 'module.py']

I hand-checked a few DOS paths by replacing os.path with the ntpath module. They look OK to me, but I'm not too familiar with the ins and outs of DOS paths.

Use the functionality provided in os.path, e.g.

os.path.split(path)

(This answer was by someone else and was mysteriously and incorrectly deleted, since it's a working answer; if you want to split each part of the path apart, you can call it multiple times, and each call will pull a component off of the end.)
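A quick sketch of what "each call will pull a component off of the end" looks like in practice; posixpath is used so the result is the same on any platform, and the sample path is taken from the question.

```python
import posixpath

head = "/foo/bar/baz/file"
# Each call peels the last component off and leaves the rest in head.
for _ in range(3):
    head, tail = posixpath.split(head)
    print("head=%r tail=%r" % (head, tail))
```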

One more try, with a maxsplit option, which is a replacement for os.path.split():

import os.path

def pathsplit(pathstr, maxsplit=1):
    """Split a path into a list, doing at most maxsplit splits (None means no limit)."""
    path = [pathstr]
    while True:
        oldpath = path[:]
        path[:1] = list(os.path.split(path[0]))
        if path[0] == '':
            path = path[1:]
        elif path[1] == '':
            path = path[:1] + path[2:]
        if path == oldpath:
            return path
        if maxsplit is not None and len(path) > maxsplit:
            return path

So keep using os.path.split until you get to what you want. Here's an ugly implementation using an infinite loop:

import os.path
def parts(path):
    components = [] 
    while True:
        (path,tail) = os.path.split(path)
        if tail == "":
            components.reverse()
            return components
        components.append(tail)

Stick that in parts.py, import parts, and voilà:

>>> parts.parts("foo/bar/baz/loop")
['foo', 'bar', 'baz', 'loop']

There is probably a nicer implementation using generators or recursion out there...
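For what it's worth, a generator-based variant might look like the sketch below. The names iter_parts_reversed and parts_list are invented for this example. It keeps the same stopping rule as the loop above (stop at the first empty tail), so it shares that version's behaviour of dropping the root and returning an empty list for a path with a trailing slash.

```python
import os.path


def iter_parts_reversed(path):
    """Yield the components of path from last to first (hypothetical helper)."""
    while True:
        path, tail = os.path.split(path)
        if not tail:
            # Empty tail: we reached the root, the empty string,
            # or a trailing separator.
            return
        yield tail


def parts_list(path):
    """Return the components of path in their natural order."""
    return list(iter_parts_reversed(path))[::-1]


print(parts_list("foo/bar/baz/loop"))
```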
