How do I get the filename without the extension from a path in Python?

"/path/to/some/file.txt"  →  "file"

Getting the name of the file without the extension:

import os
print(os.path.splitext("/path/to/some/file.txt")[0])

Prints:

/path/to/some/file

Documentation for os.path.splitext.

Important Note: If the filename has multiple dots, only the extension after the last one is removed. For example:

import os
print(os.path.splitext("/path/to/some/file.txt.zip.asc")[0])

Prints:

/path/to/some/file.txt.zip

See other answers below if you need to handle that case.

Use .stem from pathlib in Python 3.4+

from pathlib import Path

Path('/root/dir/sub/file.ext').stem

will return

'file'

Note that if your file has multiple extensions, .stem will only remove the last extension. For example, Path('file.tar.gz').stem will return 'file.tar'.
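If every extension should go, one option is to keep stripping the last suffix until none is left. This is only a sketch: stem_all is a hypothetical helper name, and PurePosixPath is used so the result does not depend on the operating system.

```python
from pathlib import PurePosixPath

def stem_all(path):
    """Return the final path component with every suffix removed."""
    p = PurePosixPath(path)
    # with_suffix('') drops only the last suffix, so repeat until none remain.
    while p.suffix:
        p = p.with_suffix('')
    return p.name

print(stem_all('/root/dir/sub/file.tar.gz'))  # file
print(stem_all('/root/dir/sub/file'))         # file
```

Be aware that this also strips dotted parts that are not really extensions, for example a version number embedded in the name.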

You can make your own with:

>>> import os
>>> base=os.path.basename('/root/dir/sub/file.ext')
>>> base
'file.ext'
>>> os.path.splitext(base)
('file', '.ext')
>>> os.path.splitext(base)[0]
'file'

Important note: If there is more than one . in the filename, only the last one is removed. For example:

/root/dir/sub/file.ext.zip -> file.ext

/root/dir/sub/file.ext.tar.gz -> file.ext.tar

See below for other answers that address that.

>>> print(os.path.splitext(os.path.basename("/path/to/file/hemanth.txt"))[0])
hemanth

In Python 3.4+ you can use the pathlib solution:

from pathlib import Path

print(Path(your_path).resolve().stem)

https://docs.python.org/3/library/os.path.html

In Python 3, "the pathlib module offers high-level path objects," so:

>>> from pathlib import Path

>>> p = Path("/a/b/c.txt")
>>> p.with_suffix('')
WindowsPath('/a/b/c')
>>> p.stem
'c'

os.path.splitext() won't remove the whole extension if the extension itself contains multiple dots.

For example, images.tar.gz:

>>> import os
>>> file_path = '/home/dc/images.tar.gz'
>>> file_name = os.path.basename(file_path)
>>> print(os.path.splitext(file_name)[0])
images.tar

You can just find the index of the first dot in the basename and then slice the basename to get just the filename without the extension.

>>> import os
>>> file_path = '/home/dc/images.tar.gz'
>>> file_name = os.path.basename(file_path)
>>> index_of_dot = file_name.index('.')
>>> file_name_without_extension = file_name[:index_of_dot]
>>> print(file_name_without_extension)
images
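Note that str.index raises ValueError when the basename contains no dot. A small variation using str.partition avoids that. This is a sketch, with first_part as a made-up helper name and posixpath chosen so that it behaves the same on every platform:

```python
import posixpath

def first_part(file_path):
    """Return the basename up to (not including) its first dot."""
    file_name = posixpath.basename(file_path)
    # partition returns the whole string as the first item when '.' is absent.
    return file_name.partition('.')[0]

print(first_part('/home/dc/images.tar.gz'))  # images
print(first_part('/home/dc/README'))         # README
```

A dotfile such as .bashrc yields an empty string with this approach, so it may need special handling.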

If you want to keep the path to the file and just remove the extension:

>>> file = '/root/dir/sub.exten/file.data.1.2.dat'
>>> print('.'.join(file.split('.')[:-1]))
/root/dir/sub.exten/file.data.1.2
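One caveat with splitting the whole path on dots: if the file has no extension but a directory name contains a dot, part of the path is cut off. os.path.splitext only looks at the last path component, so it may be the safer choice here. A small comparison (posixpath is used so the output is the same on any OS):

```python
import posixpath

file = '/root/dir/sub.exten/file'  # no extension, but a dot in a directory name

# Joining all but the last dot-separated piece truncates the path.
print('.'.join(file.split('.')[:-1]))  # /root/dir/sub

# splitext leaves the path alone when the last component has no extension.
print(posixpath.splitext(file)[0])     # /root/dir/sub.exten/file
```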

As noted by @IceAdor in a comment to @user2902201's solution, rsplit is the simplest solution that is robust to multiple periods (by limiting the number of splits to a maxsplit of just 1, counted from the end of the string).

Here it is spelt out:

file = 'my.report.txt'
print(file.rsplit('.', maxsplit=1)[0])

my.report

Thought I would throw in a variation on the use of os.path.splitext that avoids array indexing.

The function always returns a (root, ext) pair, so it is safe to use:

root, ext = os.path.splitext(path)

Example:

>>> import os
>>> path = 'my_text_file.txt'
>>> root, ext = os.path.splitext(path)
>>> root
'my_text_file'
>>> ext
'.txt'

But even when I import os, I am not able to call it path.basename. Is it possible to call it as directly as basename?

import os, and then use os.path.basename.

Importing os doesn't mean you can use os.foo without referring to os.
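If calling basename directly is what is wanted, importing the names from os.path is one option. A minimal sketch:

```python
# Import the functions themselves so they can be called without the os.path prefix.
from os.path import basename, splitext

name, ext = splitext(basename('/d1/d2/example.cs'))
print(name)  # example
print(ext)   # .cs
```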

import os
filename, file_extension = os.path.splitext(os.path.basename('/d1/d2/example.cs'))
  • filename is 'example'

  • file_extension is '.cs'

The other methods don't remove multiple extensions. Some also have problems with filenames that don't have extensions. This snippet deals with both cases and works in both Python 2 and 3. It grabs the basename from the path, splits the value on dots, and returns the first piece, which is the initial part of the filename.

import os

def get_filename_without_extension(file_path):
    file_basename = os.path.basename(file_path)
    filename_without_extension = file_basename.split('.')[0]
    return filename_without_extension

Here's a set of examples to run:

example_paths = [
    "FileName", 
    "./FileName",
    "../../FileName",
    "FileName.txt", 
    "./FileName.txt.zip.asc",
    "/path/to/some/FileName",
    "/path/to/some/FileName.txt",
    "/path/to/some/FileName.txt.zip.asc"
]

for example_path in example_paths:
    print(get_filename_without_extension(example_path))

In every case, the value printed is:

FileName

Answers using Pathlib for Several Scenarios

Using Pathlib, it is trivial to get the filename when there is just one extension (or none), but it can be awkward to handle the general case of multiple extensions.

Zero or One extension

from pathlib import Path

pth = Path('./thefile.tar')

fn = pth.stem

print(fn)      # thefile


# Explanation:
# the `stem` attribute returns only the base filename, stripping
# any leading path if present, and strips the extension after
# the last `.`, if present.


# Further tests

eg_paths = ['thefile',
            'thefile.tar',
            './thefile',
            './thefile.tar',
            '../../thefile.tar',
            '.././thefile.tar',
            'rel/pa.th/to/thefile',
            '/abs/path/to/thefile.tar']

for p in eg_paths:
    print(Path(p).stem)  # prints thefile every time

Two or fewer extensions

from pathlib import Path

pth = Path('./thefile.tar.gz')

fn = pth.with_suffix('').stem

print(fn)      # thefile


# Explanation:
# Using the `.with_suffix('')` trick returns a Path object after
# stripping one extension, and then we can simply use `.stem`.


# Further tests

eg_paths += ['./thefile.tar.gz',
             '/abs/pa.th/to/thefile.tar.gz']

for p in eg_paths:
    print(Path(p).with_suffix('').stem)  # prints thefile every time

Any number of extensions (0, 1, or more)

from pathlib import Path

pth = Path('./thefile.tar.gz.bz.7zip')

fn = pth.name
if len(pth.suffixes) > 0:
    s = pth.suffixes[0]
    fn = fn.rsplit(s)[0]

# or, equivalently

fn = pth.name
for s in pth.suffixes:
    fn = fn.rsplit(s)[0]
    break

# or simply run the full loop

fn = pth.name
for _ in pth.suffixes:
    fn = fn.rsplit('.')[0]

# In any case:

print(fn)     # thefile


# Explanation
#
# pth.name     -> 'thefile.tar.gz.bz.7zip'
# pth.suffixes -> ['.tar', '.gz', '.bz', '.7zip']
#
# If there may be more than two extensions, we can test for
# that case with an if statement, or simply attempt the loop
# and break after rsplitting on the first extension instance.
# Alternatively, we may even run the full loop and strip one 
# extension with every pass.


# Further tests

eg_paths += ['./thefile.tar.gz.bz.7zip',
             '/abs/pa.th/to/thefile.tar.gz.bz.7zip']

for p in eg_paths:
    pth = Path(p)
    fn = pth.name
    for s in pth.suffixes:
        fn = fn.rsplit(s)[0]
        break

    print(fn)  # prints thefile every time

Special case in which the first extension is known

For instance, if the extension could be .tar, .tar.gz, .tar.gz.bz, etc., you can simply rsplit the known extension and take the first element:


pth = Path('foo/bar/baz.baz/thefile.tar.gz')

fn = pth.name.rsplit('.tar')[0]

print(fn)      # thefile

A procedure that is aware of multiple extensions. Works for str and unicode paths. Works in Python 2 and 3.

import os

def file_base_name(file_name):
    if '.' in file_name:
        separator_index = file_name.index('.')
        base_name = file_name[:separator_index]
        return base_name
    else:
        return file_name

def path_base_name(path):
    file_name = os.path.basename(path)
    return file_base_name(file_name)

Behavior:

>>> path_base_name('file')
'file'
>>> path_base_name(u'file')
u'file'
>>> path_base_name('file.txt')
'file'
>>> path_base_name(u'file.txt')
u'file'
>>> path_base_name('file.tar.gz')
'file'
>>> path_base_name('file.a.b.c.d.e.f.g')
'file'
>>> path_base_name('relative/path/file.ext')
'file'
>>> path_base_name('/absolute/path/file.ext')
'file'
>>> path_base_name('Relative\\Windows\\Path\\file.txt')
'file'
>>> path_base_name('C:\\Absolute\\Windows\\Path\\file.txt')
'file'
>>> path_base_name('/path with spaces/file.ext')
'file'
>>> path_base_name('C:\\Windows Path With Spaces\\file.txt')
'file'
>>> path_base_name('some/path/file name with spaces.tar.gz.zip.rar.7z')
'file name with spaces'
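The Windows-style examples above give 'file' only when run on Windows, because os.path.basename on POSIX does not treat the backslash as a separator. If the same result is wanted on every platform, one possibility is to use ntpath directly, which accepts both kinds of slash. A sketch, with portable_base_name as a hypothetical name:

```python
import ntpath

def portable_base_name(path):
    """Strip directories (forward- or backslash-separated) and all extensions."""
    # ntpath.basename splits on both '/' and '\' regardless of the host OS.
    file_name = ntpath.basename(path)
    return file_name.partition('.')[0]

print(portable_base_name('C:\\Absolute\\Windows\\Path\\file.txt'))  # file
print(portable_base_name('/absolute/path/file.tar.gz'))             # file
```

The trade-off is that on POSIX systems a backslash is a legal filename character, so such names would be split incorrectly.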

import os

filename = "C:\\Users\\Public\\Videos\\Sample Videos\\wildlife.wmv"

This returns the filename without the extension (C:\Users\Public\Videos\Sample Videos\wildlife):

temp = os.path.splitext(filename)[0]  

Now you can get just the filename from temp with:

os.path.basename(temp)   #this returns just the filename (wildlife)

Very simple, and no other modules are needed:

import os
p = r"C:\Users\bilal\Documents\face Recognition python\imgs\northon.jpg"

# Get the filename only from the initial file path.
filename = os.path.basename(p)

# Use splitext() to get filename and extension separately.
(file, ext) = os.path.splitext(filename)

# Print outcome.
print("Filename without extension =", file)
print("Extension =", ext)

import os
path = "a/b/c/abc.txt"
print(os.path.splitext(os.path.basename(path))[0])

On Windows systems I used the drive-name prefix as well, like:

>>> s = 'c:\\temp\\akarmi.txt'
>>> print(os.path.splitext(s)[0])
c:\temp\akarmi

So because I do not need the drive letter or directory name, I use:

>>> print(os.path.splitext(os.path.basename(s))[0])
akarmi

Improving upon @spinup's answer:

fn = pth.name
for s in pth.suffixes:
    fn = fn.rsplit(s)[0]
    break
    
print(fn)      # thefile 

This also works for filenames without an extension.

I've read the answers, and I notice that there are many good solutions. So, for those who are looking to get either the name or the extension, here is another solution using the os module. Both functions support files with multiple extensions.

import os

def get_file_name(path):
    if not os.path.isdir(path):
        return os.path.splitext(os.path.basename(path))[0].split(".")[0]


def get_file_extension(path):
    extensions = []
    copy_path = path
    while True:
        copy_path, result = os.path.splitext(copy_path)
        if result != '':
            extensions.append(result)
        else:
            break
    extensions.reverse()
    return "".join(extensions)

Note: on Windows, this solution does not support file names containing the "\" character.

We could do some simple split / pop magic as seen here ( https://stackoverflow.com/a/424006/1250044 ) to extract the filename (respecting the Windows and POSIX differences).

def getFileNameWithoutExtension(path):
  return path.split('\\').pop().split('/').pop().rsplit('.', 1)[0]

getFileNameWithoutExtension('/path/to/file-0.0.1.ext')
# => file-0.0.1

getFileNameWithoutExtension('\\path\\to\\file-0.0.1.ext')
# => file-0.0.1

For convenience, a simple function wrapping the two methods from os.path:

def filename(path):
  """Return file name without extension from path.

  See https://docs.python.org/3/library/os.path.html
  """
  import os.path
  b = os.path.split(path)[1]  # path, *filename*
  f = os.path.splitext(b)[0]  # *file*, ext
  #print(path, b, f)
  return f

Tested with Python 3.5.

import os

file_names = []

def getFileName(path):
    # Recursively collect the names of entries that have an extension.
    for file in os.listdir(path):
        try:
            base = os.path.basename(file)
            ext = os.path.splitext(base)[1]
            if ext:
                file_names.append(base)
            else:
                # No extension: assume it is a directory and descend into it.
                newpath = path + "/" + file
                getFileName(newpath)
        except OSError:
            # Not a directory (or unreadable): skip it.
            pass
    return file_names

getFileName("/home/weexcel-java3/Desktop/backup")
print(file_names)

The easiest way to resolve this is:

import ntpath
# No trailing slash: with one, basename() returns an empty string.
print('Base name is', ntpath.basename('/path/to/the/file'))

This saves you time and computation cost.

I didn't look very hard, but I didn't see anyone who used regex for this problem.

I interpreted the question as "given a path, return the basename without the extension."

e.g.

"path/to/file.json" => "file"

"path/to/my.file.json" => "my.file"

In Python 2.7, where we still live without pathlib ...

import os
import re

def get_file_name_prefix(file_path):
    basename = os.path.basename(file_path)

    # Greedy match: everything before the last dot is the prefix.
    file_name_prefix_match = re.compile(r"^(?P<file_name_prefix>.*)\..*$").match(basename)

    if file_name_prefix_match is None:
        # No dot in the basename, so there is no extension to strip.
        return basename
    else:
        return file_name_prefix_match.group("file_name_prefix")

get_file_name_prefix("path/to/file.json")
>> file

get_file_name_prefix("path/to/my.file.json")
>> my.file

get_file_name_prefix("path/to/no_extension")
>> no_extension

What about the following?

import pathlib
filename = '/path/to/dir/stem.ext.tar.gz'
pathlib.Path(filename).name[:-len(''.join(pathlib.Path(filename).suffixes))]
# -> 'stem'

Or this equivalent?

pathlib.Path(filename).name[:-sum(map(len, pathlib.Path(filename).suffixes))]
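Both slicing variants return an empty string when the file has no extension at all, because name[:-0] is the same as name[:0]. A guarded version might look like this (a sketch; strip_all_suffixes is a hypothetical name, and PurePosixPath keeps it platform-independent):

```python
from pathlib import PurePosixPath

def strip_all_suffixes(filename):
    """Return the final path component with all suffixes sliced off."""
    p = PurePosixPath(filename)
    total = sum(map(len, p.suffixes))
    # Only slice when there is something to remove; [:-0] would give ''.
    return p.name[:-total] if total else p.name

print(strip_all_suffixes('/path/to/dir/stem.ext.tar.gz'))  # stem
print(strip_all_suffixes('/path/to/dir/stem'))             # stem
```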

Using pathlib.Path.stem is the right way to go, but here is an ugly solution that is considerably faster than the pathlib-based approach (roughly nine times faster in the timing below).

You have a filepath whose fields are separated by a forward slash /. Slashes cannot be present in filenames, so if you split the filepath by /, the last field is the filename.

The extension is always the last element of the list created by splitting the filename by dot ., so if you reverse the filename and split by dot once, the reverse of the second element is the file name without the extension.

name = path.split('/')[-1][::-1].split('.', 1)[1][::-1]
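The same idea can be written without reversing the string by splitting from the right with rsplit. This sketch (name_without_extension is a hypothetical name) also returns the name unchanged when there is no dot, whereas indexing [1] on the reversed split raises IndexError in that case:

```python
def name_without_extension(path):
    """Take the part after the last '/', then drop the last dot and what follows."""
    file_name = path.rsplit('/', 1)[-1]
    return file_name.rsplit('.', 1)[0]

print(name_without_extension('D:/ffmpeg/ffmpeg.exe'))  # ffmpeg
print(name_without_extension('a/b/README'))            # README
```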

Performance:

Python 3.9.10 (tags/v3.9.10:f2f3f53, Jan 17 2022, 15:14:21) [MSC v.1929 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.28.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from pathlib import Path

In [2]: file = 'D:/ffmpeg/ffmpeg.exe'

In [3]: Path(file).stem
Out[3]: 'ffmpeg'

In [4]: file.split('/')[-1][::-1].split('.', 1)[1][::-1]
Out[4]: 'ffmpeg'

In [5]: %timeit Path(file).stem
6.15 µs ± 433 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [6]: %timeit file.split('/')[-1][::-1].split('.', 1)[1][::-1]
671 ns ± 37.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
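To reproduce a comparison like this outside IPython, the standard timeit module can be used. A minimal sketch; absolute numbers will differ between machines and Python versions:

```python
import timeit
from pathlib import Path

file = 'D:/ffmpeg/ffmpeg.exe'

def via_pathlib():
    return Path(file).stem

def via_split():
    return file.split('/')[-1][::-1].split('.', 1)[1][::-1]

# Time each approach over the same number of calls.
n = 10_000
t_pathlib = timeit.timeit(via_pathlib, number=n)
t_split = timeit.timeit(via_split, number=n)

print(f"pathlib: {t_pathlib / n * 1e9:.0f} ns per call")
print(f"split:   {t_split / n * 1e9:.0f} ns per call")
```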


>>> print(os.path.splitext(os.path.basename("/path/to/file/varun.txt"))[0])
varun

Here /path/to/file/varun.txt is the path to the file and the output is varun.

# Use pathlib. The code below works with compound file types and normal ones.
import pathlib

source_file = 'spaces.tar.gz.zip.rar.7z'
source_path = pathlib.Path(source_file)
source_path.name.replace(''.join(source_path.suffixes), '')
# -> 'spaces'

Despite the many working implementations described above, I added this because it uses only pathlib and works for both compound and normal file types.

For maximum esoterica, for a fun one-liner, and to learn a little about itertools:

import itertools

def strip_suffix(filename):
    """
    >>> strip_suffix('video.mp4')
    'video'

    >>> strip_suffix('video.extra.mp4')
    'video.extra'
    """
    return ''.join((name_dot[0] + name_dot[1] for name_dot in itertools.zip_longest(filename.split('.')[0:-1], '.', fillvalue='.')))[0:-1]

Note: this is just for fun. Do not use this. Use os.path.splitext instead.

I think the easiest way is to use .split("/"):

path = "PATH/TO/FILE/file.txt"
file_only = path.split("/")[-1]
print(file_only)
# prints: file.txt

You can also do this to extract the last folder:

path = "PATH/TO/FOLDER"
folder_only = path.split("/")[-1]
print(folder_only)
# prints: FOLDER

If you want the penultimate folder, simply change [-1] to [-2].
