Getting file names without file extensions with glob

Question

I'm searching for .txt files only:

from glob import glob
result = glob('*.txt')

>>> result
['text1.txt','text2.txt','text3.txt']

But I'd like result without the file extensions:

>>> result
['text1','text2','text3']

Is there a regex pattern that I can use with glob to exclude the file extensions from the output, or do I have to use a list comprehension on result?

Answer 1

There is no way to do that with glob(). You need to take the list it returns and build a new one that holds the names without the extension:

import os
from glob import glob

[os.path.splitext(val)[0] for val in glob('*.txt')]

os.path.splitext(val) splits each file name into a (name, extension) pair. The [0] selects the name part.
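To make the behaviour of os.path.splitext more concrete, here is a small sketch. The file names are made up for illustration; note that only the last extension is removed, any directory part is kept, and a leading dot is not treated as an extension.

```python
import os

# Hypothetical file names, chosen only to illustrate edge cases.
names = ['text1.txt', 'notes.v2.txt', 'data/report.txt', '.hidden']

# splitext returns a (root, ext) tuple; [0] keeps the root.
stems = [os.path.splitext(name)[0] for name in names]
print(stems)  # ['text1', 'notes.v2', 'data/report', '.hidden']
```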

Answer 2

Since you're trying to split off a filename extension, not split an arbitrary string, it makes more sense to use os.path.splitext (or the pathlib module). While it's true that it makes no practical difference on the only platforms that currently matter (Windows and *nix), it's still conceptually clearer what you're doing. (And if you later start using path-like objects instead of strings, it will continue to work unchanged, to boot.)

So:

paths = [os.path.splitext(path)[0] for path in paths]
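For the pathlib route mentioned above, a minimal sketch could look like the following. The helper name txt_stems and its default directory are assumptions made for this example, not something from the question; Path.glob and the stem attribute are standard pathlib features.

```python
from pathlib import Path

def txt_stems(directory='.'):
    """Return the names of the .txt files in `directory`, without the extension."""
    # Path.glob yields Path objects; .stem is the final path component
    # with its last suffix removed.
    return sorted(path.stem for path in Path(directory).glob('*.txt'))
```

Calling txt_stems() in the directory from the question should give the bare names, sorted alphabetically.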

Meanwhile, if this really offends you for some reason, what glob does under the covers is just call fnmatch to turn your glob expression into a regular expression and then apply that to all of the filenames. So you can replace it by writing the regex yourself and using a capture group:

import os
import re
rtxt = re.compile(r'(.*)\.txt')
# fullmatch anchors at both ends, so a name like 'notes.txt.bak' is skipped
files = (rtxt.fullmatch(file) for file in os.listdir(dirpath))
files = [match.group(1) for match in files if match]

This way, you're not doing a listcomp on top of the one that's already in glob; you're doing one instead of the one that's already in glob. I'm not sure if that's a useful win or not, but since you seem to be interested in eliminating a listcomp…
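As a rough illustration of the glob-to-regex step, fnmatch.translate is the public function that converts a shell-style pattern into a regular expression string. Whether glob calls exactly this function internally in every Python version is an assumption here, and the file names below are made up.

```python
import fnmatch
import re

# Convert the shell-style pattern into a compiled regular expression.
regex = re.compile(fnmatch.translate('*.txt'))

# Hypothetical directory listing.
names = ['text1.txt', 'text2.txt.bak', 'image.png']

# The translated pattern is anchored at the end, so only real .txt names match.
matched = [name for name in names if regex.match(name)]
print(matched)
```

The translated pattern has no capture group, which is why the hand-written regex is needed if you want the stem straight out of the match.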

Answer 3

Use index slicing:

result = [i[:-4] for i in result]
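The slice hard-codes the extension length (4 characters for '.txt'). As an alternative sketch, assuming Python 3.9 or newer, str.removesuffix strips the extension only when it is actually present:

```python
# Sample data taken from the question.
result = ['text1.txt', 'text2.txt', 'text3.txt']

# removesuffix (Python 3.9+) leaves a string unchanged if the suffix is absent.
stems = [name.removesuffix('.txt') for name in result]
print(stems)  # ['text1', 'text2', 'text3']
```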

Answer 4

Another way, using rsplit:

>>> result = ['text1.txt','text2.txt.txt','text3.txt']
>>> [x.rsplit('.txt', 1)[0] for x in result]
['text1', 'text2.txt', 'text3']

You could do it as a list comprehension directly on the glob() result:

result = [x.rsplit(".txt", 1)[0] for x in glob('*.txt')]

Answer 5

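This glob selects only files that have no extension: **/*/!(*.*)

(Note that !(...) is extended shell glob syntax, as in bash with extglob enabled; Python's glob module does not support it.)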

Answer 6

Use str.split:

>>> result = [r.split('.')[0] for r in glob('*.txt')]
>>> result
['text1', 'text2', 'text3']
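One caveat worth checking before relying on split('.'): it cuts at the first dot, not the last one. The sketch below compares it with os.path.splitext on two made-up names; it is only meant to show where the two approaches diverge.

```python
import os

# Hypothetical names: one with two dots, one with a leading './'.
names = ['report.final.txt', './text1.txt']

by_split = [name.split('.')[0] for name in names]
by_splitext = [os.path.splitext(name)[0] for name in names]

print(by_split)     # ['report', '']
print(by_splitext)  # ['report.final', './text1']
```

For plain names with a single dot, as in the question, both give the same result.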

6 answers

solution1
3 ACCEPTED 2018-06-18 16:41:48

solution2
2 2018-06-18 16:46:56

solution3
1 2018-06-18 16:34:31

solution4
1 2018-06-18 16:38:33

solution5
1 2020-01-13 18:11:03

solution6
0 2018-06-18 16:37:17
