
How to remove .txt or .docx at the end of a string in Python

I am trying to create a list of all file names from a specific directory. My code is below:

import os
#dir = input('Enter the directory: ')
dir = 'C:/Users/brian/Documents/Moeller'
r = os.listdir(dir)
for fnam in os.listdir(dir):
    print(fnam.split())
    sep = fnam.split()

My output is:

['50', 'OP', '856101P02.txt']
['856101P02', 'OP', '040.txt']
['856101P02', 'OP', '50.txt']
['OP', '040', '856101P02.txt']

How would I be able to remove anything to the right of a "." in a string, while keeping the text to the left of the period?

Basically, what you do is start splitting from the right with rsplit and then instruct it to split only once.

print("abc.d".rsplit('.', 1)[0])

prints abc
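As a minimal sketch of the rsplit approach applied to several names at once: the file names and the helper strip_last_suffix below are made up for illustration and are not part of the original answer.

```python
# Hypothetical file names, chosen only to illustrate the behaviour.
names = ["50 OP 856101P02.txt", "archive.tar.gz", "README"]


def strip_last_suffix(name):
    """Return name without whatever follows its last dot (hypothetical helper)."""
    # maxsplit=1 means only the right-most dot is used as a split point.
    # If there is no dot at all, rsplit returns [name], so [0] is still safe.
    return name.rsplit(".", 1)[0]


stripped = [strip_last_suffix(n) for n in names]
print(stripped)  # ['50 OP 856101P02', 'archive.tar', 'README']
```

One caveat with this approach: a name that starts with a dot and has no other dot, such as ".bashrc", is reduced to the empty string, because everything to the left of the only dot is empty.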

You can use os.path.splitext to split a filename into two parts: the extension on the right and everything else on the left. For example, a path like some/path/file.tar.gz will be split into some/path/file.tar and .gz :

import os
base, ext = os.path.splitext('path/to/hello.tar.gz')  # base is 'path/to/hello.tar', ext is '.gz'

If you want to get rid of the . in the ext part, simply use ext[1:] .

If the file has no extension, for example path/to/file , then the ext part will be the empty string. This is convenient: os.path.splitext always returns a tuple of two elements, so the base, ext = ... unpacking above always works.
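The cases described above can be checked with a short sketch. The sample paths are illustrative only, and the variable names paths and results are assumptions, not part of the original answer.

```python
import os

# Illustrative paths: a double extension, no extension, and a .docx file.
paths = ["path/to/hello.tar.gz", "path/to/file", "report.docx"]

# os.path.splitext always returns a (base, ext) tuple of two strings.
results = [os.path.splitext(p) for p in paths]

for base, ext in results:
    # ext[1:] drops the leading dot; on an empty ext it is simply ''.
    print(repr(base), repr(ext), repr(ext[1:]))
```

Because slicing an empty string is harmless, ext[1:] needs no special handling for files without an extension.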

I am trying to create a list of all file names from a specific directory. [...] How would I be able to remove anything to the right of a "." in a string, while keeping the text to the left of the period?

To get the base names (filenames without the extension) of a specific directory somedir , you could use this list comprehension:

basenames = [os.path.splitext(f)[0] for f in os.listdir(somedir)]
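If only certain extensions should be stripped (the question title mentions .txt and .docx), the comprehension could be extended roughly as follows. This is a sketch under assumptions: basenames_in and its extensions parameter are hypothetical names, and a temporary directory with made-up files stands in for the real folder.

```python
import os
import tempfile


def basenames_in(somedir, extensions=(".txt", ".docx")):
    """Hypothetical helper: base names of entries whose extension is listed."""
    result = []
    for f in sorted(os.listdir(somedir)):  # sorted only to make the order predictable
        base, ext = os.path.splitext(f)
        if ext.lower() in extensions:      # case-insensitive match on the extension
            result.append(base)
    return result


# Demo on a throwaway directory so the example does not depend on any real path.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ["50 OP 856101P02.txt", "notes.docx", "image.png"]:
        open(os.path.join(demo_dir, name), "w").close()
    found = basenames_in(demo_dir)

print(found)  # ['50 OP 856101P02', 'notes']
```

Entries with other extensions, such as image.png here, are skipped rather than stripped.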

From there, find the period and take everything up to that position. In simple steps ...

for fnam in os.listdir(dir):
    nam_split = fnam.split()   # "sep" is usually the separator character
    print(nam_split)
    # rsplit is a string method, so call it on fnam, not on the list nam_split
    ext_split = fnam.rsplit('.', 1)  # Split at only one dot, from the right
    file_no_ext = ext_split[0]    # The first part of the split is the file name
    print(file_no_ext)
