How to get the filename without the date part in Python?

The filename can be any of the examples below:

abc_dec_2020_06_23.csv
efg_edd_20200623.csv
abc_20200623121935.csv

I need to extract only the prefix, excluding the numeric date part:

abc_dec_
efg_edd_
abc_

I am trying to archive the previous file present in the SFTP location.

Below is what I am trying to achieve:

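# Take the file name from the last component of the S3 key
fileName = self.s3_key.split('/')[-1]
# Move the existing file on the SFTP server into the archive folder
sftp_client.rename(self.sftp_path + fileName, archive_path + fileName)
# Write the new S3 object to the SFTP location
with sftp_client.open(self.sftp_path + fileName, 'wb') as f:
    s3_client.download_fileobj(self.s3_bucket, self.s3_key, f)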

With a regular expression:

r"^[a-z_]+"

Example:

import re
regex_comp = re.compile(r"^[a-z_]+")
match_str = regex_comp.match("abc_20200623121935.csv")
print(match_str.group())

Result:

abc_
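
Note that `match` returns `None` when the name does not start with a lowercase letter or an underscore, in which case `match_str.group()` raises an `AttributeError`. A minimal sketch of a guarded wrapper around the same pattern follows; the names `PREFIX_RE` and `prefix_before_digits` are hypothetical and not part of the answer above.

```python
import re

# Same pattern as above: a leading run of lowercase letters and underscores.
PREFIX_RE = re.compile(r"^[a-z_]+")

def prefix_before_digits(filename):
    """Return the leading letters/underscores, or None if there are none."""
    match = PREFIX_RE.match(filename)
    return match.group() if match else None

print(prefix_before_digits("abc_20200623121935.csv"))  # abc_
print(prefix_before_digits("2020_report.csv"))         # None
```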

If the prefix of your filenames can itself contain digits:

import re
filenames = ["efg_12_edd_20200623.csv", "abc_dec_2020_06_23.csv",
             "efg_edd_20200623.csv", "a1b2c11_20200623121935.csv"]

regex1 = re.compile(r"[0-9]{4}_[0-9]{2}_[0-9]{2}\.csv$")
regex2 = re.compile(r"[0-9]{8,14}\.csv$")

filename = ""
for filename_full in filenames:
    test = regex1.search(filename_full)
    if test is None:
        test = regex2.search(filename_full)
    if test is not None:
        filename = filename_full[:test.span()[0]]
        print(filename)
    else:
        print(filename_full, ": No match")

Result:

efg_12_edd_
abc_dec_
efg_edd_
a1b2c11_
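
The two patterns can also be folded into one alternation, which avoids the second search. This is only a sketch under the same assumptions as the answer above (a `.csv` extension and a date that is either `YYYY_MM_DD` or 8 to 14 consecutive digits); the names `DATE_SUFFIX_RE` and `strip_date_suffix` are hypothetical.

```python
import re

# Trailing date (YYYY_MM_DD, or 8 to 14 consecutive digits) followed by ".csv".
DATE_SUFFIX_RE = re.compile(
    r"(?:[0-9]{4}_[0-9]{2}_[0-9]{2}|[0-9]{8,14})\.csv$"
)

def strip_date_suffix(filename):
    """Return the part of filename before the trailing date, or None."""
    match = DATE_SUFFIX_RE.search(filename)
    return filename[:match.start()] if match else None

for name in ["efg_12_edd_20200623.csv", "abc_dec_2020_06_23.csv",
             "a1b2c11_20200623121935.csv", "no_date_here.csv"]:
    print(name, "->", strip_date_suffix(name))
```

Returning `None` for names without a recognisable date lets the caller decide whether to skip the file or keep the full name.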

You could try this:

file = 'abc_dec_2020_06_23.csv'
cleanfile = ''
# Copy characters until the first digit is reached
for let in file:
    if let.isdigit():
        break
    cleanfile += let

print(cleanfile)

Output:

abc_dec_
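
The same "stop at the first digit" idea can be written with `itertools.takewhile` from the standard library. This is a sketch only; the function name `prefix_until_digit` is hypothetical. Like the loop, it returns the whole name when the name contains no digit at all.

```python
from itertools import takewhile

def prefix_until_digit(filename):
    """Return the characters of filename that come before its first digit."""
    return "".join(takewhile(lambda ch: not ch.isdigit(), filename))

print(prefix_until_digit("abc_dec_2020_06_23.csv"))  # abc_dec_
print(prefix_until_digit("efg_edd_20200623.csv"))    # efg_edd_
```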

And if the prefix of your filenames can contain digits, you can try this:

x = 'abc_12_dec_2020_06_23.csv'
parts = x.split('_')
i = len(parts) - 1                     # index of the last chunk
stem = parts[i].replace('.csv', '')    # last chunk without the extension
if len(stem) < 8 and len(parts[i-1]) > 2:       # e.g. 202006_23.csv
    newval = '_'.join(parts[:i-1]) + '_'
elif len(stem) < 8 and len(parts[i-1]) == 2:    # e.g. 2020_06_23.csv
    newval = '_'.join(parts[:i-2]) + '_'
elif len(stem) == 4:                            # e.g. 2020_0623.csv
    newval = '_'.join(parts[:i-1]) + '_'
else:                                           # e.g. 20200623.csv
    newval = '_'.join(parts[:i]) + '_'
print(newval)

Output:

abc_12_dec_
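
To tie this back to the goal in the question (archiving the previous file), the date-free prefix can be used to pick out earlier files of the same family from a directory listing. The sketch below works on a plain list of names; the helper names `date_free_prefix` and `files_to_archive` are hypothetical, and the date formats are assumed to be the ones shown in the question. If `sftp_client` is a paramiko `SFTPClient` (an assumption, the question does not say), the listing could come from `sftp_client.listdir(path)`.

```python
import re

# Trailing date (YYYY_MM_DD, or 8 to 14 consecutive digits) followed by ".csv".
DATE_SUFFIX_RE = re.compile(
    r"(?:[0-9]{4}_[0-9]{2}_[0-9]{2}|[0-9]{8,14})\.csv$"
)

def date_free_prefix(filename):
    """Return filename without its trailing date and extension, or None."""
    match = DATE_SUFFIX_RE.search(filename)
    return filename[:match.start()] if match else None

def files_to_archive(new_file, existing_files):
    """Return the existing names that share new_file's date-free prefix."""
    prefix = date_free_prefix(new_file)
    if prefix is None:
        return []
    return [name for name in existing_files
            if name != new_file and date_free_prefix(name) == prefix]

# Hypothetical listing of the SFTP folder before the new file is uploaded.
listing = ["abc_20200622121935.csv", "abc_dec_2020_06_22.csv",
           "efg_edd_20200622.csv"]
print(files_to_archive("abc_20200623121935.csv", listing))
```

Comparing whole prefixes, rather than using `startswith`, keeps `abc_` files from also matching `abc_dec_` files.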
