How to get the filename without the date part in Python?

Question

The filename can be any of the examples below:

abc_dec_2020_06_23.csv
efg_edd_20200623.csv
abc_20200623121935.csv

I need to extract only the name part, excluding the date digits:

abc_dec_
efg_edd
abc_

I am trying to archive the previous file present in the SFTP location.

Below is what I am trying to achieve:

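fileName = self.s3_key.split('/')[-1]
# Move the file currently on the SFTP server into the archive folder
sftp_client.rename(self.sftp_path + fileName, archive_path + fileName)
# Then write the new S3 object to the original SFTP location
with sftp_client.open(self.sftp_path + fileName, 'wb') as f:
    s3_client.download_fileobj(self.s3_bucket, self.s3_key, f)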

Answer 1

With a regular expression:

r"^[a-z_]+"

Example:

import re
regex_comp = re.compile(r"^[a-z_]+")
match_str = regex_comp.match("abc_20200623121935.csv")
print(match_str.group())

Result:

abc_
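
Note that `match()` returns `None` when the name does not start with a lowercase letter or underscore, so calling `.group()` directly would raise `AttributeError`. A minimal sketch with a guard follows; the helper name `leading_prefix` is hypothetical and not part of the answer:

```python
import re

# Same pattern as above: a leading run of lowercase letters and underscores.
PREFIX_RE = re.compile(r"^[a-z_]+")

def leading_prefix(filename):
    """Return the leading run of lowercase letters/underscores, or None."""
    match = PREFIX_RE.match(filename)
    return match.group() if match else None

print(leading_prefix("abc_20200623121935.csv"))  # abc_
print(leading_prefix("2020_report.csv"))         # None
```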

If the name part of your filenames can itself contain digits:

import re
filenames = ["efg_12_edd_20200623.csv", "abc_dec_2020_06_23.csv",
             "efg_edd_20200623.csv", "a1b2c11_20200623121935.csv"]

regex1 = re.compile(r"[0-9]{4}_[0-9]{2}_[0-9]{2}\.csv$")
regex2 = re.compile(r"[0-9]{8,14}\.csv$")

filename = ""
for filename_full in filenames:
    test = regex1.search(filename_full)
    if test is None:
        test = regex2.search(filename_full)
    if test is not None:
        filename = filename_full[:test.span()[0]]
        print(filename)
    else:
        print(filename_full, ": No match")

Result:

efg_12_edd_
abc_dec_
efg_edd_
a1b2c11_
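
The two patterns can also be folded into a single alternation and wrapped in a function. This is only a sketch under the same assumptions as above (the date is either `YYYY_MM_DD` or 8 to 14 consecutive digits, and the extension is `.csv`); the names `DATE_SUFFIX_RE` and `strip_date_suffix` are made up for illustration:

```python
import re

# Date suffixes seen in the question: YYYY_MM_DD or 8 to 14 consecutive digits,
# followed by ".csv" at the very end of the name.
DATE_SUFFIX_RE = re.compile(r"(?:[0-9]{4}_[0-9]{2}_[0-9]{2}|[0-9]{8,14})\.csv$")

def strip_date_suffix(filename):
    """Return the part before the date suffix, or None if no suffix is found."""
    match = DATE_SUFFIX_RE.search(filename)
    return filename[:match.start()] if match else None

for name in ["efg_12_edd_20200623.csv", "abc_dec_2020_06_23.csv", "notes.csv"]:
    print(name, "->", strip_date_suffix(name))
```

Returning `None` for names without a date suffix lets the caller decide whether to skip the file or treat it as an error.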

Answer 2

You could try this:

file = 'abc_dec_2020_06_23.csv'
cleanfile = ''
for let in file:
    # Stop at the first digit; everything before it is the prefix
    if let.isdigit():
        break
    cleanfile += let

print(cleanfile)

Output:

abc_dec_
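
The same "take everything before the first digit" idea can be written with `itertools.takewhile` from the standard library. This is just an equivalent sketch (the function name `prefix_before_first_digit` is hypothetical), and it shares the loop's limitation: a digit inside the name part cuts the result short.

```python
from itertools import takewhile

def prefix_before_first_digit(filename):
    """Return the characters of filename that come before its first digit."""
    return "".join(takewhile(lambda ch: not ch.isdigit(), filename))

print(prefix_before_first_digit("abc_dec_2020_06_23.csv"))      # abc_dec_
print(prefix_before_first_digit("a1b2c11_20200623121935.csv"))  # a
```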

And if the name part of your filenames contains digits, you can try this:

x = 'abc_12_dec_2020_06_23.csv'
parts = x.split('_')
i = len(parts) - 1                    # index of the final chunk
last = parts[i].replace('.csv', '')   # final chunk without the extension
if len(last) < 8 and len(parts[i-1]) > 2:     # e.g. 202006_23.csv
    newval = '_'.join(parts[:i-1]) + '_'
elif len(last) < 8 and len(parts[i-1]) == 2:  # e.g. 2020_06_23.csv
    newval = '_'.join(parts[:i-2]) + '_'
elif len(last) < 8 and len(last) == 4:        # e.g. 2020_0623.csv
    newval = '_'.join(parts[:i-1]) + '_'
else:                                         # e.g. 20200623.csv
    newval = '_'.join(parts[:i]) + '_'
print(newval)

Output:

abc_12_dec_
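
A shorter variant of the split-based idea is to drop the extension and then discard trailing `_`-separated chunks that consist only of digits. This is a sketch rather than an equivalent of the code above, and it assumes the name part never ends in an all-digit chunk (for example, `efg_12_20200623.csv` would lose the `12` as well). The function name `strip_trailing_digit_chunks` is made up for illustration:

```python
import os

def strip_trailing_digit_chunks(filename):
    """Drop the extension, then drop trailing '_'-separated all-digit chunks."""
    stem, _ext = os.path.splitext(filename)
    parts = stem.split("_")
    # Remove date pieces from the right until a non-numeric chunk is reached
    while parts and parts[-1].isdigit():
        parts.pop()
    return "_".join(parts) + "_" if parts else ""

print(strip_trailing_digit_chunks("abc_12_dec_2020_06_23.csv"))  # abc_12_dec_
print(strip_trailing_digit_chunks("efg_edd_20200623.csv"))       # efg_edd_
```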


Solution 1
2 2020-06-25 00:16:53

Solution 2
1 Accepted 2020-06-25 00:05:46
