
How to extract files with date pattern using python

I have n files in a folder, like this:

source_dir

 abc_2017-07-01.tar   
 abc_2017-07-02.tar 
 abc_2017-07-03.tar 
 pqr_2017-07-02.tar

Let's consider a single pattern for now, 'abc'

(but I get this pattern at random from a database, so I need two filters: one for the pattern and one for the last day).

I want to extract the file for the last day, i.e. '2017-07-02'.

So far I can get all the files that share the pattern, but not exactly the last day's file.

Code

import os

pattern = 'abc'
allfiles = os.listdir(source_dir)
# keep only the file names that start with the pattern
m_files = [f for f in allfiles if f.startswith(pattern)]
print(m_files)

output:

  [ 'abc_2017-07-01.tar' ,  'abc_2017-07-02.tar' , 'abc_2017-07-03.tar' ] 

This gives me all the files matching the abc pattern, but how can I filter out only the last day's file for that pattern?

Expected:

 [ 'abc_2017-07-02.tar' ]

Thanks

Just a minor tweak to your code gets you the desired result.

import os
from datetime import datetime, timedelta

allfiles = os.listdir(source_dir)
# yesterday's date; str() of a date gives 'YYYY-MM-DD'
file_date = datetime.now() - timedelta(days=1)
pattern = 'abc_' + str(file_date.date())
m_files = [f for f in allfiles if f.startswith(pattern)]
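
Since the prefix comes from a database, the same idea can be wrapped in a small function. This is only a sketch: the name files_for_previous_day and the optional today parameter (useful for testing) are assumptions, not part of the original code.

```python
import os
from datetime import date, timedelta


def files_for_previous_day(source_dir, prefix, today=None):
    """Return the file names in source_dir that start with '<prefix>_<yesterday>'."""
    if today is None:
        today = date.today()
    previous_day = today - timedelta(days=1)
    # date.isoformat() yields 'YYYY-MM-DD', the format used in the file names
    wanted = "{}_{}".format(prefix, previous_day.isoformat())
    return sorted(f for f in os.listdir(source_dir) if f.startswith(wanted))
```

With the folder from the question and today=date(2017, 7, 3), calling files_for_previous_day(source_dir, 'abc', today=date(2017, 7, 3)) would return ['abc_2017-07-02.tar'].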

Hope this helps!

latest = max(m_files, key=lambda x: x[-14:-4])

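This finds the file name with the latest date among the file names in m_files (the slice x[-14:-4] is the 'YYYY-MM-DD' part just before '.tar').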
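
The fixed slice relies on every name ending in a ten-character date followed by '.tar'. A slightly more defensive sketch, assuming the same naming scheme, parses the date explicitly and skips names that do not fit. The helper name latest_file is hypothetical. Note that, like max above, it returns the newest date present in the list, which is not necessarily yesterday's file.

```python
import re
from datetime import datetime


def latest_file(names, prefix):
    """Return the name with the newest date for the given prefix, or None."""
    # re.escape guards against regex metacharacters in a prefix read from a database
    pattern = re.compile(r"^" + re.escape(prefix) + r"_(\d{4}-\d{2}-\d{2})\.tar$")
    dated = []
    for name in names:
        match = pattern.match(name)
        if match is None:
            continue  # different prefix or different layout
        try:
            day = datetime.strptime(match.group(1), "%Y-%m-%d").date()
        except ValueError:
            continue  # digits in the right shape but not a real date
        dated.append((day, name))
    if not dated:
        return None
    # tuples compare by date first, so max() picks the newest date
    return max(dated)[1]
```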

Use Python's regex module, re, like this:

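    import re
    import os

    files = os.listdir(source_dir)
    for file in files:
        # raw string so that \d and \. reach the regex engine unchanged
        match = re.search(r'abc_2017-07-(\d{2})\.tar', file)
        # assumes every file name matches; otherwise match is None
        day = match.group(1)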

Then you can work with day inside the loop to do whatever you want, for example build that list:

    import re
    import os

    def extract_day(name):
        # search the name passed in, not a global variable
        match = re.search(r'abc_2017-07-(\d{2})\.tar', name)
        day = match.group(1)
        return day


    files = os.listdir(source_dir)
    days = [extract_day(file) for file in files]

If the month is also variable, you can replace '07' with '\d\d' or '\d{2}'. Be careful if you have files that don't match the pattern at all: match.group() will then raise an error, because match is None. In that case use:

    def extract_day(name):
        match = re.search(r'abc_2017-07-(\d{2})\.tar', name)
        try:
            day = match.group(1)
        except AttributeError:
            # match is None when the name does not fit the pattern
            day = None
        return day
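
For the variable-month case mentioned above, one possible sketch uses named groups so that both parts can be read back by name. The names DATE_RE and extract_month_day are hypothetical, and the prefix 'abc' and year 2017 are kept fixed as in the answer.

```python
import re

# Assumed layout: 'abc_2017-MM-DD.tar', with month and day both variable
DATE_RE = re.compile(r"abc_2017-(?P<month>\d{2})-(?P<day>\d{2})\.tar")


def extract_month_day(name):
    """Return (month, day) as strings, or None if the name does not match."""
    match = DATE_RE.search(name)
    if match is None:
        return None
    return match.group("month"), match.group("day")
```

Checking for None explicitly does the same job as the try/except version, without catching unrelated errors.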

