
Last 12 Hours Files Using Regex

I am trying to get all files from the last 12 hours. The file names have the following format: %Y-%m-%d %H

Here is the Python script I am trying to get working:

import glob
from datetime import datetime, timedelta

last12HourDateTime = datetime.today() - timedelta(hours=12)
allowedFormat = last12HourDateTime.strftime('%Y-%m-%d %H')

for filePath in glob.glob(allowedFormat):
    ...  # loop body omitted

I know there are dozens of ways of doing this, but I would like to know whether it is possible this way.

(EDIT) I was able to get it working with:

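import glob
from datetime import datetime, timedelta

now = datetime.today()  # read the clock once so all offsets share one reference
allowedFormats = []
for i in range(12):  # offsets 0..11; range(1, 12) covered only 11 hours
    last12HourDateTime = now - timedelta(hours=i)
    allowedFormats.append(last12HourDateTime.strftime('%Y-%m-%d-%H.log'))

for allowedFormat in allowedFormats:
    for filePath in glob.glob(allowedFormat):
        ...  # loop body omitted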

I am still looking for a better, more efficient solution, though.

Technically: yes.

Worth it: No.

Reason: Regex has no understanding of numeric value and thus can't do arithmetical comparisons (x > z - 12).

In other words: you'd have to generate a custom regex for every single use, so you'd be far better off with a real date format parser and a date class that can do what you're after. In regex you'd have to produce huge numbers of chained (...|...) alternation groups and would pretty much end up with basic batched string comparisons (which would technically still be valid regexes, yet serve none of regex's higher purposes).
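To make the "real date parser" route concrete, here is a minimal sketch. It assumes the '%Y-%m-%d-%H.log' name layout from the question's edit; the function name files_from_last_hours and its parameters are made up for illustration.

```python
from datetime import datetime, timedelta
from pathlib import Path

NAME_FORMAT = "%Y-%m-%d-%H.log"  # assumed layout, taken from the question's edit


def files_from_last_hours(names, now, hours=12, name_format=NAME_FORMAT):
    """Return the names whose encoded hour falls within the last `hours` hours."""
    cutoff = now - timedelta(hours=hours)
    recent = []
    for name in names:
        try:
            # Parse only the final path component, so directories are allowed.
            stamp = datetime.strptime(Path(name).name, name_format)
        except ValueError:
            continue  # not a date-named file, skip it
        if cutoff < stamp <= now:
            recent.append(name)
    return recent
```

You would feed it something like glob.glob('*.log') and datetime.today(). The comparison is ordinary date arithmetic, so month and year boundaries need no special handling.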


Most text-related questions of the pattern "Is x possible in regex?" can technically be answered with YES (see above).

Thus I'd much rather ask: "Should I (try to) do x with regex?" or "Is regex the right tool for x?".

If your only tool is a hammer…


If you wanted to at least narrow down the list of potential matches (before doing any real date arithmetic), you'd have to generate a regex based on these rules (off the top of my head, no guarantees):

(I will use h for current hour, d for current day, m for current month and y for current year.)

if (h < 12)
    %dh = '(?:yesterday (?:1[2-9]|2[0-3])|today [0-9]{1,2})'
else
    %dh = 'today [0-9]{1,2}'

if (d == 1)
    %m = '(?:lastmonth|thismonth)'
else
    %m = 'thismonth'

if (m == 1)
    %y = '(?:lastyear|thisyear)'
else
    %y = 'thisyear'

where you'd replace yesterday, thisyear, etc. with their respective numeric values.

Then form a regex of the pattern %y-%m-%dh, where you'd replace %y, %m and %dh with the values determined above.

Again: date arithmetic is tricky, thus my algo above probably contains mistakes.
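If a regex is wanted anyway, one way to avoid hand-writing the calendar rules is to let datetime do the arithmetic and only emit the resulting alternation. This is a different technique from the rule-based pattern described above, and it ends up as exactly the kind of batched string comparison mentioned earlier. The helper name narrowing_regex is made up for illustration; the '%Y-%m-%d %H' format is the one from the question.

```python
import re
from datetime import datetime, timedelta


def narrowing_regex(now, hours=12, stamp_format="%Y-%m-%d %H"):
    """Compile a regex matching each hourly stamp in the last `hours` hours."""
    # One literal per hour: the current hour plus the (hours - 1) before it.
    stamps = [
        (now - timedelta(hours=offset)).strftime(stamp_format)
        for offset in range(hours)
    ]
    # re.escape keeps each stamp a plain literal inside the alternation.
    return re.compile("(?:" + "|".join(re.escape(s) for s in stamps) + ")")
```

For example, narrowing_regex(datetime.today()).fullmatch(name) tells you whether name is one of the twelve stamps. The pattern has to be regenerated for every search, which is the "custom regex for every single use" problem again.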


I don't know the wider context of your question, so I can only guess. Based on the information you gave (and assuming that the set of files doesn't change completely between searches, thus allowing some degree of caching), I'd probably go with something like this:

Enumerate your files and convert their date-formatted file names into UNIX timestamps, adding each of them to a list. (Possibly better: create container objects holding timestamp and file path; otherwise you'd have to retrieve the file path by converting the timestamp back into a date-formatted string, which also requires the directory hierarchy to be flat.) Sort the list. Then get the range of matching files using a modified binary search: instead of searching for an exact value match, you search for the boundaries of a range. I have no sample code right now, but it's not that difficult.
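A rough sketch of that idea, under a few assumptions: file names follow the '%Y-%m-%d-%H.log' layout from the question's edit, the class name HourlyFileIndex is invented for illustration, and datetime objects stand in for UNIX timestamps since they sort the same way. The standard bisect module does the range search.

```python
import bisect
import os
from datetime import datetime, timedelta

NAME_FORMAT = "%Y-%m-%d-%H.log"  # assumed file-name layout


class HourlyFileIndex:
    """Sorted cache of (timestamp, path) pairs, stored as two parallel lists."""

    def __init__(self, paths, name_format=NAME_FORMAT):
        pairs = []
        for path in paths:
            try:
                stamp = datetime.strptime(os.path.basename(path), name_format)
            except ValueError:
                continue  # ignore files that are not date-named
            pairs.append((stamp, path))
        pairs.sort()
        self.stamps = [stamp for stamp, _ in pairs]
        self.paths = [path for _, path in pairs]

    def last_hours(self, now, hours=12):
        """Return paths with cutoff < stamp <= now using two binary searches."""
        lo = bisect.bisect_right(self.stamps, now - timedelta(hours=hours))
        hi = bisect.bisect_right(self.stamps, now)
        return self.paths[lo:hi]
```

Keeping the paths next to the timestamps means nothing has to be converted back into a file name, so the directory hierarchy does not need to be flat.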

Now, assuming that files get added or deleted from time to time, you'd have to be able to monitor those file system events and update your list accordingly.

Creating the list the first time takes O(n) (+ O(n log n) for the sorting), but if you can update your cached timestamp list smartly you should gain quite a bit of performance.
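As an illustration of such an update, the sketch below keeps two parallel sorted lists (timestamps and paths) in order when a single file appears or disappears. The function names are made up, and how the add/delete events are detected is left out entirely. Finding the position is a binary search, although the list insert or delete itself still shifts elements, so each update is O(n) in the worst case rather than a full re-sort.

```python
import bisect
from datetime import datetime


def add_entry(stamps, paths, stamp, path):
    """Insert one (stamp, path) pair, keeping both lists sorted by stamp."""
    pos = bisect.bisect_right(stamps, stamp)
    stamps.insert(pos, stamp)
    paths.insert(pos, path)


def remove_entry(stamps, paths, stamp, path):
    """Remove one (stamp, path) pair if present; return whether it was found."""
    pos = bisect.bisect_left(stamps, stamp)
    # Several files may share a stamp, so scan the run of equal stamps.
    while pos < len(stamps) and stamps[pos] == stamp:
        if paths[pos] == path:
            del stamps[pos]
            del paths[pos]
            return True
        pos += 1
    return False
```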


 