How to sort dates in Python?

I am trying to sort the dates in my list, but each date comes after a string element, [EQUIP-X]. First I used a regex to extract only the date and tried to sort by it. It doesn't work!

I thought about splitting each string into [EQUIP-X] and the date.

import os
import re

# path and dateList are not shown in the question
files = [filename for root, dirs, files in os.walk(path) for filename in files for date in dateList if filename.endswith(date + ".log")]
for item in files:
    # Escape the bracket and the dots so they match literally
    reg = re.search(r"(.+\])\.(\d{2}\.\d{2}\.\d{4})", item)
    equip = reg.group(1)
    data = reg.group(2)
    namefile = data + '.' + equip
    print(item)
  • group(1) - [EQUIP-X]
  • group(2) - Date

Sample String:

[EQUIP-4].02.05.2019.log
[EQUIP-2].01.05.2019.log
[EQUIP-1].30.04.2019.log
[EQUIP-3].29.04.2019.log
[EQUIP-1].01.05.2019.log
[EQUIP-5].30.04.2019.log
[EQUIP-1].29.04.2019.log
[EQUIP-5].30.04.2019.log
[EQUIP-3].30.04.2019.log
[EQUIP-1].29.04.2019.log
[EQUIP-2].02.05.2019.log

Following this tutorial, I get an error that a 'str' object has no attribute 'sort', since I am manipulating a 'str' rather than a date. What is a better way to do this? My idea was to split each name, handle the date, and then join everything back together.

You can sort on the date at the end of each string: ignore the last 4 characters (the file extension) and parse the 10 characters before them as a date. Since the date format is zero-padded, the date is always 10 characters long, hence the slice starting at -14 (10 for the date + 4 for the extension).

from datetime import datetime

files = ['[EQUIP-4].02.05.2019.log',
'[EQUIP-2].01.05.2019.log',
'[EQUIP-1].30.04.2019.log',
'[EQUIP-3].29.04.2019.log',
'[EQUIP-1].01.05.2019.log',
'[EQUIP-5].30.04.2019.log',
'[EQUIP-1].29.04.2019.log',
'[EQUIP-5].30.04.2019.log',
'[EQUIP-3].30.04.2019.log',
'[EQUIP-1].29.04.2019.log',
'[EQUIP-2].02.05.2019.log']

files.sort(key=lambda x: datetime.strptime(x[-14:-4], '%d.%m.%Y'))
print(files)
['[EQUIP-3].29.04.2019.log',
'[EQUIP-1].29.04.2019.log',
'[EQUIP-1].29.04.2019.log',
'[EQUIP-1].30.04.2019.log',
'[EQUIP-5].30.04.2019.log',
'[EQUIP-5].30.04.2019.log',
'[EQUIP-3].30.04.2019.log',
'[EQUIP-2].01.05.2019.log',
'[EQUIP-1].01.05.2019.log',
'[EQUIP-4].02.05.2019.log',
'[EQUIP-2].02.05.2019.log']
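
To make the slice concrete, here is a small sketch of what `x[-14:-4]` isolates for one filename, and why the text has to be parsed rather than compared as a plain string. The variable names are only illustrative, and it assumes every name ends in a zero-padded `DD.MM.YYYY` date followed by `.log`.

```python
from datetime import datetime

name = '[EQUIP-4].02.05.2019.log'

# The last 4 characters are '.log'; the 10 before them are the date.
date_part = name[-14:-4]
parsed = datetime.strptime(date_part, '%d.%m.%Y')
print(date_part)      # 02.05.2019
print(parsed.date())  # 2019-05-02

# Compared as text, the day is looked at first, so 1 May sorts before 30 April.
as_text = '01.05.2019' < '30.04.2019'
as_dates = (datetime.strptime('01.05.2019', '%d.%m.%Y')
            < datetime.strptime('30.04.2019', '%d.%m.%Y'))
print(as_text, as_dates)  # True False
```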

Python's sort functions (list.sort() and sorted()) take a key parameter: a function that is called on each element to compute the value it is sorted by. The elements themselves are not modified.

This example extracts the number from the end of the string and sorts by it.

a = ['hello 123', 'pumpkin 542', 'muffin 342']

def get_important_part(string):
    return int(string.split()[1])

print(sorted(a, key=get_important_part))

returns

['hello 123', 'muffin 342', 'pumpkin 542']
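
The same idea carries over to the filenames in the question. As a sketch, the key can also return a tuple, so that files sharing a date get a predictable order among themselves. The helper name `date_then_name` is made up for this example, and it assumes each name ends in `DD.MM.YYYY.log`.

```python
from datetime import datetime

def date_then_name(filename):
    # Hypothetical helper: sort by the parsed date first, then by the full name.
    date = datetime.strptime(filename[-14:-4], '%d.%m.%Y')
    return (date, filename)

files = [
    '[EQUIP-4].02.05.2019.log',
    '[EQUIP-2].01.05.2019.log',
    '[EQUIP-1].30.04.2019.log',
    '[EQUIP-3].29.04.2019.log',
    '[EQUIP-1].01.05.2019.log',
]

# sorted() returns a new list and leaves the original untouched.
ordered = sorted(files, key=date_then_name)
print('\n'.join(ordered))
```

Tuples compare element by element, so the filename only matters when two dates are equal.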

Why not work with strptime and strftime?

from datetime import datetime

dates = ['02.05.2019', '20.05.2019', '11.05.2019', '30.05.2019', '08.05.2019', '09.05.2019']
# Parse the strings into datetime objects, sort them, then format them back
dates_obj = [datetime.strptime(x, '%d.%m.%Y') for x in dates]
dates_sorted = sorted(dates_obj)
dates_sorted = [x.strftime('%d.%m.%Y') for x in dates_sorted]
print(dates_sorted)

['02.05.2019', '08.05.2019', '09.05.2019', '11.05.2019', '20.05.2019', '30.05.2019']
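
If the goal is only to order the strings, the round trip through datetime objects can be skipped. The sketch below reorders each zero-padded `DD.MM.YYYY` string into a (year, month, day) tuple and sorts on that. The function name `ymd_key` is made up here, and the approach relies on every field being zero-padded; it does not check that the dates are valid.

```python
def ymd_key(date_str):
    # 'DD.MM.YYYY' -> ('YYYY', 'MM', 'DD'); zero-padded fields compare correctly as text.
    day, month, year = date_str.split('.')
    return (year, month, day)

dates = ['02.05.2019', '30.04.2019', '01.05.2019', '29.04.2019']
dates_sorted = sorted(dates, key=ymd_key)
print(dates_sorted)
# ['29.04.2019', '30.04.2019', '01.05.2019', '02.05.2019']
```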

You can convert your list into a pandas DataFrame and do the sorting there. Parse the date out of each filename, sort the parsed values, convert the resulting index to a list, and then display the rows in that order with iloc.

import pandas as pd
df = pd.DataFrame([('[EQUIP-4].02.05.2019.log')
,('[EQUIP-2].01.05.2019.log')
,('[EQUIP-1].30.04.2019.log')
,('[EQUIP-3].29.04.2019.log')
,('[EQUIP-1].01.05.2019.log')
,('[EQUIP-5].30.04.2019.log')
,('[EQUIP-1].29.04.2019.log')
,('[EQUIP-5].30.04.2019.log')
,('[EQUIP-3].30.04.2019.log')
,('[EQUIP-1].29.04.2019.log')
,('[EQUIP-2].02.05.2019.log')], columns = ['file'])

# An explicit format is needed: without it, 01.05.2019 can be read month-first (5 January).
# mergesort is stable, so files with the same date keep their original order.
df.iloc[df['file']
        .map(lambda x: pd.to_datetime(x[-14:-4], format='%d.%m.%Y'))
        .sort_values(kind='mergesort')
        .index
        .tolist()]

Result:

                        file
3   [EQUIP-3].29.04.2019.log
6   [EQUIP-1].29.04.2019.log
9   [EQUIP-1].29.04.2019.log
2   [EQUIP-1].30.04.2019.log
5   [EQUIP-5].30.04.2019.log
7   [EQUIP-5].30.04.2019.log
8   [EQUIP-3].30.04.2019.log
1   [EQUIP-2].01.05.2019.log
4   [EQUIP-1].01.05.2019.log
0   [EQUIP-4].02.05.2019.log
10  [EQUIP-2].02.05.2019.log

Combining @ddg's and @Sayse's suggestion, you can try:

import re
from datetime import datetime

files = ["[EQUIP-4].02.05.2019.log", ...]

files.sort(key = lambda item: datetime.strptime(re.search(r"(?<=\.)(\d{2}\.\d{2}\.\d{4})(?=\.)", item).group(0), '%d.%m.%Y'), reverse=False)

or in a more readable way:

def getSortValue(item):
  # Match a DD.MM.YYYY date that sits between two literal dots
  reg = re.search(r"(?<=\.)(\d{2}\.\d{2}\.\d{4})(?=\.)", item)
  data = reg.group(0)
  return datetime.strptime(data, '%d.%m.%Y')

files.sort(key = getSortValue, reverse = False)

Output:

print('\n'.join(files))

[EQUIP-3].29.04.2019.log
[EQUIP-1].29.04.2019.log
[EQUIP-1].29.04.2019.log
[EQUIP-1].30.04.2019.log
[EQUIP-5].30.04.2019.log
[EQUIP-5].30.04.2019.log
[EQUIP-3].30.04.2019.log
[EQUIP-2].01.05.2019.log
[EQUIP-1].01.05.2019.log
[EQUIP-4].02.05.2019.log
[EQUIP-2].02.05.2019.log
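
One caveat with a regex-based key: `re.search` returns `None` when a name has no date, and calling `.group()` on it raises `AttributeError`. The sketch below is one possible way to guard against that, by sending such names to the end of the list. The names `DATE_RE` and `safe_sort_value` and the extra file `notes.txt` are made up for the example; whether stray files should sort last or be skipped altogether is a choice that depends on your directory.

```python
import re
from datetime import datetime

# Assumed pattern: a DD.MM.YYYY date immediately before the '.log' extension.
DATE_RE = re.compile(r"(\d{2}\.\d{2}\.\d{4})\.log$")

def safe_sort_value(item):
    match = DATE_RE.search(item)
    if match is None:
        return datetime.max  # no date in the name: sort it last
    try:
        return datetime.strptime(match.group(1), '%d.%m.%Y')
    except ValueError:
        return datetime.max  # digits that are not a real date, e.g. 99.99.2019

files = ['[EQUIP-2].02.05.2019.log', 'notes.txt', '[EQUIP-3].29.04.2019.log']
files.sort(key=safe_sort_value)
print('\n'.join(files))
```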

You can sort the filenames with the list's built-in sort() method, like this:

from datetime import datetime
import os  # Even though not used in example code.
from pprint import pprint
import re

#files = [filename for root, dirs, files in os.walk(path) for filename in files for date in dateList if filename.endswith(date+".log")]
files = [
    '[EQUIP-4].02.05.2019.log',
    '[EQUIP-2].01.05.2019.log',
    '[EQUIP-1].30.04.2019.log',
    '[EQUIP-3].29.04.2019.log',
    '[EQUIP-1].01.05.2019.log',
    '[EQUIP-5].30.04.2019.log',
    '[EQUIP-1].29.04.2019.log',
    '[EQUIP-5].30.04.2019.log',
    '[EQUIP-3].30.04.2019.log',
    '[EQUIP-1].29.04.2019.log',
    '[EQUIP-2].02.05.2019.log',
]

def get_date(filename):
    # Escape the bracket and the dots so they match literally
    match = re.search(r".+\]\.(\d{2}\.\d{2}\.\d{4})", filename)
    date_str = match.group(1)
    return datetime.strptime(date_str, '%d.%m.%Y')

files.sort(key=get_date)

pprint(files)

Output:

['[EQUIP-3].29.04.2019.log',
 '[EQUIP-1].29.04.2019.log',
 '[EQUIP-1].29.04.2019.log',
 '[EQUIP-1].30.04.2019.log',
 '[EQUIP-5].30.04.2019.log',
 '[EQUIP-5].30.04.2019.log',
 '[EQUIP-3].30.04.2019.log',
 '[EQUIP-2].01.05.2019.log',
 '[EQUIP-1].01.05.2019.log',
 '[EQUIP-4].02.05.2019.log',
 '[EQUIP-2].02.05.2019.log']
