
Sort the elements of a Python list by the date embedded in each element

I have a list of file names, shown below, and I need to sort them and process them in ascending date order. The code I used works fine in the python3 command line but fails in PySpark. The code I tried is:

from datetime import datetime
def sorted_paths(paths):
    paths.sort(key = lambda path: datetime.strptime(path.split('_')[2], '%Y%m%d'))
    return paths

Gives an error:

Error: time data daily doesn't match the format '%Y%m%d'

The input list is:

file_d_20190101_htp.csv
file_d_20180401_html.csv
file_d_20200701_ksh.csv
file_d_20190301_htp.csv

Required output

file_d_20180401_html.csv
file_d_20190101_htp.csv
file_d_20190301_htp.csv
file_d_20200701_ksh.csv

You can use Python's built-in sorted function, which returns a new sorted list and leaves the original unchanged:

import datetime

arr = ['file_d_20190101_htp.csv',
'file_d_20180401_html.csv',
'file_d_20200701_ksh.csv',
'file_d_20190301_htp.csv']


print(sorted(arr, key=lambda x: datetime.datetime.strptime(x.split("_")[2], '%Y%m%d')))
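The error message shows strptime receiving 'daily' rather than a date, which suggests that the paths seen in PySpark are not bare file names. A plausible cause, though the question does not confirm it, is that they are full paths whose directory names contain underscores, so split('_')[2] no longer lands on the date. Below is a minimal sketch that takes the file name part and pulls the 8-digit date out with a regular expression. The directory /mnt/src_feed_daily_in and the helper names date_key and DATE_RE are assumptions made up for illustration.

```python
import re
from datetime import datetime

# An 8-digit run delimited by underscores, e.g. "_20190101_".
DATE_RE = re.compile(r"_(\d{8})_")

def date_key(path):
    """Return the date embedded in the file name part of `path`."""
    # Drop any directory components so underscores in them are ignored.
    name = path.rsplit("/", 1)[-1]
    match = DATE_RE.search(name)
    if match is None:
        raise ValueError(f"no YYYYMMDD date found in {name!r}")
    return datetime.strptime(match.group(1), "%Y%m%d")

# Hypothetical full paths: the directory name contains underscores,
# so paths[0].split('_')[2] is 'daily', not the date.
paths = [
    "/mnt/src_feed_daily_in/file_d_20190101_htp.csv",
    "/mnt/src_feed_daily_in/file_d_20180401_html.csv",
    "/mnt/src_feed_daily_in/file_d_20200701_ksh.csv",
    "/mnt/src_feed_daily_in/file_d_20190301_htp.csv",
]

result = sorted(paths, key=date_key)
for p in result:
    print(p)
```

Raising ValueError on a name without a date is one design choice; skipping such paths would also be reasonable, depending on what the job should do with unexpected files.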

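A plain sort is enough here, because every name shares the same prefix and the date is zero-padded YYYYMMDD, so lexicographic order matches date order: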

paths = ['file_d_20180401_html.csv',
 'file_d_20190301_htp.csv',
 'file_d_20180401_html.csv',
 'file_d_20200701_ksh.csv',
 'file_d_20190101_htp.csv',
]
paths.sort()  # in place sort
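A plain string sort only matches date order while everything before the date is identical. The sketch below shows where it stops working and one way around it: sorting on the date field as a string, with no parsing at all. The 'file_w_...' name is a hypothetical entry added just to show the failure case.

```python
paths = [
    'file_d_20190101_htp.csv',
    'file_d_20180401_html.csv',
    'file_d_20200701_ksh.csv',
    'file_d_20190301_htp.csv',
]

# With a common prefix, whole-string order and date-field order agree.
plain = sorted(paths)
by_date_field = sorted(paths, key=lambda p: p.split('_')[2])
print(plain == by_date_field)

# Hypothetical name with a different prefix and the oldest date.
mixed = paths + ['file_w_20170101_htp.csv']

# Whole-string sort compares 'file_d' with 'file_w' first,
# so the oldest file ends up last.
print(sorted(mixed)[-1])

# Sorting on the zero-padded date string puts it first.
print(sorted(mixed, key=lambda p: p.split('_')[2])[0])
```

Comparing the date field as a string relies on the fixed-width, zero-padded YYYYMMDD layout; other date layouts would need parsing with strptime instead.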

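Another way is dateutil.parser (from the third-party python-dateutil package), using fuzzy parsing to pick the date out of each name: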

import dateutil.parser as dparser

f = lambda x: dparser.parse(x, fuzzy=True)
sorted(paths, key=f)

Output:

['file_d_20180401_html.csv',
 'file_d_20190101_htp.csv',
 'file_d_20190301_htp.csv',
 'file_d_20200701_ksh.csv']


 