Sort the elements of a Python list by the date embedded in each element

Question

I have a list of file names, shown below, and I need to sort them and process them in ascending date order. The code I used works fine in the python3 command line but does not work in PySpark. The code I tried is:

from datetime import datetime
def sorted_paths(paths):
    paths.sort(key = lambda path: datetime.strptime(path.split('_')[2], '%Y%m%d'))
    return paths

It gives this error:

Error: time data daily doesn't match the format '%Y%m%d'

The input list is:

file_d_20190101_htp.csv
file_d_20180401_html.csv
file_d_20200701_ksh.csv
file_d_20190301_htp.csv

Required output:

file_d_20180401_html.csv
file_d_20190101_htp.csv
file_d_20190301_htp.csv
file_d_20200701_ksh.csv

Answer 1

You can use Python's built-in sorted function, with the parsed date as the sort key:

import datetime

arr = ['file_d_20190101_htp.csv',
'file_d_20180401_html.csv',
'file_d_20200701_ksh.csv',
'file_d_20190301_htp.csv']


print(sorted(arr, key=lambda x: datetime.datetime.strptime(x.split("_")[2], '%Y%m%d')))
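The error in the question ("time data daily doesn't match the format") suggests that in the PySpark run the third underscore-separated field was not the date, for example because the names or paths there contain extra underscores. A minimal sketch that does not depend on the field position is shown below. It assumes each file name contains exactly one standalone run of eight digits in YYYYMMDD form; the names DATE_RE and date_key are hypothetical helpers, not part of the original answer.

```python
import re
from datetime import datetime

# Matches a run of exactly eight digits (assumed to be a YYYYMMDD date).
DATE_RE = re.compile(r"(?<!\d)(\d{8})(?!\d)")

def date_key(path):
    """Return the date embedded in the file name (hypothetical helper)."""
    name = path.rsplit("/", 1)[-1]  # look only at the file name, not its directories
    match = DATE_RE.search(name)
    if match is None:
        raise ValueError(f"no YYYYMMDD date found in {path!r}")
    return datetime.strptime(match.group(1), "%Y%m%d")

paths = [
    "file_d_20190101_htp.csv",
    "file_d_20180401_html.csv",
    "file_d_20200701_ksh.csv",
    "file_d_20190301_htp.csv",
]

sorted_paths = sorted(paths, key=date_key)
print(sorted_paths)
```

Raising ValueError with the offending path makes it easier to see which name breaks the assumption than the bare strptime error does.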

Answer 2

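Just do this; it is convenient and quick. Because every name starts with the same prefix and the dates are zero-padded YYYYMMDD, a plain string sort is already a date sort: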

paths = ['file_d_20180401_html.csv',
 'file_d_20190301_htp.csv',
 'file_d_20180401_html.csv',
 'file_d_20200701_ksh.csv',
 'file_d_20190101_htp.csv',
]
paths.sort()  # in place sort
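A plain sort relies on every name sharing the same text before the date. The sketch below compares a plain sort with a date-keyed sort to show where the two agree and where they diverge. The by_date helper and the names in mixed_prefix are made up for illustration and do not come from the question.

```python
from datetime import datetime

def by_date(name):
    # Assumes the date is the third underscore-separated field, as in the question.
    return datetime.strptime(name.split("_")[2], "%Y%m%d")

# Same prefix before the date: string order and date order agree.
same_prefix = ["file_d_20190101_htp.csv", "file_d_20180401_html.csv"]

# Hypothetical names with different prefixes: string order follows the prefix.
mixed_prefix = ["file_w_20180401_html.csv", "file_d_20190101_htp.csv"]

print(sorted(same_prefix) == sorted(same_prefix, key=by_date))    # True
print(sorted(mixed_prefix) == sorted(mixed_prefix, key=by_date))  # False
```

If the prefixes can vary, sorting with an explicit date key is the safer choice.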

Answer 3

One way is to use dateutil.parser:

import dateutil.parser as dparser

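# paths is the list of file names from the question.
# fuzzy=True lets the parser skip the parts of each name that are not a date.
f = lambda x: dparser.parse(x, fuzzy=True)
sorted(paths, key=f)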

Output:

['file_d_20180401_html.csv',
 'file_d_20190101_htp.csv',
 'file_d_20190301_htp.csv',
 'file_d_20200701_ksh.csv']

solution1 0 2020-12-02 12:59:20
solution2 0 ACCEPTED 2020-12-02 13:03:33
solution3 0 2020-12-02 13:10:39
