
Sorting a list of strings based on the first element from split (datetime)

I have a long list of strings, each one a comma-separated line (essentially CSV files read line by line into strings, without splitting on the separator):

lines[0] = "2017-08-01 13:45:58,mytext,mytext2,mytext3,etc"
lines[1] = "2017-08-01 15:45:58,mytextx,mytext2x,mytext3x,etcx"
lines[2] = "2017-08-01 19:45:58,mytexty,mytext2y,mytext3y,etcy"
lines[3] = "..."

From this post I know that the following code should work if my lines consisted only of datetimes:

lines_sorted = sorted(lines, key=lambda x: datetime.datetime.strptime(x, '%Y-%m-%d %H:%M:%S'))

I thought I could use partition to turn every line of each file into a tuple whose first element is the datetime part:

for unsortedFile in glob('*.txt'):
    with open(unsortedFile, 'r') as file:
        lines = [line.rstrip('\n').partition(',') for line in file]
        lines_sorted = sorted(lines, key=lambda x: datetime.datetime.strptime(lines[0], '%Y-%m-%d %H:%M:%S'))

...but of course this does not work ("TypeError: list indices must be integers or slices, not str"), because lines[0] refers to the first item of the lines list rather than the first element of the current tuple. I also tried .strptime(lines[lambda][0], '%Y-%m-%d %H:%M:%S'), but that does not work either.

I know I am doing something wrong. Any help is much appreciated.

[edit] Here's the answer, from friendly comments below:

from glob import glob

for unsortedFile in glob('*.txt'):
    # read each unsorted file into a list of lines
    with open(unsortedFile, 'r', encoding="utf8") as file:
        lines = [line.rstrip('\n') for line in file]
    # sort on the text before the first comma (the timestamp)
    lines_sorted = sorted(lines, key=lambda x: x.split(',', maxsplit=1)[0])
    # overwrite the file with the sorted lines
    with open(unsortedFile, 'w', encoding="utf8") as file:
        for line in lines_sorted:
            file.write(line + '\n')

Just take the first element of the split:

import datetime

lines_sorted = sorted(
    lines,
    key=lambda x: datetime.datetime.strptime(x.split(",")[0], '%Y-%m-%d %H:%M:%S'),
)

This way only the datetime is used for sorting, while the original lines are kept intact.
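As a minimal sketch, here is that approach applied to the sample lines from the question, deliberately shuffled so the effect is visible. The helper name timestamp_of is only illustrative, and the sketch assumes every line starts with a timestamp in the format shown in the question.

```python
import datetime

# Sample rows from the question, deliberately out of order.
lines = [
    "2017-08-01 19:45:58,mytexty,mytext2y,mytext3y,etcy",
    "2017-08-01 13:45:58,mytext,mytext2,mytext3,etc",
    "2017-08-01 15:45:58,mytextx,mytext2x,mytext3x,etcx",
]

def timestamp_of(line):
    """Parse the text before the first comma as a datetime (illustrative helper)."""
    first_field = line.split(",", maxsplit=1)[0]
    return datetime.datetime.strptime(first_field, "%Y-%m-%d %H:%M:%S")

# The key is only used for comparison; the returned items are the full lines.
lines_sorted = sorted(lines, key=timestamp_of)

for line in lines_sorted:
    print(line)
```

Note that strptime raises ValueError on a line that does not start with a valid timestamp (a header row, for instance), so such lines would need to be filtered out first.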

Basically, the key argument of the sorted function must be a function that takes a list item and returns a comparable object.
sorted orders the list by the values this function returns for the items, not by the items themselves.
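One way to picture this is the sketch below, which compares sorted with a key against a manual version that pairs each item with its key, sorts the pairs, and then drops the keys again. This is only a conceptual model, not a description of how Python implements sorted internally; the sample data and the name first_field are made up for illustration.

```python
def first_field(line):
    """Return the text before the first comma (illustrative key function)."""
    return line.partition(",")[0]

lines = ["b,3", "a,9", "c,1"]

# Sorting with a key function: items are compared by first_field(item).
with_key = sorted(lines, key=first_field)

# Conceptually the same thing done by hand: attach the key (and the original
# position, to break ties) to every item, sort, then keep only the item.
decorated = [(first_field(line), index, line) for index, line in enumerate(lines)]
manual = [line for _, _, line in sorted(decorated)]

print(with_key)
print(manual)
```

In both cases the output contains the original strings unchanged; the key only decides the order.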

Here is an example that combines the suggested solutions:

lines_sorted = sorted(lines,
                      key=lambda x: x.split(',', maxsplit=1)[0]
                     )

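With this code, all items that share the same timestamp are considered equal by sorted, so they keep their original relative order (sorted is stable). Comparing the timestamps as plain strings works here because the zero-padded '%Y-%m-%d %H:%M:%S' format sorts lexicographically in chronological order.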
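A small sketch of what "considered equal" means in practice, using made-up rows where two lines share a timestamp. With the key, the two tied lines stay in their original order; without a key, the whole string is compared and the remaining fields break the tie.

```python
# Hypothetical rows: the first and last share the same timestamp.
lines = [
    "2017-08-01 13:45:58,b",
    "2017-08-01 09:00:00,c",
    "2017-08-01 13:45:58,a",
]

# Only the timestamp is compared, so the tied lines keep their input order.
lines_sorted = sorted(lines, key=lambda x: x.split(",", maxsplit=1)[0])

# No key: the full line is compared, so ",a" sorts before ",b".
fully_sorted = sorted(lines)

print(lines_sorted)
print(fully_sorted)
```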
