Python3.6: Separation of a list into sublists depending on given elements of the list

I have some data that looks like the list shown below. I have been trying to separate this list into sublists so that each sublist holds the entries for one date, where the date is the first field of each string. The example contains 5 different days, so I'd like the original list divided into 5 corresponding lists.

The problem doesn't seem too complicated, but I have been trying for a while and for some reason I can't wrap my mind around it.

I would appreciate any solutions. Of course, the original data is much larger.

listofstrings=[
"17.02.2018 14:30:24    00000000    23,7    23,9    -2,0    1,1",
"17.02.2018 15:00:21    00000000    23,7    23,8    -4,0    1,1",
"19.02.2018 18:30:24    00000000    23,6    23,7    -3,0    1,1",
"19.02.2018 19:00:21    00000000    23,6    23,6    -7,0    1,1",
"19.02.2018 19:30:22    00000000    23,5    23,5    -5,0    1,1",
"20.02.2018 05:30:21    00000000    23,5    23,8    -3,0    1,1",
"20.02.2018 06:00:21    00000000    23,5    23,8    1,0 1,1",
"20.02.2018 16:00:22    00000000    23,6    23,8    -4,0    1,1",
"21.02.2018 05:00:22    00000000    23,6    23,7    0,0 1,1",
"21.02.2018 05:30:23    00000000    23,6    23,8    -6,0    1,1",
"22.02.2018 07:30:23    00000000    23,6    23,8    -6,0    1,1",
"22.02.2018 08:00:21    00000000    23,6    23,9    -3,0    1,1",
"22.02.2018 13:30:25    00000000    23,6    23,8    -3,0    1,1"]

listoflists=[]
locallist=[]

for i in range(0, len(listofstrings)):

    current_string=listofstrings[i]
    current_date=current_string.split()[0]
    if not i==0:
        recent_string=listofstrings[i-1] 
        recent_date=recent_string.split()[0]

        if current_date==recent_date:
            locallist.append(current_string)
            locallist.append(recent_string)

    listoflists.append(locallist)
    locallist.clear()

The expected output would be something like this:

list1=["17.02.2018 14:30:24    00000000    23,7    23,9    -2,0    1,1",
       "17.02.2018 15:00:21    00000000    23,7    23,8    -4,0    1,1"]
list2=["19.02.2018 18:30:24    00000000    23,6    23,7    -3,0    1,1",
       "19.02.2018 19:00:21    00000000    23,6    23,6    -7,0    1,1",
       "19.02.2018 19:30:22    00000000    23,5    23,5    -5,0    1,1",]
....

It looks like you need itertools.groupby, which groups consecutive items that share the same key.

Demo:

from itertools import groupby

listofstrings=[
"17.02.2018 14:30:24    00000000    23,7    23,9    -2,0    1,1",
"17.02.2018 15:00:21    00000000    23,7    23,8    -4,0    1,1",
"19.02.2018 18:30:24    00000000    23,6    23,7    -3,0    1,1",
"19.02.2018 19:00:21    00000000    23,6    23,6    -7,0    1,1",
"19.02.2018 19:30:22    00000000    23,5    23,5    -5,0    1,1",
"20.02.2018 05:30:21    00000000    23,5    23,8    -3,0    1,1",
"20.02.2018 06:00:21    00000000    23,5    23,8    1,0 1,1",
"20.02.2018 16:00:22    00000000    23,6    23,8    -4,0    1,1",
"21.02.2018 05:00:22    00000000    23,6    23,7    0,0 1,1",
"21.02.2018 05:30:23    00000000    23,6    23,8    -6,0    1,1",
"22.02.2018 07:30:23    00000000    23,6    23,8    -6,0    1,1",
"22.02.2018 08:00:21    00000000    23,6    23,9    -3,0    1,1",
"22.02.2018 13:30:25    00000000    23,6    23,8    -3,0    1,1"]


listofstrings = [i.split() for i in listofstrings]
result = dict((k, list(v)) for k, v in groupby(listofstrings, lambda x: x[0]))
print(result)

Output:

{'17.02.2018': [['17.02.2018', '14:30:24', '00000000', '23,7', '23,9', '-2,0', '1,1'], ['17.02.2018', '15:00:21', '00000000', '23,7', '23,8', '-4,0', '1,1']],
 '19.02.2018': [['19.02.2018', '18:30:24', '00000000', '23,6', '23,7', '-3,0', '1,1'], ['19.02.2018', '19:00:21', '00000000', '23,6', '23,6', '-7,0', '1,1'], ['19.02.2018', '19:30:22', '00000000', '23,5', '23,5', '-5,0', '1,1']],
 '20.02.2018': [['20.02.2018', '05:30:21', '00000000', '23,5', '23,8', '-3,0', '1,1'], ['20.02.2018', '06:00:21', '00000000', '23,5', '23,8', '1,0', '1,1'], ['20.02.2018', '16:00:22', '00000000', '23,6', '23,8', '-4,0', '1,1']],
 '21.02.2018': [['21.02.2018', '05:00:22', '00000000', '23,6', '23,7', '0,0', '1,1'], ['21.02.2018', '05:30:23', '00000000', '23,6', '23,8', '-6,0', '1,1']],
 '22.02.2018': [['22.02.2018', '07:30:23', '00000000', '23,6', '23,8', '-6,0', '1,1'], ['22.02.2018', '08:00:21', '00000000', '23,6', '23,9', '-3,0', '1,1'], ['22.02.2018', '13:30:25', '00000000', '23,6', '23,8', '-3,0', '1,1']]}

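Or, to keep each line as the original string, skip the split step above and group on the first 10 characters (the date):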

result = dict((k, list(v)) for k, v in groupby(listofstrings, lambda x: x[:10]))

Output:

{'17.02.2018': ['17.02.2018 14:30:24    00000000    23,7    23,9    -2,0    1,1', '17.02.2018 15:00:21    00000000    23,7    23,8    -4,0    1,1'],
 '19.02.2018': ['19.02.2018 18:30:24    00000000    23,6    23,7    -3,0    1,1', '19.02.2018 19:00:21    00000000    23,6    23,6    -7,0    1,1', '19.02.2018 19:30:22    00000000    23,5    23,5    -5,0    1,1'],
 '20.02.2018': ['20.02.2018 05:30:21    00000000    23,5    23,8    -3,0    1,1', '20.02.2018 06:00:21    00000000    23,5    23,8    1,0 1,1', '20.02.2018 16:00:22    00000000    23,6    23,8    -4,0    1,1'],
 '21.02.2018': ['21.02.2018 05:00:22    00000000    23,6    23,7    0,0 1,1', '21.02.2018 05:30:23    00000000    23,6    23,8    -6,0    1,1'],
 '22.02.2018': ['22.02.2018 07:30:23    00000000    23,6    23,8    -6,0    1,1', '22.02.2018 08:00:21    00000000    23,6    23,9    -3,0    1,1', '22.02.2018 13:30:25    00000000    23,6    23,8    -3,0    1,1']}
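
One caveat worth noting: groupby only merges consecutive items with equal keys, so the demo relies on the lines already being in date order. The following is a minimal sketch for the case where that is not guaranteed, assuming every line starts with a DD.MM.YYYY date. The shuffled sample data and the helper name date_key are illustrative and not part of the answer above.

```python
from datetime import datetime
from itertools import groupby

# Hypothetical, deliberately unordered sample in the same format as the question.
lines = [
    "19.02.2018 18:30:24    00000000    23,6    23,7    -3,0    1,1",
    "17.02.2018 14:30:24    00000000    23,7    23,9    -2,0    1,1",
    "19.02.2018 19:00:21    00000000    23,6    23,6    -7,0    1,1",
    "17.02.2018 15:00:21    00000000    23,7    23,8    -4,0    1,1",
]


def date_key(line):
    # Parse the leading DD.MM.YYYY field so that sorting is chronological
    # rather than alphabetical on the raw string.
    return datetime.strptime(line[:10], "%d.%m.%Y").date()


# Sort first, then group: groupby only merges adjacent items with equal keys.
grouped = {
    day.strftime("%d.%m.%Y"): list(items)
    for day, items in groupby(sorted(lines, key=date_key), key=date_key)
}

for day, items in grouped.items():
    print(day, len(items))
```

Because sorted is stable, lines that share a date keep their original relative order inside each group.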

Here is a solution that requires no imported modules.

l = listofstrings  # an alias for conciseness
d = {st[:10]: [] for st in l}  # one empty list per unique date
for st in l:
    d[st[:10]].append(st)  # file each string under its date

Explanation: first, create the dictionary d, whose keys are the first 10 characters of each input string, i.e. the date, and whose values are empty lists. This exploits the fact that dict keys cannot be duplicated, so in effect you get a collection of unique dates from your input.

Then, for each input string, add the "payload" to the list under its key. Again, the key determines which list the string is appended to.

After we are done, d is your desired data structure.
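
The two passes described above can also be folded into one with dict.setdefault, which creates the empty list the first time a date is seen and still needs no import. This is a minimal sketch on a shortened sample; the names lines and by_date are illustrative and not from the answer above.

```python
# Shortened sample in the same format as the question's data.
lines = [
    "17.02.2018 14:30:24    00000000    23,7    23,9    -2,0    1,1",
    "17.02.2018 15:00:21    00000000    23,7    23,8    -4,0    1,1",
    "19.02.2018 18:30:24    00000000    23,6    23,7    -3,0    1,1",
]

by_date = {}
for line in lines:
    # setdefault returns the list stored under the date,
    # inserting an empty list first if the key is new.
    by_date.setdefault(line[:10], []).append(line)

print(list(by_date))
```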

This is very similar to ilia's solution above, but it uses no comprehension, and the final output is a list of lists instead of a dictionary.

listofstrings = [
    "17.02.2018 14:30:24    00000000    23,7    23,9    -2,0    1,1",
    "17.02.2018 15:00:21    00000000    23,7    23,8    -4,0    1,1",
    "19.02.2018 18:30:24    00000000    23,6    23,7    -3,0    1,1",
    "19.02.2018 19:00:21    00000000    23,6    23,6    -7,0    1,1",
    "19.02.2018 19:30:22    00000000    23,5    23,5    -5,0    1,1",
    "20.02.2018 05:30:21    00000000    23,5    23,8    -3,0    1,1",
    "20.02.2018 06:00:21    00000000    23,5    23,8    1,0     1,1",
    "20.02.2018 16:00:22    00000000    23,6    23,8    -4,0    1,1",
    "21.02.2018 05:00:22    00000000    23,6    23,7    0,0     1,1",
    "21.02.2018 05:30:23    00000000    23,6    23,8    -6,0    1,1",
    "22.02.2018 07:30:23    00000000    23,6    23,8    -6,0    1,1",
    "22.02.2018 08:00:21    00000000    23,6    23,9    -3,0    1,1",
    "22.02.2018 13:30:25    00000000    23,6    23,8    -3,0    1,1"]

_list = {}  # maps each date to the list of lines for that date

for d in listofstrings:
    if d[:10] not in _list:
        _list[d[:10]] = [d]
    else:
        _list[d[:10]].append(d)

# The dictionary's values are already the per-date lists.
_list_of_lists = list(_list.values())

print(*_list_of_lists, sep="\n")

output:

['17.02.2018 14:30:24    00000000    23,7    23,9    -2,0    1,1', '17.02.2018 15:00:21    00000000    23,7    23,8    -4,0    1,1']
['19.02.2018 18:30:24    00000000    23,6    23,7    -3,0    1,1', '19.02.2018 19:00:21    00000000    23,6    23,6    -7,0    1,1', '19.02.2018 19:30:22    00000000    23,5    23,5    -5,0    1,1']
['20.02.2018 05:30:21    00000000    23,5    23,8    -3,0    1,1', '20.02.2018 06:00:21    00000000    23,5    23,8    1,0     1,1', '20.02.2018 16:00:22    00000000    23,6    23,8    -4,0    1,1']
['21.02.2018 05:00:22    00000000    23,6    23,7    0,0     1,1', '21.02.2018 05:30:23    00000000    23,6    23,8    -6,0    1,1']
['22.02.2018 07:30:23    00000000    23,6    23,8    -6,0    1,1', '22.02.2018 08:00:21    00000000    23,6    23,9    -3,0    1,1', '22.02.2018 13:30:25    00000000    23,6    23,8    -3,0    1,1']
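
The same list of lists can also be built in a single pass without an intermediate dictionary, which is close to what the loop in the question attempts. This is a minimal sketch, assuming the lines are already ordered by date. The names lines, groups and previous_date are illustrative.

```python
# Shortened sample in the same format as the question's data.
lines = [
    "17.02.2018 14:30:24    00000000    23,7    23,9    -2,0    1,1",
    "17.02.2018 15:00:21    00000000    23,7    23,8    -4,0    1,1",
    "19.02.2018 18:30:24    00000000    23,6    23,7    -3,0    1,1",
    "19.02.2018 19:00:21    00000000    23,6    23,6    -7,0    1,1",
    "20.02.2018 05:30:21    00000000    23,5    23,8    -3,0    1,1",
]

groups = []
previous_date = None
for line in lines:
    current_date = line.split()[0]
    if current_date != previous_date:
        # Start a fresh list object for each new date. Reusing one list and
        # calling clear() on it would also empty the copy already stored.
        groups.append([])
        previous_date = current_date
    groups[-1].append(line)

print(*groups, sep="\n")
```

Unlike the dictionary versions, this approach would split one date into several sublists if its lines were not adjacent in the input.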


 