Python3.6: Separation of a list into sublists depending on given elements of the list
I have some data that looks like the list shown below. For quite a while now I have been trying to separate this list into sublists, such that every sublist represents one date given in the first column. There are 5 different days in the example, and I'd like to have the original list divided into 5 respective lists.
The problem doesn't seem too complicated, but I have tried for a while now and for some reason I can't wrap my mind around it. I would appreciate any solutions from you guys. Of course the original data is way larger.
listofstrings=[
"17.02.2018 14:30:24 00000000 23,7 23,9 -2,0 1,1",
"17.02.2018 15:00:21 00000000 23,7 23,8 -4,0 1,1",
"19.02.2018 18:30:24 00000000 23,6 23,7 -3,0 1,1",
"19.02.2018 19:00:21 00000000 23,6 23,6 -7,0 1,1",
"19.02.2018 19:30:22 00000000 23,5 23,5 -5,0 1,1",
"20.02.2018 05:30:21 00000000 23,5 23,8 -3,0 1,1",
"20.02.2018 06:00:21 00000000 23,5 23,8 1,0 1,1",
"20.02.2018 16:00:22 00000000 23,6 23,8 -4,0 1,1",
"21.02.2018 05:00:22 00000000 23,6 23,7 0,0 1,1",
"21.02.2018 05:30:23 00000000 23,6 23,8 -6,0 1,1",
"22.02.2018 07:30:23 00000000 23,6 23,8 -6,0 1,1",
"22.02.2018 08:00:21 00000000 23,6 23,9 -3,0 1,1",
"22.02.2018 13:30:25 00000000 23,6 23,8 -3,0 1,1"]
listoflists=[]
locallist=[]
for i in range(0, len(listofstrings)):
    current_string=listofstrings[i]
    current_date=current_string.split()[0]
    if not i==0:
        recent_string=listofstrings[i-1]
        recent_date=recent_string.split()[0]
        if current_date==recent_date:
            locallist.append(current_string)
            locallist.append(recent_string)
    listoflists.append(locallist)
    locallist.clear()
The expected output would be something like this:
list1=["17.02.2018 14:30:24 00000000 23,7 23,9 -2,0 1,1",
"17.02.2018 15:00:21 00000000 23,7 23,8 -4,0 1,1"]
list2=["19.02.2018 18:30:24 00000000 23,6 23,7 -3,0 1,1",
"19.02.2018 19:00:21 00000000 23,6 23,6 -7,0 1,1",
"19.02.2018 19:30:22 00000000 23,5 23,5 -5,0 1,1",]
....
Looks like you need itertools.groupby.
Demo:
from itertools import groupby
listofstrings=[
"17.02.2018 14:30:24 00000000 23,7 23,9 -2,0 1,1",
"17.02.2018 15:00:21 00000000 23,7 23,8 -4,0 1,1",
"19.02.2018 18:30:24 00000000 23,6 23,7 -3,0 1,1",
"19.02.2018 19:00:21 00000000 23,6 23,6 -7,0 1,1",
"19.02.2018 19:30:22 00000000 23,5 23,5 -5,0 1,1",
"20.02.2018 05:30:21 00000000 23,5 23,8 -3,0 1,1",
"20.02.2018 06:00:21 00000000 23,5 23,8 1,0 1,1",
"20.02.2018 16:00:22 00000000 23,6 23,8 -4,0 1,1",
"21.02.2018 05:00:22 00000000 23,6 23,7 0,0 1,1",
"21.02.2018 05:30:23 00000000 23,6 23,8 -6,0 1,1",
"22.02.2018 07:30:23 00000000 23,6 23,8 -6,0 1,1",
"22.02.2018 08:00:21 00000000 23,6 23,9 -3,0 1,1",
"22.02.2018 13:30:25 00000000 23,6 23,8 -3,0 1,1"]
listofstrings = [i.split() for i in listofstrings]
result = dict((k, list(v)) for k, v in groupby(listofstrings, lambda x: x[0]))
print(result)
Output:
{'17.02.2018': [['17.02.2018', '14:30:24', '00000000', '23,7', '23,9', '-2,0', '1,1'], ['17.02.2018', '15:00:21', '00000000', '23,7', '23,8', '-4,0', '1,1']],
'19.02.2018': [['19.02.2018', '18:30:24', '00000000', '23,6', '23,7', '-3,0', '1,1'], ['19.02.2018', '19:00:21', '00000000', '23,6', '23,6', '-7,0', '1,1'], ['19.02.2018', '19:30:22', '00000000', '23,5', '23,5', '-5,0', '1,1']],
'22.02.2018': [['22.02.2018', '07:30:23', '00000000', '23,6', '23,8', '-6,0', '1,1'], ['22.02.2018', '08:00:21', '00000000', '23,6', '23,9', '-3,0', '1,1'], ['22.02.2018', '13:30:25', '00000000', '23,6', '23,8', '-3,0', '1,1']],
'21.02.2018': [['21.02.2018', '05:00:22', '00000000', '23,6', '23,7', '0,0', '1,1'], ['21.02.2018', '05:30:23', '00000000', '23,6', '23,8', '-6,0', '1,1']],
'20.02.2018': [['20.02.2018', '05:30:21', '00000000', '23,5', '23,8', '-3,0', '1,1'], ['20.02.2018', '06:00:21', '00000000', '23,5', '23,8', '1,0', '1,1'], ['20.02.2018', '16:00:22', '00000000', '23,6', '23,8', '-4,0', '1,1']]}
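Or, grouping the original strings directly by their first 10 characters (this assumes listofstrings still holds the strings, i.e. the split step above was not run):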
result = dict((k, list(v)) for k, v in groupby(listofstrings, lambda x: x[:10]))
Output:
{'17.02.2018': ['17.02.2018 14:30:24 00000000 23,7 23,9 -2,0 1,1', '17.02.2018 15:00:21 00000000 23,7 23,8 -4,0 1,1'],
'19.02.2018': ['19.02.2018 18:30:24 00000000 23,6 23,7 -3,0 1,1', '19.02.2018 19:00:21 00000000 23,6 23,6 -7,0 1,1', '19.02.2018 19:30:22 00000000 23,5 23,5 -5,0 1,1'],
'22.02.2018': ['22.02.2018 07:30:23 00000000 23,6 23,8 -6,0 1,1', '22.02.2018 08:00:21 00000000 23,6 23,9 -3,0 1,1', '22.02.2018 13:30:25 00000000 23,6 23,8 -3,0 1,1'],
'21.02.2018': ['21.02.2018 05:00:22 00000000 23,6 23,7 0,0 1,1', '21.02.2018 05:30:23 00000000 23,6 23,8 -6,0 1,1'],
'20.02.2018': ['20.02.2018 05:30:21 00000000 23,5 23,8 -3,0 1,1', '20.02.2018 06:00:21 00000000 23,5 23,8 1,0 1,1', '20.02.2018 16:00:22 00000000 23,6 23,8 -4,0 1,1']}
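One caveat worth keeping in mind: itertools.groupby only merges adjacent items that share a key, so the demo above relies on the lines already being ordered by date, as they are in the sample. The sketch below is one possible way to get a list of lists (as the question asks) even when the input is not ordered. The names date_key and group_by_date are made up for this illustration, and the date format "%d.%m.%Y" is an assumption read off the sample data.

```python
from datetime import datetime
from itertools import groupby

def date_key(line):
    # The first whitespace-separated field is the date, e.g. "17.02.2018".
    return line.split()[0]

def group_by_date(lines):
    # groupby only merges *adjacent* items with equal keys, so sort by the
    # parsed date first. sorted() is stable, so the order within a day is kept.
    ordered = sorted(lines, key=lambda s: datetime.strptime(date_key(s), "%d.%m.%Y"))
    return [list(group) for _, group in groupby(ordered, key=date_key)]

# Deliberately shuffled sample to show why the sort matters.
sample = [
    "19.02.2018 18:30:24 00000000 23,6 23,7 -3,0 1,1",
    "17.02.2018 14:30:24 00000000 23,7 23,9 -2,0 1,1",
    "19.02.2018 19:00:21 00000000 23,6 23,6 -7,0 1,1",
    "17.02.2018 15:00:21 00000000 23,7 23,8 -4,0 1,1",
]
groups = group_by_date(sample)
for group in groups:
    print(group)
```

If the data is guaranteed to be in chronological order already, the sort can be dropped and groupby applied to the list directly.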
Here is a solution that requires no imported modules.
l = listofstrings  # an alias for conciseness
d = {st[:10]: [] for st in l}
for st in l:
    d[st[:10]] += [st]
Explanation: first create an empty list in dictionary d for each key, where the key is the first 10 characters of each of your input strings, i.e. the date. This exploits the fact that dict keys cannot be duplicated, so in effect you get a collection of the unique dates in your input.
Then, for each input string, add the "payload" to the list under its key. Again, the key determines which list the string is appended to.
After we are done, d is your desired data structure.
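The two passes described above can also be folded into one. The following is only a sketch of that idea using dict.setdefault, which returns the existing list for a key or stores and returns a new empty one. The function name split_by_date is hypothetical, and slicing the first 10 characters assumes every line starts with a date in the DD.MM.YYYY form seen in the sample.

```python
def split_by_date(lines):
    groups = {}
    for line in lines:
        # line[:10] is the date prefix, e.g. "17.02.2018" (assumed fixed width).
        # setdefault creates the empty list the first time a date is seen.
        groups.setdefault(line[:10], []).append(line)
    return groups

sample = [
    "17.02.2018 14:30:24 00000000 23,7 23,9 -2,0 1,1",
    "17.02.2018 15:00:21 00000000 23,7 23,8 -4,0 1,1",
    "19.02.2018 18:30:24 00000000 23,6 23,7 -3,0 1,1",
]
by_date = split_by_date(sample)
for date, lines in by_date.items():
    print(date, len(lines))
```

Unlike groupby, this does not depend on the lines being ordered by date, because each line is routed to its list by key.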
This is very similar to ilia's solution above, but it does not use a comprehension, and at the end the output is a list of lists instead of a dictionary.
listofstrings = [
"17.02.2018 14:30:24 00000000 23,7 23,9 -2,0 1,1",
"17.02.2018 15:00:21 00000000 23,7 23,8 -4,0 1,1",
"19.02.2018 18:30:24 00000000 23,6 23,7 -3,0 1,1",
"19.02.2018 19:00:21 00000000 23,6 23,6 -7,0 1,1",
"19.02.2018 19:30:22 00000000 23,5 23,5 -5,0 1,1",
"20.02.2018 05:30:21 00000000 23,5 23,8 -3,0 1,1",
"20.02.2018 06:00:21 00000000 23,5 23,8 1,0 1,1",
"20.02.2018 16:00:22 00000000 23,6 23,8 -4,0 1,1",
"21.02.2018 05:00:22 00000000 23,6 23,7 0,0 1,1",
"21.02.2018 05:30:23 00000000 23,6 23,8 -6,0 1,1",
"22.02.2018 07:30:23 00000000 23,6 23,8 -6,0 1,1",
"22.02.2018 08:00:21 00000000 23,6 23,9 -3,0 1,1",
"22.02.2018 13:30:25 00000000 23,6 23,8 -3,0 1,1"]
_list = {}
for d in listofstrings:
    if d[:10] not in _list:
        _list[d[:10]] = [d]
    else:
        _list[d[:10]].append(d)
_list_of_lists = []
for k, v in _list.items():
    _list_of_lists.append(v)
print(*_list_of_lists, sep="\n")
Output:
['17.02.2018 14:30:24 00000000 23,7 23,9 -2,0 1,1', '17.02.2018 15:00:21 00000000 23,7 23,8 -4,0 1,1']
['19.02.2018 18:30:24 00000000 23,6 23,7 -3,0 1,1', '19.02.2018 19:00:21 00000000 23,6 23,6 -7,0 1,1', '19.02.2018 19:30:22 00000000 23,5 23,5 -5,0 1,1']
['20.02.2018 05:30:21 00000000 23,5 23,8 -3,0 1,1', '20.02.2018 06:00:21 00000000 23,5 23,8 1,0 1,1', '20.02.2018 16:00:22 00000000 23,6 23,8 -4,0 1,1']
['21.02.2018 05:00:22 00000000 23,6 23,7 0,0 1,1', '21.02.2018 05:30:23 00000000 23,6 23,8 -6,0 1,1']
['22.02.2018 07:30:23 00000000 23,6 23,8 -6,0 1,1', '22.02.2018 08:00:21 00000000 23,6 23,9 -3,0 1,1', '22.02.2018 13:30:25 00000000 23,6 23,8 -3,0 1,1']
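As for why the loop in the question does not work: it appends the same locallist object to listoflists and then calls locallist.clear(), so every entry of listoflists refers to one list that ends up empty. A possible repair that stays close to the original idea of comparing each date with the previous one is sketched below. The function name split_on_date_change is made up for this example, and it assumes the lines are already ordered by date and that no line is empty.

```python
def split_on_date_change(lines):
    list_of_lists = []
    previous_date = None
    for line in lines:
        current_date = line.split()[0]
        if current_date != previous_date:
            # Start a brand-new list object for each new date instead of
            # reusing (and clearing) a single shared list.
            list_of_lists.append([])
            previous_date = current_date
        list_of_lists[-1].append(line)
    return list_of_lists

sample = [
    "17.02.2018 14:30:24 00000000 23,7 23,9 -2,0 1,1",
    "17.02.2018 15:00:21 00000000 23,7 23,8 -4,0 1,1",
    "19.02.2018 18:30:24 00000000 23,6 23,7 -3,0 1,1",
    "20.02.2018 05:30:21 00000000 23,5 23,8 -3,0 1,1",
    "20.02.2018 06:00:21 00000000 23,5 23,8 1,0 1,1",
]
result = split_on_date_change(sample)
print(*result, sep="\n")
```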