
How to group data by datetime in pandas?

I have data that looks like this:

data = [(u'Del', datetime.datetime(2019, 11, 1, 0, 0), 59L), (u'Bom', datetime.datetime(2019, 11, 1, 0, 0), 449L), (u'Del', datetime.datetime(2019, 12, 1, 0, 0), 0L), (u'Bom', datetime.datetime(2019, 12, 1, 0, 0), 45L)]

Now I want to group the data by time so that it looks something like this:

data = [
         [(u'Del', datetime.datetime(2019, 11, 1, 0, 0), 59L), (u'Bom', datetime.datetime(2019, 11, 1, 0, 0), 449L)] 
        ,[(u'Del', datetime.datetime(2019, 12, 1, 0, 0), 0L), (u'Bom', datetime.datetime(2019, 12, 1, 0, 0), 45L)]
       ]

As you can see, it is now a list of two sublists, where each sublist contains the tuples that share the same datetime. For example, the first sublist looks like this:

[(u'Del', datetime.datetime(2019, 11, 1, 0, 0), 59L), (u'Bom', datetime.datetime(2019, 11, 1, 0, 0), 449L)]

Here the items of the first sublist share the same datetime, which is datetime.datetime(2019, 11, 1, 0, 0).

The second sublist looks like this:

[(u'Del', datetime.datetime(2019, 12, 1, 0, 0), 0L), (u'Bom', datetime.datetime(2019, 12, 1, 0, 0), 45L)]

Here the items of the second sublist share the same datetime, which is datetime.datetime(2019, 12, 1, 0, 0).

I can sort the data by datetime with something like this (though in this case the data is already sorted by datetime):

import datetime
import pandas as pd

df = pd.DataFrame(data)
df['Date'] = pd.to_datetime(df[1])  # column 1 holds the datetimes
df = df.sort_values(by='Date')

But I can't group the rows based on the sorted time. How do I achieve this using pandas?

You can do the following:

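df = pd.DataFrame(data)
df.columns = ['place', 'date', 'value']

# groupby yields (key, sub-DataFrame) pairs; the column name must be a string.
# .values.tolist() turns each group's rows into plain nested lists.
output = [x[1].values.tolist() for x in df.groupby('date')]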

The output looks like:

[[[u'Del', Timestamp('2019-11-01 00:00:00'), 59], [u'Bom', Timestamp('2019-11-01 00:00:00'), 449]], [[u'Del', Timestamp('2019-12-01 00:00:00'), 0], [u'Bom', Timestamp('2019-12-01 00:00:00'), 45]]]
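
If you need the sublists to hold tuples (as in the desired output in the question) rather than lists, here is a minimal self-contained sketch of the same groupby idea. It assumes Python 3, so the sample data is rewritten without the u'' prefix and the L suffix, and the column names place, date and value are only illustrative.

```python
import datetime

import pandas as pd

# Sample rows from the question, written with Python 3 literals.
data = [
    ("Del", datetime.datetime(2019, 11, 1), 59),
    ("Bom", datetime.datetime(2019, 11, 1), 449),
    ("Del", datetime.datetime(2019, 12, 1), 0),
    ("Bom", datetime.datetime(2019, 12, 1), 45),
]

# Column names are illustrative; any labels would work.
df = pd.DataFrame(data, columns=["place", "date", "value"])

# groupby sorts by key by default, so the groups come out in date order,
# and rows keep their original order within each group.
# itertuples(index=False, name=None) yields one plain tuple per row.
grouped = [
    list(group.itertuples(index=False, name=None))
    for _, group in df.groupby("date")
]

for sublist in grouped:
    print(sublist)
```

Note that the datetimes come back as pandas Timestamp objects, which compare equal to the corresponding datetime.datetime values.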


 