
Removing the words DatetimeIndex from a list of dates

I have multiple lists of dates in a pandas dataframe in this format:

col1                       col2
1                          [DatetimeIndex(['2018-10-01', '2018-10-02',
                           '2018-10-03', '2018-10-04'],
                           dtype='datetime64[ns]', freq='D')]

I would like to remove the words DatetimeIndex and dtype='datetime64[ns]', freq='D' and turn the list into a set. The format I am looking for is: {'2018-10-01', '2018-10-02', '2018-10-03', '2018-10-04'}

Pandas is not designed to hold collections within series values, so what you are looking to do is strongly discouraged. A much better idea, especially if you have a consistent number of values in each DatetimeIndex series value, is to join extra columns:

import pandas as pd

D = pd.DatetimeIndex(['2018-10-01', '2018-10-02', '2018-10-03', '2018-10-04'],
                     dtype='datetime64[ns]', freq='D')

df = pd.DataFrame({'col1': [1], 'col2': [D]})

df = df.join(pd.DataFrame(df.pop('col2').values.tolist()))

print(df)

   col1          0          1          2          3
0     1 2018-10-01 2018-10-02 2018-10-03 2018-10-04

If you really want a set as each series value, you can do so via map + set (starting again from the original df, since pop removed col2 in the snippet above):

df['col2'] = list(map(set, df['col2'].values))

print(df)

   col1                                               col2
0     1  {2018-10-01 00:00:00, 2018-10-02 00:00:00, 201...
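The output above holds Timestamp objects, while the question asks for date strings. A minimal sketch of one way to get a set of strings per row, assuming every cell of col2 is a DatetimeIndex as in the example data; the name result is only for illustration:

```python
import pandas as pd

D = pd.DatetimeIndex(['2018-10-01', '2018-10-02', '2018-10-03', '2018-10-04'])
df = pd.DataFrame({'col1': [1], 'col2': [D]})

# Format each DatetimeIndex as 'YYYY-MM-DD' strings, then collect them in a set.
# Assumption: every cell in col2 is a DatetimeIndex.
df['col2'] = [set(idx.strftime('%Y-%m-%d')) for idx in df['col2']]

result = df.loc[0, 'col2']
print(result)
```

Sets are unordered, so the printed order of the dates may differ from the original order.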

Have you tried:

set(index_object.tolist())

I suspect this will return a set of Timestamp objects rather than strings, so whether it is what you want depends on your use case.

If it's the strings you want, you can modify the code as follows:

set(index_object.strftime("%Y-%m-%d").tolist())
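To make the difference between the two variants concrete, here is a small self-contained sketch. It assumes index_object is a DatetimeIndex (built here with pd.date_range purely for illustration); the names timestamps and strings are hypothetical:

```python
import pandas as pd

# Illustrative stand-in for index_object: four consecutive days.
index_object = pd.date_range('2018-10-01', periods=4, freq='D')

# Converting directly gives a set of pandas Timestamp objects.
timestamps = set(index_object.tolist())

# Formatting first gives a set of plain date strings.
strings = set(index_object.strftime('%Y-%m-%d').tolist())

print(type(next(iter(timestamps))))
print(sorted(strings))
```

Note that strftime is called on the index itself here; the .dt accessor is only available on a Series.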

For your specific format (which I don't necessarily approve of!) you can try this:

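import itertools
# Each cell of col2 holds a DatetimeIndex, which has strftime directly (no .dt accessor)
string_lists = df['col2'].apply(lambda x: x.strftime("%Y-%m-%d").tolist())
# Flatten the per-row lists into a single set of unique date strings
unique_set = set(itertools.chain.from_iterable(string_lists.tolist()))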

