Date range between two columns

Question

I'm fairly new to Python and data science.

I have a dataset with two datetime columns, A and B:

                     A                    B
0  2019-03-13 08:12:20  2019-03-13 08:12:25
1  2019-03-15 10:02:18  2019-03-13 10:02:20

For each row, I want to generate the date range between column A and column B at one-second frequency, so as a result I should get this:

                    A
0 2019-03-13 08:12:20
1 2019-03-13 08:12:21
2 2019-03-13 08:12:22
3 2019-03-13 08:12:23
4 2019-03-13 08:12:24
5 2019-03-13 08:12:25

I made it work with this:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ["2019-03-13 08:12:20", "2019-03-15 10:02:18"], 'B': ["2019-03-13 08:12:25", "2019-03-13 10:02:20"]})
l = [pd.date_range(start=df.iloc[i]['A'], end=df.iloc[i]['B'], freq='S') for i in range(len(df))]
df1 = (pd.DataFrame(l).T)[0]
print(df1)

But since I have about 1M rows, it takes too long to run, and I know this solution isn't really the best. Can you please show me the best way to do this?

Answer 1

A loop is necessary here. One possible solution uses a list comprehension with flattening:

l = [x for a, b in zip(df.A, df.B) for x in pd.date_range(a, b, freq='S')]
df1 = pd.DataFrame({'A': l})
print(df1)
                    A
0 2019-03-13 08:12:20
1 2019-03-13 08:12:21
2 2019-03-13 08:12:22
3 2019-03-13 08:12:23
4 2019-03-13 08:12:24
5 2019-03-13 08:12:25

Another solution:

# ignore_index=True gives one continuous 0..n-1 index across all rows
df1 = (pd.concat([pd.Series(pd.date_range(r.A, r.B, freq='S')) for r in df.itertuples()],
                 ignore_index=True)
         .to_frame('A'))
print(df1)
                    A
0 2019-03-13 08:12:20
1 2019-03-13 08:12:21
2 2019-03-13 08:12:22
3 2019-03-13 08:12:23
4 2019-03-13 08:12:24
5 2019-03-13 08:12:25
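
Both solutions above still call pd.date_range once per row. With around 1M rows, it may be worth trying to build all the timestamps in one pass with NumPy instead. What follows is only a sketch of that different technique, not something from the answer above, and it has not been timed on real data. The helper name expand_seconds is hypothetical (it is not a pandas function), and it assumes both columns are timezone-naive and contain no missing values. Rows where A is later than B, like row 1 in the sample data, contribute no timestamps, which matches the empty result pd.date_range gives for them.

```python
import numpy as np
import pandas as pd


def expand_seconds(df, start_col="A", end_col="B"):
    """Hypothetical helper: one timestamp per second from start to end, inclusive.

    Assumes timezone-naive columns without missing values.
    """
    start = pd.to_datetime(df[start_col]).to_numpy()
    end = pd.to_datetime(df[end_col]).to_numpy()

    # Timestamps per row: whole seconds between start and end, plus one.
    counts = (end - start) // np.timedelta64(1, "s") + 1
    counts = np.clip(counts, 0, None)  # start later than end -> empty range

    # Repeat each start once per timestamp it produces.
    repeated_start = np.repeat(start, counts)

    # Offsets 0, 1, 2, ... restarting at the beginning of each row's block.
    block_begin = np.cumsum(counts) - counts
    within_row = np.arange(counts.sum()) - np.repeat(block_begin, counts)

    values = repeated_start + within_row * np.timedelta64(1, "s")
    return pd.DataFrame({start_col: values})


df = pd.DataFrame({
    "A": ["2019-03-13 08:12:20", "2019-03-15 10:02:18"],
    "B": ["2019-03-13 08:12:25", "2019-03-13 10:02:20"],
})
df1 = expand_seconds(df)
print(df1)
```

On the sample frame this should give the same six rows as the outputs above. The whole result is held in memory at once, so for long ranges the number of generated rows is the thing to watch.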

Answer 1: 0, accepted 2019-10-24 13:11:34