Modify Pandas dataframe to list year month and date

I want to modify the dataframe created below:

import datetime

import numpy as np
import pandas as pd
from dateutil.rrule import rrule, DAILY

START_YR = 2010
END_YR = 2013

strt_date = datetime.date(START_YR, 1, 1)
end_date = datetime.date(END_YR, 12, 31)

# One occurrence per day between the two dates, inclusive
dt = rrule(DAILY, dtstart=strt_date, until=end_date)

# One random value per day, indexed by date
serie_1 = pd.Series(np.random.randn(dt.count()),
                    index=pd.date_range(strt_date, end_date))

How can I create a dataframe with year, month, and date as separate columns?

Convert the Series to a DataFrame, then add the new columns as Pandas Periods. If you only want the month as an integer, see the "month_int" example.

df = pd.DataFrame(serie_1)
df['month'] = [ts.to_period('M') for ts in df.index]
df['year'] = [ts.to_period('Y') for ts in df.index]
df['month_int'] = [ts.month for ts in df.index]

>>> df
Out[16]: 
                   0   month   year  month_int

2010-01-01  0.332370  2010-01  2010          1
2010-01-02 -0.036814  2010-01  2010          1
2010-01-03  1.751511  2010-01  2010          1
...              ...      ...   ...        ...
2013-12-29  0.345707  2013-12  2013         12
2013-12-30 -0.395924  2013-12  2013         12
2013-12-31 -0.614565  2013-12  2013         12
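The same Period columns can also be built without a Python-level loop, because `DatetimeIndex.to_period` converts the whole index in one call. This is only a minimal sketch: the frame below is a stand-in with the same date range as `serie_1`, not the original data, and the column names simply mirror the ones used above.

```python
import numpy as np
import pandas as pd

# Stand-in for the question's data (assumption): one random value per day, 2010 through 2013.
idx = pd.date_range("2010-01-01", "2013-12-31")
df = pd.DataFrame(np.random.randn(len(idx)), index=idx)

# to_period works on the whole DatetimeIndex at once, so no list comprehension is needed.
df["month"] = df.index.to_period("M")
df["year"] = df.index.to_period("Y")
df["month_int"] = df.index.month
```

The resulting `month` and `year` columns hold Period values, as in the list comprehension version.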

Simply accessing the datetime attributes will be significantly faster:

df['date'] = df.index.date
df['year'] = df.index.year
df['month'] = df.index.month
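Since the question asks for year, month, and day as separate columns, the same attribute approach could be extended with `index.day`. The sketch below is an illustration under assumptions: the `value`, `day`, and `timestamp` names are hypothetical, and the data is a stand-in for `serie_1`. It also shows the `.dt` accessor, which would be the equivalent if the dates lived in a regular column instead of the index.

```python
import numpy as np
import pandas as pd

# Stand-in for the question's data (assumption): daily values for 2010 through 2013.
idx = pd.date_range("2010-01-01", "2013-12-31")
df = pd.DataFrame({"value": np.random.randn(len(idx))}, index=idx)

# Integer components taken directly from the DatetimeIndex.
df["year"] = df.index.year
df["month"] = df.index.month
df["day"] = df.index.day

# If the dates are in a column rather than the index, the .dt accessor
# exposes the same attributes. "timestamp" is a hypothetical column name.
flat = df.rename_axis("timestamp").reset_index()
flat["year_from_column"] = flat["timestamp"].dt.year
```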

Comparing the timings against the list comprehension approach:

In [25]:

%%timeit
df['month'] = [ts.to_period('M') for ts in df.index]
df['year'] = [ts.to_period('Y') for ts in df.index]
df['month_int'] = [ts.month for ts in df.index]

1 loops, best of 3: 664 ms per loop

In [26]:

%%timeit
df['date'] = df.index.date
df['year'] = df.index.year
df['month'] = df.index.month

100 loops, best of 3: 5.96 ms per loop

So using the datetime attributes is more than 100 times faster.
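Outside IPython, a comparison like this could be reproduced with the standard `timeit` module. The function names below are illustrative only, the data is a stand-in for `serie_1`, and the numbers will vary with the machine and the pandas version, so no expected timings are given.

```python
import timeit

import numpy as np
import pandas as pd

# Stand-in for the question's data (assumption): daily values for 2010 through 2013.
idx = pd.date_range("2010-01-01", "2013-12-31")
df = pd.DataFrame(np.random.randn(len(idx)), index=idx)


def with_comprehension(frame):
    # Element-by-element conversion in a Python loop.
    frame["month"] = [ts.to_period("M") for ts in frame.index]
    frame["year"] = [ts.to_period("Y") for ts in frame.index]
    frame["month_int"] = [ts.month for ts in frame.index]


def with_attributes(frame):
    # Attribute access on the whole DatetimeIndex.
    frame["date"] = frame.index.date
    frame["year"] = frame.index.year
    frame["month"] = frame.index.month


# Each run works on a fresh copy, so the copy cost is included in both timings.
slow = min(timeit.repeat(lambda: with_comprehension(df.copy()), number=1, repeat=3))
fast = min(timeit.repeat(lambda: with_attributes(df.copy()), number=1, repeat=3))

print(f"list comprehension:  {slow * 1e3:.2f} ms")
print(f"datetime attributes: {fast * 1e3:.2f} ms")
```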
