How do I “enrich” every record in a Pandas dataframe with an hour column?
I have a dataframe in Pandas:
1 2
a .2
a .3
b .5
I would like to add, for each of those records, a column with the hour (from 0 to 23), so that every record is repeated once per hour and the result looks like this:
1 2 3
a .2 0
a .2 1
a .2 2
...
a .2 23
a .3 0
a .3 1
...
a .3 23
b .5 0
...
b .5 23
Create the hours array:
import numpy as np
hours = np.tile(np.arange(24), len(df))
Repeat each record of df 24 times:
df = df.loc[df.index.repeat(24)].reset_index(drop=True)
Assign the hours array as a new column of the data frame:
df[3] = hours
df.head()
# 1 2 3
#0 a 0.2 0
#1 a 0.2 1
#2 a 0.2 2
#3 a 0.2 3
#4 a 0.2 4
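The steps above work because the two calls repeat in different ways: np.tile repeats the whole 0..23 sequence once per record, while index.repeat repeats each row label in place, so the two line up row by row. A minimal sketch of that alignment, using 3 "hours" instead of 24 to keep it short (the names demo, n and expanded are made up for illustration):

```python
import numpy as np
import pandas as pd

# Small stand-in for the question's frame, with column labels 1 and 2
demo = pd.DataFrame({1: ["a", "b"], 2: [0.2, 0.5]})

n = 3  # 3 instead of 24, only to keep the result short

# np.tile repeats the whole range once per row: 0 1 2 0 1 2
hours = np.tile(np.arange(n), len(demo))

# Index.repeat repeats each row label in place: 0 0 0 1 1 1
expanded = demo.loc[demo.index.repeat(n)].reset_index(drop=True)

# Both have length len(demo) * n, so the assignment lines up row by row
expanded[3] = hours
print(expanded)
```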
Put together:
def expand_hours(df):
    import numpy as np
    hours = np.tile(np.arange(24), len(df))
    df = df.loc[df.index.repeat(24)].reset_index(drop=True)
    df[3] = hours
    return df
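As a rough usage check, the function can be run on the three records from the question. The sketch below redefines expand_hours so that it is self-contained, and also shows a cross join as one possible alternative; that alternative is not part of the answer above and assumes pandas 1.2 or later, where merge accepts how="cross". The names out, hours_df and alt are made up for illustration.

```python
import numpy as np
import pandas as pd

def expand_hours(df):
    # Same steps as the function above
    hours = np.tile(np.arange(24), len(df))
    df = df.loc[df.index.repeat(24)].reset_index(drop=True)
    df[3] = hours
    return df

# The three records from the question, with column labels 1 and 2
df = pd.DataFrame({1: ["a", "a", "b"], 2: [0.2, 0.3, 0.5]})

out = expand_hours(df)
print(out.shape)  # 3 records * 24 hours = 72 rows, 3 columns

# Alternative (assumption: pandas >= 1.2): cross join with a frame of hours
hours_df = pd.DataFrame({3: np.arange(24)})
alt = df.merge(hours_df, how="cross")
print(alt.shape)
```

The cross join pairs every record with every hour in one call, which may read more directly if the hour values already live in their own frame.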
If your DataFrame is called df, try this:
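import pandas as pd
df['hour'] = pd.Series(np.random.randint(0, 24, size=len(df)), index=df.index)
This adds a column named 'hour' filled with random integers between 0 and 23 (inclusive), one per row. Without the size argument, np.random.randint returns a single number and every row gets the same value. Note that this keeps the number of rows unchanged; it does not repeat each record 24 times.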
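A minimal sketch of the random-hour idea, assuming what is wanted is one independently drawn hour per row rather than all 24 hours per record. It swaps np.random.randint for NumPy's Generator API (np.random.default_rng) so the draw can be seeded; the seed value 0 is arbitrary.

```python
import numpy as np
import pandas as pd

# The three records from the question, with column labels 1 and 2
df = pd.DataFrame({1: ["a", "a", "b"], 2: [0.2, 0.3, 0.5]})

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

# integers(0, 24) draws from 0..23; the upper bound is exclusive.
# size=len(df) gives one draw per row instead of a single scalar.
df["hour"] = pd.Series(rng.integers(0, 24, size=len(df)), index=df.index)
print(df)
```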