
Python Pandas date time index create dataframe

I have some time series data, and I am trying to create separate indicator columns in Pandas that are 0 or 1: one based on whether the index falls on a particular day of the week, and another based on whether it falls within a particular time range.

For example I can make up some data with:

import pandas as pd
import numpy as np
from numpy.random import randint

#time = pd.date_range('6/28/2013', periods=2000, freq='5min')
#df = pd.Series(np.random.randint(100, size=2000), index=time)

rng = pd.date_range('10/9/2018 00:00', periods=5, freq='6H')
df = pd.DataFrame({'Random_Number':randint(1, 10, 5)}, index=rng)
df.head()

If I am doing this correctly, I can create a column named Tuesday that is 1 if the day is Tuesday and 0 otherwise:

#The day of the week with Monday=0, Sunday=6
df['Tuesday'] = np.where(df.index.dayofweek == 1, 1, 0)

df.head()
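For comparison, the same flag can be built by casting the boolean comparison directly instead of going through np.where. This is only a sketch: the Random_Number values are fixed (not random) so the result is reproducible, and the 6-hour step is passed as a pd.Timedelta rather than a frequency string.

```python
import pandas as pd

# Same five timestamps as in the question, 6 hours apart.
rng = pd.date_range("2018-10-09 00:00", periods=5, freq=pd.Timedelta(hours=6))

# Fixed illustrative values (an assumption; the question uses randint).
df = pd.DataFrame({"Random_Number": [8, 8, 8, 3, 2]}, index=rng)

# dayofweek uses Monday=0 ... Sunday=6, so Tuesday is 1.
# The comparison yields a boolean array; astype(int) turns it into 0/1.
df["Tuesday"] = (df.index.dayofweek == 1).astype(int)

print(df)
```

2018-10-09 is a Tuesday, so the four rows on that date get 1 and the row on 2018-10-10 gets 0.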

What I am struggling with (in Excel I could do this with nested IF statements) is creating a column called Occupied that is 1 if the time is between 7 AM and 5 PM, else 0. Any tips help, thanks in advance!

df['Occupied'] = np.where(df.index.hour > 7 & df.index.hour < 17, 1, 0)

df.head()

This code fails with a TypeError that I am not sure how to fix:

TypeError: unsupported operand type(s) for &: 'int' and 'Int64Index'

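You are missing parentheses. The & operator binds more tightly than > and <, so each comparison has to be wrapped in () before the two are combined: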

np.where((df.index.hour > 7) & (df.index.hour < 17), 1, 0)
Out[157]: array([0, 0, 1, 0, 0])
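A self-contained version of the fix might look like the sketch below. The Random_Number values are fixed here purely so the example is reproducible, and the intermediate name hours is introduced only for readability; neither comes from the question.

```python
import numpy as np
import pandas as pd

# Same five timestamps as in the question, 6 hours apart.
rng = pd.date_range("2018-10-09 00:00", periods=5, freq=pd.Timedelta(hours=6))

# Fixed illustrative values (an assumption; the question uses randint).
df = pd.DataFrame({"Random_Number": [8, 8, 8, 3, 2]}, index=rng)

hours = df.index.hour  # 0, 6, 12, 18, 0

# Without parentheses, `hours > 7 & hours < 17` is parsed as
# `hours > (7 & hours) < 17`, because & has higher precedence than
# the comparison operators. Wrapping each comparison avoids that.
df["Occupied"] = np.where((hours > 7) & (hours < 17), 1, 0)

print(df)
```

Only the 12:00 row falls strictly between hour 7 and hour 17, which matches the array shown above.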

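You can use pd.DataFrame.eval with a chained comparison (note that the lower bound here is inclusive, 7 <= hour, so the 7 AM hour counts as occupied):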

df['Occupied'] = df.eval('7 <= index.dt.hour < 17').astype(int)

print(df)

                     Random_Number  Occupied
2018-10-09 00:00:00              8         0
2018-10-09 06:00:00              8         0
2018-10-09 12:00:00              8         1
2018-10-09 18:00:00              3         0
2018-10-10 00:00:00              2         0
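The two answers treat the 7 AM hour differently: one uses hour > 7, the other 7 <= hour. With the 6-hourly sample data this makes no difference, but it does on finer-grained data. The sketch below uses a hypothetical hourly index (not from the question) and plain boolean masks rather than eval to show where the two definitions diverge.

```python
import pandas as pd

# Hypothetical hourly index covering one full day (hours 0 through 23).
idx = pd.date_range("2018-10-09 00:00", periods=24, freq=pd.Timedelta(hours=1))
demo = pd.DataFrame(index=idx)

hours = demo.index.hour

# Inclusive lower bound: 07:00 through 16:59 count as occupied.
demo["Occupied_inclusive"] = ((hours >= 7) & (hours < 17)).astype(int)

# Strict lower bound: 08:00 through 16:59 count as occupied.
demo["Occupied_strict"] = ((hours > 7) & (hours < 17)).astype(int)

# Rows where the two definitions disagree.
differs = demo[demo["Occupied_inclusive"] != demo["Occupied_strict"]]
print(differs)
```

The only disagreement is the 07:00 row, so which comparison to use depends on whether 7 AM itself should count as occupied.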
