I'm creating a usage heatmap for some user analytics. The Y-axis will be the day of the week and the X-axis will be the hour of the day (24-hour clock). I pulled the data from the API. (Note that this actually produces 6,000 rows of data.)
IN:
import requests
import json
import pandas as pd
response = requests.get("api.url")
data = response.json()
df = pd.DataFrame(data['Sessions'])
df.dtypes
print(df['StartTime'])
OUT:
0 2019-01-29T22:08:40
1 2019-01-29T22:08:02
2 2019-01-29T22:05:10
3 2019-01-29T21:34:30
4 2019-01-29T21:32:49
Name: StartTime, Length: 100, dtype: object
I would normally convert the object column to a pandas datetime and then split it into two columns:
IN:
df['StartTime'] = pd.to_datetime(df['StartTime'], format='%d%b%Y:%H:%M:%S.%f')
df['Date'] = [d.date() for d in df['StartTime']]
df['Time'] = [d.time() for d in df['StartTime']]
OUT:
  StartTime Date Time
0 2019-01-29T22:08:40 2019-01-29 22:08:40
1 2019-01-29T22:08:02 2019-01-29 22:08:02
2 2019-01-29T22:05:10 2019-01-29 22:05:10
3 2019-01-29T21:34:30 2019-01-29 21:34:30
4 2019-01-29T21:32:49 2019-01-29 21:32:49
This isn't working because of that funky "T" in the middle of my timestamp and possibly because of the datatype.
I need to remove the T so I can convert this to a standard datetime format, then I need to separate Date and Time into their own columns. BONUS: I'd like to bring only the hour into its own column. Instead of 22:08:02, it would just be 22.
You need to use pandas' Timestamp:
>>> pd.Timestamp('2017-01-01T12')
Timestamp('2017-01-01 12:00:00')
So:
df['StartTime'] = df["StartTime"].apply(lambda x: pd.Timestamp(x))
# Now StartTime has the correct data type, so you can access
# the date and time methods as well as the hour
df['Date'] = df["StartTime"].apply(lambda x: x.date())
df['Time'] = df["StartTime"].apply(lambda x: x.time())
df['Hour'] = df["StartTime"].apply(lambda x: x.hour)
As mentioned by @coldspeed, calling pd.to_datetime() or pd.Timestamp() would work just fine; just omit the format argument.
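A vectorized version of the same idea might look like the sketch below. It assumes the column holds strings like the ones in the question (the three sample rows are copied from the question's output), and the Weekday column is an addition here for the heatmap's Y-axis, not something the answer above computes.

```python
import pandas as pd

# Sample values copied from the question's output.
df = pd.DataFrame({"StartTime": [
    "2019-01-29T22:08:40",
    "2019-01-29T22:08:02",
    "2019-01-29T21:34:30",
]})

# No format argument: pandas parses these strings, "T" included.
df["StartTime"] = pd.to_datetime(df["StartTime"])

# The .dt accessor works on the whole column at once, no lambda needed.
df["Date"] = df["StartTime"].dt.date
df["Time"] = df["StartTime"].dt.time
df["Hour"] = df["StartTime"].dt.hour
df["Weekday"] = df["StartTime"].dt.day_name()  # assumed extra column for the heatmap

print(df)
```

On 6,000 rows either approach should be fast enough; the .dt form is mainly shorter and avoids a Python-level function call per row.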
For parsing the timestamp dateutil is fantastic. It can figure out a date from nearly any string format.
To get just the hour from a datetime object you can use d.hour
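A minimal sketch of the dateutil route, assuming the python-dateutil package is installed (it ships as a pandas dependency) and using one timestamp from the question as input:

```python
from dateutil import parser

# parser.parse infers the layout of the string, including the "T" separator.
d = parser.parse("2019-01-29T21:34:30")

print(d.date())  # the date part
print(d.time())  # the time part
print(d.hour)    # just the hour, as an int
```

This parses one string at a time, so for a whole DataFrame column it would need to go through apply.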
You don't need to specify a format for the timestamp. Pandas recognizes ISO 8601 strings such as '2019-01-29T21:34:30' on its own.
IN:
import pandas as pd
dt = '2019-01-29T21:34:30'
pd.to_datetime(dt)
OUT:
Timestamp('2019-01-29 21:34:30')
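Once the column is parsed, the day-of-week by hour grid the question is after could be built roughly as follows. This is only a sketch: the names starts, days, and grid are illustrative, the five rows are the sample from the question, and the plotting step itself is left out.

```python
import pandas as pd

# Sample rows from the question, parsed without a format argument.
starts = pd.to_datetime(pd.Series([
    "2019-01-29T22:08:40",
    "2019-01-29T22:08:02",
    "2019-01-29T22:05:10",
    "2019-01-29T21:34:30",
    "2019-01-29T21:32:49",
]))

days = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Count sessions per (weekday, hour) pair.
counts = pd.crosstab(starts.dt.day_name(), starts.dt.hour)

# Fill in the weekdays and hours that have no sessions so the grid is always 7 x 24.
grid = counts.reindex(index=days, columns=range(24), fill_value=0)

print(grid.loc["Tuesday", [21, 22]])
```

The resulting grid has weekdays as rows and hours 0 to 23 as columns, which is the shape most heatmap plotting functions expect.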