
TypeError: strptime() argument 1 must be str, not Period

I have this DataFrame.

import pandas as pd
from datetime import datetime
df = pd.DataFrame({'id': [11,22,33,44,55], 
                   'name': ['A','B','C','D','E'], 
                   'timestamp': [1407617838,965150022,1158531592,1500701864,965149631]})
df
   id name timestamp
0  11    A      2014
1  22    B      2000
2  33    C      2006
3  44    D      2017
4  55    E      2000
df['timestamp'] = pd.to_datetime(df['timestamp'], unit='s')
df['timestamp'] = df['timestamp'].dt.to_period('Y')
y1 = df['timestamp'].iloc[0]
y2 = df['timestamp'].iloc[1]
d1 = datetime.strptime(y1, "%Y")
d2 = datetime.strptime(y2, "%Y")
diff = abs((d2 - d1).days)
print(diff)

I have converted the timestamps into real dates and extracted the years. I want to take the difference between the first two rows of timestamp. For example, abs(2014 - 2000) = 14.

strptime() fails because the timestamp column holds Period objects, not strings. If you take the year through the dt accessor of the datetime Series, you get integers (instead of "Period" objects):

df['timestamp'] = pd.to_datetime(df['timestamp'], unit='s')
df['timestamp'] = df['timestamp'].dt.year
y1 = df['timestamp'].iloc[0]
y2 = df['timestamp'].iloc[1]
# d1 = datetime.strptime(y1, "%Y") <- No need to recast to datetime!
# d2 = datetime.strptime(y2, "%Y")
diff = abs((y2 - y1))
print(diff)
>>> 14
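
If you would rather keep the Period column from the question, a minimal sketch of two workarounds follows. It assumes the same first two timestamps as above; the variable names (periods, p1, p2, year_diff, day_diff) are only illustrative. A Period exposes its year as an integer attribute, and strptime() accepts it once it is converted with str():

```python
import pandas as pd
from datetime import datetime

# First two timestamps from the question, converted to yearly Periods
ts = pd.Series([1407617838, 965150022])
periods = pd.to_datetime(ts, unit='s').dt.to_period('Y')
p1, p2 = periods.iloc[0], periods.iloc[1]

# Option 1: read the year straight off each Period (plain integers)
year_diff = abs(p2.year - p1.year)
print(year_diff)

# Option 2: strptime() needs a str, so convert the Period explicitly.
# Note that this compares 1 January of each year, not the original timestamps.
d1 = datetime.strptime(str(p1), "%Y")
d2 = datetime.strptime(str(p2), "%Y")
day_diff = abs((d2 - d1).days)
print(day_diff)
```

Option 2 is probably not what you want if you need the exact gap between the timestamps, because the month and day are lost when converting to a yearly Period.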

As you can see, I commented out the two lines where you were trying to recast the years into datetime objects. Was there a reason for this? From your question, I assumed you wanted the difference in number of years. If you wanted the number of days between the timestamps instead, then this should do (no need to cast and recast):

df['timestamp'] = pd.to_datetime(df['timestamp'], unit='s')
y1 = df['timestamp'].iloc[0]
y2 = df['timestamp'].iloc[1]
diff = abs((y2 - y1).days)
print(diff)
>>> 5122
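
One subtlety worth checking in the day count: here the second timestamp is earlier than the first, so the subtraction gives a negative Timedelta, and .days on a negative Timedelta is rounded down (away from zero). The sketch below, with illustrative variable names, compares taking abs() after .days, taking abs() before .days, and computing fractional days from total_seconds():

```python
import pandas as pd

# Same first two timestamps as in the question
ts = pd.to_datetime(pd.Series([1407617838, 965150022]), unit='s')
delta = ts.iloc[1] - ts.iloc[0]  # negative Timedelta (second row is earlier)

# abs() applied after .days: the negative value was floored first
floor_days = abs(delta.days)

# abs() applied to the Timedelta itself: whole days actually elapsed
whole_days = abs(delta).days

# Fractional days, if partial days matter
exact_days = abs(delta.total_seconds()) / 86400

print(floor_days, whole_days, exact_days)
```

Which of the three is right depends on whether you want completed days or a rounded-up count; taking abs() of the Timedelta before reading .days avoids the result depending on the order of the two rows.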

