I am reading an Excel file and want to truncate a datetime column to the 1st of each month. The truncation itself works fine, but pandas tries to convert the resulting strings to floats and throws an error when I add them as a column of an existing DataFrame.
How can I disable this and just get a column of type string or date?
I have tried various mapping / type-casting approaches with no effect (same error). If I convert to a proxy int, the type-casting problem disappears (since an int can be converted to float), but that is an ugly workaround rather than a solution to the real problem.
Code snippet illustrating the problem:
df = pd.read_excel(file_name, skiprows=[1], skip_footer=1)
print(df['Purch.Date'].dtype)
>>> datetime64[ns]
print(df['Purch.Date'].head())
>>> 0 2016-06-23
>>> 1 2016-06-09
>>> 2 2016-06-24
>>> 3 2016-06-24
>>> 4 2016-06-24
df['YearMonthCapture'] = df['Purch.Date'].map(lambda x: str(x.replace(day=1).date()) ).astype(str)
>>> ValueError: could not convert string to float: '2016-06-01'
# === Other approaches resulting in the same error ===
#df['YearMonthCapture'] = df['Purch.Date'].map(lambda x: x.replace(day=1))
#df['YearMonthCapture'] = pd.Series(df['Purch.Date'].map(lambda x: str(x.replace(day=1).date()) ), dtype='str')
#df['YearMonthCapture'] = pd.Series(df['Purch.Date'].apply(lambda x: str(x.replace(day=1).date()) ), dtype='str')
# === Ugly workaround that does not really address the problem ===
df['YearMonthCapture'] = pd.Series(df['Purch.Date'].apply(lambda x: 100*x.year + x.month))
You can do this by accessing the day attribute, subtracting a TimedeltaIndex from your datetime, and casting to str:
In [138]:
import datetime as dt
import pandas as pd
df = pd.DataFrame({'date':pd.date_range(dt.datetime(2016,1,1), periods=4)})
df
Out[138]:
date
0 2016-01-01
1 2016-01-02
2 2016-01-03
3 2016-01-04
In [142]:
(df['date'] - pd.TimedeltaIndex(df['date'].dt.day - 1, unit='D')).astype(str)
Out[142]:
0 2016-01-01
1 2016-01-01
2 2016-01-01
3 2016-01-01
Name: date, dtype: object
So in your case:
df['YearMonthCapture'] = (df['Purch.Date'] - pd.TimedeltaIndex(df['Purch.Date'].dt.day - 1, unit='D')).astype(str)
should work
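For comparison, here is a minimal sketch of a few other ways to get the same result. It is not taken from the question: the sample DataFrame is made up to stand in for the Excel data, and the column names MonthStart and MonthStartAlt are hypothetical. The pd.to_timedelta variant is the same subtraction as above, written without TimedeltaIndex, which may be preferable on newer pandas versions where passing unit to TimedeltaIndex is reported as deprecated.

```python
import pandas as pd

# Hypothetical sample data standing in for the 'Purch.Date' column read from Excel.
df = pd.DataFrame(
    {'Purch.Date': pd.to_datetime(['2016-06-23', '2016-06-09', '2016-07-24'])}
)

# Same idea as the answer above: subtract (day - 1) days from each date,
# using pd.to_timedelta instead of pd.TimedeltaIndex.
offset = pd.to_timedelta(df['Purch.Date'].dt.day - 1, unit='D')
df['MonthStart'] = df['Purch.Date'] - offset

# Alternative: round-trip through a monthly Period; the result stays a datetime column.
df['MonthStartAlt'] = df['Purch.Date'].dt.to_period('M').dt.to_timestamp()

# If a string column is wanted, format it directly with a fixed day of 01.
df['YearMonthCapture'] = df['Purch.Date'].dt.strftime('%Y-%m-01')

print(df)
```

Keeping the column as a datetime (the first two options) usually makes later grouping or resampling by month easier; the strftime version is only needed if the values really must be strings.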