pandas df.apply TypeError: data type not understood
I'm trying to apply an operation to every value in a datetime series. I've reduced this to a lambda print to illustrate the problem. This works on another, similar dataframe but not on this one. Python is version 3.5.1, pandas version 0.17.1.
Some more padding to satisfy the SO question verbosity requirement.
print(dfY.info())
print(dfY)
dfY.apply(lambda rr: print(rr['predicted_time']), 1)
Output:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 21 entries, 0 to 20
Data columns (total 1 columns):
predicted_time 21 non-null datetime64[ns, pytz.FixedOffset(60)]
dtypes: datetime64[ns, pytz.FixedOffset(60)](1)
memory usage: 336.0 bytes
None
predicted_time
0 2005-02-01 02:40:00+01:00
1 2005-02-01 02:40:00+01:00
2 2005-02-01 02:40:00+01:00
3 2005-02-01 02:40:00+01:00
4 2005-02-01 02:43:00+01:00
5 2005-02-01 02:43:00+01:00
6 2005-02-01 02:43:00+01:00
<snip>
19 2005-02-01 02:50:00+01:00
20 2005-02-01 02:50:00+01:00
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-43-8ae0cf570812> in <module>()
1 print(dfY.info())
2 print(dfY)
----> 3 dfY.apply(lambda rr: print(rr['predicted_time']), 1)
/.../Projects/Software/TimeTillComplete/venv/lib/python3.5/site-packages/pandas/core/frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
3970 if reduce is None:
3971 reduce = True
-> 3972 return self._apply_standard(f, axis, reduce=reduce)
3973 else:
3974 return self._apply_broadcast(f, axis)
/.../Projects/Software/TimeTillComplete/venv/lib/python3.5/site-packages/pandas/core/frame.py in _apply_standard(self, func, axis, ignore_failures, reduce)
4017 # Create a dummy Series from an empty array
4018 index = self._get_axis(axis)
-> 4019 empty_arr = np.empty(len(index), dtype=values.dtype)
4020 dummy = Series(empty_arr, index=self._get_axis(axis),
4021 dtype=values.dtype)
TypeError: data type not understood
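The last frame of the traceback passes the column's dtype straight to np.empty. The following is a minimal sketch of why that call can fail, assuming a tz-aware column carries a pandas-specific dtype that NumPy cannot interpret. The sample timestamp and the variable names are illustrative, and the exact error message varies between NumPy versions.

```python
import numpy as np
import pandas as pd

# Illustrative tz-aware series with a fixed +01:00 offset (not the original dfY).
s = pd.Series(pd.to_datetime(["2005-02-01 02:40:00+01:00"]))
tz_dtype = s.dtype  # a pandas DatetimeTZDtype, not a NumPy dtype

try:
    # Mirrors the failing call in the traceback: np.empty(len(index), dtype=values.dtype)
    np.empty(3, dtype=tz_dtype)
    numpy_accepts = True
except TypeError as exc:
    numpy_accepts = False
    print("NumPy rejected the dtype:", exc)

# A plain (naive) datetime64 dtype is something NumPy does understand.
naive = np.empty(3, dtype="datetime64[ns]")
print(naive.dtype)
```

This only isolates the NumPy side of the failure; whether DataFrame.apply hits it depends on the pandas version in use.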
I don't really know what's going on, but as a workaround you can get the expected output by calling apply() on the column:
dfY['predicted_time'].apply(lambda rr: print(rr))
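A self-contained sketch of the same column-wise workaround, for anyone who wants to try it without the original data. The dfY built here is a hypothetical three-row stand-in, and the lambda returns a value instead of printing so the result can be inspected.

```python
import pandas as pd

# Hypothetical stand-in for dfY: tz-aware timestamps at a fixed +01:00 offset.
dfY = pd.DataFrame({
    "predicted_time": pd.to_datetime([
        "2005-02-01 02:40:00+01:00",
        "2005-02-01 02:43:00+01:00",
        "2005-02-01 02:50:00+01:00",
    ])
})

# Series.apply hands each element (a tz-aware pandas.Timestamp) to the function,
# so no row-wise DataFrame.apply is needed.
minutes = dfY["predicted_time"].apply(lambda ts: ts.minute)
print(minutes.tolist())
```

Because the function receives the Timestamp itself rather than a row, there is no need to index by column name inside the lambda.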
EDIT: Looks like you hit a bug in pandas. The issue is triggered by using time zone aware timestamps in a dataframe. Using a series works, as seen above. Using naive timestamps also works:
df = pd.DataFrame(pd.Series(dfY['predicted_time'].values),
columns=['predicted_time'])
df.apply(lambda rr: print(rr['predicted_time']), 1)
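One thing to watch with the naive conversion: as far as I know, .values on a tz-aware series yields UTC-based values, so the wall-clock times shift by the offset. The sketch below, using made-up sample data, contrasts that with the .dt.tz_convert(None) and .dt.tz_localize(None) accessors, which make the choice between UTC and local wall time explicit.

```python
import pandas as pd

# Illustrative tz-aware series with a fixed +01:00 offset.
s = pd.Series(
    pd.to_datetime(["2005-02-01 02:40:00+01:00", "2005-02-01 02:50:00+01:00"]),
    name="predicted_time",
)

# .values drops the time zone and exposes the underlying UTC instants.
from_values = pd.Series(s.values)        # 01:40 and 01:50 (UTC)

# tz_convert(None) converts to UTC and then removes the time zone.
utc_naive = s.dt.tz_convert(None)        # 01:40 and 01:50 (UTC)

# tz_localize(None) removes the time zone but keeps the local wall time.
local_naive = s.dt.tz_localize(None)     # 02:40 and 02:50

# Row-wise apply on the naive frame, returning a value instead of printing.
df = local_naive.to_frame()
hours = df.apply(lambda rr: rr["predicted_time"].hour, axis=1)
print(hours.tolist())
```

Which variant is appropriate depends on whether later code expects UTC or the original local times.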