pandas df.apply TypeError: data type not understood

I'm trying to apply an operation to every value in a datetime series. I've reduced this to a lambda that prints each value to illustrate the problem. This works on another, similar dataframe but not on this one. Python is version 3.5.1, pandas is version 0.17.1.

dfY.apply(lambda rr: print(rr['predicted_time']), 1)

Output:

<class 'pandas.core.frame.DataFrame'>
Int64Index: 21 entries, 0 to 20
Data columns (total 1 columns):
predicted_time    21 non-null datetime64[ns, pytz.FixedOffset(60)]
dtypes: datetime64[ns, pytz.FixedOffset(60)](1)
memory usage: 336.0 bytes
0  2005-02-01 02:40:00+01:00
1  2005-02-01 02:40:00+01:00
2  2005-02-01 02:40:00+01:00
3  2005-02-01 02:40:00+01:00
4  2005-02-01 02:43:00+01:00
5  2005-02-01 02:43:00+01:00
6  2005-02-01 02:43:00+01:00
19 2005-02-01 02:50:00+01:00
20 2005-02-01 02:50:00+01:00

TypeError                                 Traceback (most recent call last)
<ipython-input-43-8ae0cf570812> in <module>()
      1 print(dfY.info())
      2 print(dfY)
----> 3 dfY.apply(lambda rr: print(rr['predicted_time']), 1)

/.../Projects/Software/TimeTillComplete/venv/lib/python3.5/site-packages/pandas/core/frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
   3970                     if reduce is None:
   3971                         reduce = True
-> 3972                     return self._apply_standard(f, axis, reduce=reduce)
   3973             else:
   3974                 return self._apply_broadcast(f, axis)

/.../Projects/Software/TimeTillComplete/venv/lib/python3.5/site-packages/pandas/core/frame.py in _apply_standard(self, func, axis, ignore_failures, reduce)
   4017             # Create a dummy Series from an empty array
   4018             index = self._get_axis(axis)
-> 4019             empty_arr = np.empty(len(index), dtype=values.dtype)
   4020             dummy = Series(empty_arr, index=self._get_axis(axis),
   4021                            dtype=values.dtype)

TypeError: data type not understood

I don't really know what's going on, but as a workaround you can get the expected output by calling apply() on the column:

dfY['predicted_time'].apply(lambda rr: print(rr))

EDIT: Looks like you hit a bug in pandas. The issue is triggered by using time zone aware timestamps in a dataframe. Using a series works, as seen above. Using naive timestamps also works:

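# .values yields naive datetime64[ns] values, so the new column has no time zone
df = pd.DataFrame(pd.Series(dfY['predicted_time'].values), columns=['predicted_time'])
df.apply(lambda rr: print(rr['predicted_time']), 1)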
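
For anyone who wants to try both workarounds without the original data, here is a minimal self-contained sketch. The three sample timestamps and the names `formatted`, `naive` and `rowwise` are made up for illustration, and `isoformat()` stands in for `print` so the results can be inspected. Newer pandas versions may not raise the error at all, so treat this as a demonstration of the workarounds rather than a reproduction of the bug.

```python
from datetime import timedelta, timezone

import pandas as pd

# Hypothetical stand-in for dfY: tz-aware timestamps at a fixed +01:00 offset.
tz = timezone(timedelta(hours=1))
dfY = pd.DataFrame({
    "predicted_time": pd.to_datetime(
        ["2005-02-01 02:40:00", "2005-02-01 02:43:00", "2005-02-01 02:50:00"]
    ).tz_localize(tz)
})

# Workaround 1: apply on the column (a Series) instead of row-wise on the frame.
formatted = dfY["predicted_time"].apply(lambda ts: ts.isoformat())

# Workaround 2: build a frame of naive timestamps first.
# Note: .values drops the time zone and the resulting values are in UTC.
naive = pd.DataFrame(
    pd.Series(dfY["predicted_time"].values), columns=["predicted_time"]
)
rowwise = naive.apply(lambda rr: rr["predicted_time"].isoformat(), axis=1)

print(formatted.tolist())
print(rowwise.tolist())
```

Keep in mind that the second workaround shifts the displayed times to UTC, so it is only a drop-in replacement if the operation you apply does not depend on the local offset.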

