Subtract each pair of consecutive rows in one column with pandas
I ran into the problem below. When I execute the following code on Google Colab, it works normally:
df['temps'] = df['temps'].view(int).div(1e9).diff().fillna(0).abs()
print(df)
But when I use Jupyter Notebook locally, the error below appears:
ValueError Traceback (most recent call last)
Input In [13], in <cell line: 1>()
----> 1 df3['rebounds'] = pd.Series(df3['temps'].view(int).div(1e9).diff().fillna(0))
File C:\Python310\lib\site-packages\pandas\core\series.py:818, in Series.view(self, dtype)
815 # self.array instead of self._values so we piggyback on PandasArray
816 # implementation
817 res_values = self.array.view(dtype)
--> 818 res_ser = self._constructor(res_values, index=self.index)
819 return res_ser.__finalize__(self, method="view")
File C:\Python310\lib\site-packages\pandas\core\series.py:442, in Series.__init__(self, data, index, dtype, name, copy, fastpath)
440 index = default_index(len(data))
441 elif is_list_like(data):
--> 442 com.require_length_match(data, index)
444 # create/copy the manager
445 if isinstance(data, (SingleBlockManager, SingleArrayManager)):
File C:\Python310\lib\site-packages\pandas\core\common.py:557, in require_length_match(data, index)
553 """
554 Check the length of data matches the length of the index.
555 """
556 if len(data) != len(index):
--> 557 raise ValueError(
558 "Length of values "
559 f"({len(data)}) "
560 "does not match length of index "
561 f"({len(index)})"
562 )
ValueError: Length of values (830) does not match length of index (415)
Any suggestions on how to resolve this?
Here are two ways to get this to work:
df3['rebounds'] = pd.Series(df3['temps'].view('int64').diff().fillna(0).div(1e9))
... or:
df3['rebounds'] = pd.Series(df3['temps'].astype('int64').diff().fillna(0).div(1e9))
For the following sample input:
df3.dtypes:
temps datetime64[ns]
dtype: object
df3:
temps
0 2022-01-01
1 2022-01-02
2 2022-01-03
... both of the above code samples give this output:
df3.dtypes:
temps datetime64[ns]
rebounds float64
dtype: object
df3:
temps rebounds
0 2022-01-01 0.0
1 2022-01-02 86400.0
2 2022-01-03 86400.0
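As an aside that is not part of the two fixes above: if the goal is just the number of seconds between consecutive rows, a sketch like the following avoids integer casts entirely by subtracting the datetimes directly and converting the resulting timedeltas. The sample frame is rebuilt here so the snippet runs on its own; the column names are the ones used in the question.

```python
import pandas as pd

# Rebuild the three-row sample input used above.
df3 = pd.DataFrame(
    {"temps": pd.to_datetime(["2022-01-01", "2022-01-02", "2022-01-03"])}
)

# diff() on a datetime64 column yields timedeltas (NaT for the first row);
# total_seconds() converts them to floats, and fillna(0) handles the first row.
df3["rebounds"] = df3["temps"].diff().dt.total_seconds().fillna(0).abs()

print(df3)
```

Because no integer dtype is named anywhere, this version should not depend on how a given platform interprets int.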
The issue is probably that view() essentially reinterprets the raw data of the existing series as a different data type. For this to work, according to the Series.view() docs (see also the numpy.ndarray.view() docs), the data types must have the same number of bytes. Since the original data is datetime64, your code specifying int as the argument to view() may not have met this requirement. Explicitly specifying int64 should meet it. Or, using astype() instead of view() with int64 will also work.
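The numbers in the error message fit this explanation: 830 is exactly twice 415, which is what you would expect if each 8-byte datetime64 value were being read as two 4-byte integers. Here is a minimal NumPy sketch of that effect. It uses a small made-up array rather than the data from the question, and it goes through an int64 view first so that each step is a plain same-size or half-size reinterpretation.

```python
import numpy as np

# Three timestamps stored as 8-byte datetime64[ns] values.
stamps = np.array(
    ["2022-01-01", "2022-01-02", "2022-01-03"], dtype="datetime64[ns]"
)

# Same item size (8 bytes): the length is preserved.
as_i64 = stamps.view(np.int64)

# Half the item size (4 bytes): every value is split in two,
# so the length doubles.
as_i32 = as_i64.view(np.int32)

print(len(stamps), len(as_i64), len(as_i32))
```

The doubled array no longer lines up with the original index, which is the mismatch that require_length_match() complains about in the traceback.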
As to why this works in Colab and not in Jupyter Notebook, I can't say. Perhaps they are using different versions of pandas and numpy which treat int differently.
I do know that in my environment, if I try the following:
df3['rebounds'] = pd.Series(df3['temps'].astype('int').diff().fillna(0).div(1e9))
... then I get this error:
TypeError: cannot astype a datetimelike from [datetime64[ns]] to [int32]
This suggests that int means int32. It would be interesting to see if this works on Colab.
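One way to test this guess is to run the same short check in both environments and compare the output. The sketch below only reports what the local installation does; the variable names are mine. As far as I know, NumPy releases before 2.0 map int to a 32-bit integer on Windows (the traceback paths in the question start with C:\) and to a 64-bit integer on Linux, which Colab runs on, but treat that as an assumption to verify rather than a given.

```python
import sys

import numpy as np
import pandas as pd

# Report the platform and library versions so the two environments
# can be compared side by side.
print("platform:", sys.platform)
print("numpy:", np.__version__, "pandas:", pd.__version__)

# The dtype that a bare `int` resolves to in this environment.
int_dtype = np.dtype(int)
int_itemsize = int_dtype.itemsize
print("int resolves to:", int_dtype, "with", int_itemsize, "bytes per item")

# datetime64[ns] values always occupy 8 bytes each.
datetime_itemsize = np.dtype("datetime64[ns]").itemsize
print("view(int) keeps the length:", int_itemsize == datetime_itemsize)
```

If the last line prints False locally and True on Colab, that would explain why the same code behaves differently in the two places.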