Subtract every two rows in a column with pandas

Question

I ran into the problem below. When I execute the following code on Google Colab, it works normally:

df['temps'] = df['temps'].view(int).div(1e9).diff().fillna(0).abs()
print(df)

But when I use Jupyter Notebook locally, the following error appears:

ValueError                                Traceback (most recent call last)
Input In [13], in <cell line: 1>()
----> 1 df3['rebounds'] = pd.Series(df3['temps'].view(int).div(1e9).diff().fillna(0))

File C:\Python310\lib\site-packages\pandas\core\series.py:818, in Series.view(self, dtype)
    815 # self.array instead of self._values so we piggyback on PandasArray
    816 #  implementation
    817 res_values = self.array.view(dtype)
--> 818 res_ser = self._constructor(res_values, index=self.index)
    819 return res_ser.__finalize__(self, method="view")

File C:\Python310\lib\site-packages\pandas\core\series.py:442, in Series.__init__(self, data, index, dtype, name, copy, fastpath)
    440     index = default_index(len(data))
    441 elif is_list_like(data):
--> 442     com.require_length_match(data, index)
    444 # create/copy the manager
    445 if isinstance(data, (SingleBlockManager, SingleArrayManager)):

File C:\Python310\lib\site-packages\pandas\core\common.py:557, in require_length_match(data, index)
    553 """
    554 Check the length of data matches the length of the index.
    555 """
    556 if len(data) != len(index):
--> 557     raise ValueError(
    558         "Length of values "
    559         f"({len(data)}) "
    560         "does not match length of index "
    561         f"({len(index)})"
    562     )

ValueError: Length of values (830) does not match length of index (415)

Any suggestions to resolve this?

Answer 1

Here are two ways to get this to work:

df3['rebounds'] = pd.Series(df3['temps'].view('int64').diff().fillna(0).div(1e9))

... or:

df3['rebounds'] = pd.Series(df3['temps'].astype('int64').diff().fillna(0).div(1e9))

For the following sample input:

df3.dtypes:

temps    datetime64[ns]
dtype: object

df3:

       temps
0 2022-01-01
1 2022-01-02
2 2022-01-03

... both of the above code samples give this output:

df3.dtypes:

temps       datetime64[ns]
rebounds           float64
dtype: object

df3:

       temps  rebounds
0 2022-01-01       0.0
1 2022-01-02   86400.0
2 2022-01-03   86400.0
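As a side note, the same sample can be handled without any integer conversion. The sketch below is an alternative technique, not one of the two fixes above: it relies on diff() returning timedeltas for a datetime column and converts them with dt.total_seconds(). The frame is rebuilt here from the three sample dates so the snippet runs on its own.

```python
import pandas as pd

# Rebuild the sample input shown above (three consecutive days).
df3 = pd.DataFrame({"temps": pd.to_datetime(["2022-01-01", "2022-01-02", "2022-01-03"])})

# diff() on a datetime64 column yields timedelta64 values (NaT in the first row);
# total_seconds() turns them into floats, and fillna(0) fills the first row.
df3["rebounds"] = df3["temps"].diff().dt.total_seconds().fillna(0)

print(df3)
```

Because no dtype is reinterpreted, this form does not depend on how the platform defines int.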

The issue is probably that view() essentially reinterprets the raw data of the existing series as a different data type. For this to work, according to the Series.view() docs (see also the numpy.ndarray.view() docs), the data types must have the same number of bytes. Since the original data is datetime64, your code specifying int as the argument to view() may not have met this requirement. Explicitly specifying int64 should meet it. Or, using astype() instead of view() with int64 will also work.
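The byte-size requirement can be seen directly at the NumPy level. The following is a minimal sketch using a plain NumPy array rather than the asker's DataFrame (the array name stamps is made up for illustration). Viewing 8-byte datetime64[ns] values as a 4-byte integer type splits every element in two, which would be consistent with the traceback reporting 830 values for an index of length 415.

```python
import numpy as np

# Three datetime64[ns] timestamps; each element occupies 8 bytes.
stamps = np.array(["2022-01-01", "2022-01-02", "2022-01-03"], dtype="datetime64[ns]")

# Same item size (8 bytes): one integer per timestamp,
# holding nanoseconds since the Unix epoch.
as_int64 = stamps.view("int64")

# Smaller item size (4 bytes): each timestamp is reinterpreted as two
# integers, so the result has twice as many elements as the input.
as_int32 = stamps.view("int32")

print(stamps.dtype.itemsize, len(stamps))
print(as_int64.dtype.itemsize, len(as_int64))
print(as_int32.dtype.itemsize, len(as_int32))
```

With int64 the length is preserved, so pandas can pair the values with the original index. With a 4-byte type the length doubles and the Series constructor rejects it.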

As to why this works in Colab and not in Jupyter Notebook, I can't say. Perhaps they are using different versions of pandas and numpy which treat int differently.

I do know that in my environment, if I try the following:

df3['rebounds'] = pd.Series(df3['temps'].astype('int').diff().fillna(0).div(1e9))

... then I get this error:

TypeError: cannot astype a datetimelike from [datetime64[ns]] to [int32]

This suggests that int means int32. It would be interesting to see if this works on Colab.
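One way to test this in each environment is to ask NumPy what the plain int type resolves to. The snippet below is a small diagnostic sketch, not something from the original thread. As background, NumPy releases before 2.0 map int to a 32-bit integer on Windows and to a 64-bit integer on most Linux systems, which would fit a Linux-hosted Colab runtime versus a local Windows install (the traceback paths start with C:\Python310). The thread itself does not confirm that this is the cause.

```python
import numpy as np
import pandas as pd

# Library versions, since behavior may differ between releases.
print("numpy", np.__version__, "| pandas", pd.__version__)

# What the builtin int resolves to on this platform and NumPy version.
int_dtype = np.dtype(int)
print("np.dtype(int) =", int_dtype, "| bytes per element =", int_dtype.itemsize)

# datetime64[ns] always uses 8 bytes per element.
dt_itemsize = np.dtype("datetime64[ns]").itemsize
print("same size as datetime64[ns]:", int_dtype.itemsize == dt_itemsize)
```

If the last line prints False, view(int) cannot keep the original length, and passing 'int64' explicitly is the safer choice.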
