Unhashable type: Series when using modin with pandas?

I am using Anaconda on Windows 10; I installed the packages as follows:

conda install -c anaconda dask
conda install -c conda-forge modin
conda update conda
conda update anaconda
conda update dask
conda install -c conda-forge pandas=1.0.5  # this will also download modin 0.7.4-py_0 --> 0.8.0-py_0

Now consider the following example:

#!/usr/bin/env python3

import io

USEDASK=False

if not USEDASK:
  import pandas as pd
else:
  from dask.distributed import Client # SO:48067066
  client = Client(processes=False)  # create scheduler and worker automatically
  #os.environ["MODIN_ENGINE"] = "dask"  # Modin will use Dask
  import modin.pandas as pd

my_csv_str = """Time[s], Channel 0
0.000000000000000, -0.736680805683136
0.000008000000000, -0.726485192775726
0.000016000000000, -0.721387386322021
0.000024000000000, -0.711191773414612
0.000032000000000, -0.700996160507202
0.000040000000000, -0.690800547599792
0.000048000000000, -0.670409321784973
0.000056000000000, -0.655115902423859
"""
my_csv_io = io.StringIO()
my_csv_io.write(my_csv_str)
my_csv_io.seek(0)

my_df = pd.read_csv(my_csv_io)
my_df.index = pd.to_timedelta(my_df.iloc[:,0], unit='s')
print(my_df)

With USEDASK=False, everything works as expected.

With USEDASK=True, I get the following failure:

python test\test.py
UserWarning: The Dask Engine for Modin is experimental.
UserWarning: Parameters provided defaulting to pandas implementation.
To request implementation, send an email to feature_requests@modin.org.
Traceback (most recent call last):
  File "test\test.py", line 30, in <module>
    my_df.index = pd.to_timedelta(my_df.iloc[:,0], unit='s')
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\tools\timedeltas.py", line 102, in to_timedelta
    return _convert_listlike(arg, unit=unit, errors=errors)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\tools\timedeltas.py", line 140, in _convert_listlike
    value = sequence_to_td64ns(arg, unit=unit, errors=errors, copy=False)[0]
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\arrays\timedeltas.py", line 961, in sequence_to_td64ns
    data[mask] = iNaT
  File "C:\ProgramData\Anaconda3\lib\site-packages\modin\pandas\series.py", line 337, in __setitem__
    if key not in self.keys():
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\range.py", line 338, in __contains__
    hash(key)
TypeError: unhashable type: 'Series'

Is there any way to get this code working with modin+dask?

This is not a great solution, but it is a workaround that should work:

my_df.index = pd.to_timedelta(my_df.iloc[:,0].values, unit='s')

This works whether USEDASK is True or False.
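
Going by the traceback above, the failure seems to happen when pandas' own to_timedelta code does data[mask] = iNaT on the modin Series it was handed, which ends up in modin's __setitem__ and tries to hash the mask. Passing a plain NumPy array via .values keeps modin objects out of that code path. Below is a minimal sketch of the workaround in isolation. It imports plain pandas so that it runs without modin or dask installed (swapping in modin.pandas is an assumption about your setup), and the names csv_text, df and seconds are only illustrative:

```python
import io

import numpy as np
import pandas as pd  # assumption: replace with "import modin.pandas as pd" to try it under modin

# Shortened copy of the sample data from the question.
csv_text = """Time[s], Channel 0
0.000000000000000, -0.736680805683136
0.000008000000000, -0.726485192775726
0.000016000000000, -0.721387386322021
"""

df = pd.read_csv(io.StringIO(csv_text))

# .values hands back a plain NumPy array instead of a Series,
# so to_timedelta never has to index into a Series object.
seconds = df.iloc[:, 0].values
print(type(seconds))

# Convert the seconds column to timedeltas and use them as the index.
df.index = pd.to_timedelta(seconds, unit="s")
print(df)
```

The cost is that the first column is materialized as an in-memory NumPy array, which is fine for an index column of this size but gives up any distributed handling of that column.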
