![](/img/trans.png)
[英]Modin with ray for pandas working in command prompt but not on Idle, no error code
[英]Pandas Modin ray library fails to startup
我正在尝试使用modin加速我的 pandas 数据处理
import os
os.environ["MODIN_ENGINE"] = "ray"
import modin.pandas as pd
df = pd.read_csv(r"C:\Users\Harshad\Documents\Files\Data\Pre-processed\data.csv", low_memory=False)
我收到以下警告和错误:
UserWarning: Ray execution environment not yet initialized. Initializing...
To remove this warning, run the following python code before doing dataframe operations:
import ray
ray.init()
Traceback (most recent call last):
File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\ray\node.py", line 240, in __init__
self.redis_password)
File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\ray\_private\services.py", line 328, in wait_for_node
raise TimeoutError("Timed out while waiting for node to startup.")
TimeoutError: Timed out while waiting for node to startup.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/Harshad/Documents/Code/data.py", line 18, in <module>
low_memory=False)
File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\modin\pandas\io.py", line 135, in read_csv
return _read(**kwargs)
File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\modin\pandas\io.py", line 58, in _read
Engine.subscribe(_update_engine)
File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\modin\config\pubsub.py", line 213, in subscribe
callback(cls)
File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\modin\pandas\__init__.py", line 127, in _update_engine
initialize_ray()
File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\modin\core\execution\ray\common\utils.py", line 185, in initialize_ray
ray.init(**ray_init_kwargs)
File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\ray\_private\client_mode_hook.py", line 105, in wrapper
return func(*args, **kwargs)
File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\ray\worker.py", line 922, in init
ray_params=ray_params)
File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\ray\node.py", line 243, in __init__
"The current node has not been updated within 30 "
Exception: The current node has not been updated within 30 seconds, this could happen because of some of the Ray processes failed to startup.
虽然我已经清楚地重新运行代码,但它们之间的时间间隔超过 30 秒。
当我在安装 modin 和 ray 后第一次运行它时,它运行得相当好,只有以下警告:
UserWarning: Ray execution environment not yet initialized. Initializing...
To remove this warning, run the following python code before doing dataframe operations:
import ray
ray.init()
然后我将代码修改为:
import os
os.environ["MODIN_ENGINE"] = "ray"
import modin.pandas as pd
import ray
ray.init()
df = pd.read_csv(r"C:\Users\Harshad\Documents\Files\Data\Pre-processed\data.csv", low_memory=False)
我收到此错误:
Traceback (most recent call last):
File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\ray\node.py", line 240, in __init__
self.redis_password)
File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\ray\_private\services.py", line 328, in wait_for_node
raise TimeoutError("Timed out while waiting for node to startup.")
TimeoutError: Timed out while waiting for node to startup.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/Harshad/Documents/Code/data.py", line 18, in <module>
low_memory=False)
File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\modin\pandas\io.py", line 135, in read_csv
return _read(**kwargs)
File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\modin\pandas\io.py", line 58, in _read
Engine.subscribe(_update_engine)
File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\modin\config\pubsub.py", line 213, in subscribe
callback(cls)
File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\modin\pandas\__init__.py", line 127, in _update_engine
initialize_ray()
File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\modin\core\execution\ray\common\utils.py", line 185, in initialize_ray
ray.init(**ray_init_kwargs)
File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\ray\_private\client_mode_hook.py", line 105, in wrapper
return func(*args, **kwargs)
File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\ray\worker.py", line 922, in init
ray_params=ray_params)
File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\ray\node.py", line 243, in __init__
"The current node has not been updated within 30 "
Exception: The current node has not been updated within 30 seconds, this could happen because of some of the Ray processes failed to startup
当我查看这个问题的 Github 时,发现它是一个错误
如何解决这些警告和错误?
编辑:我重新启动了我的 pycharm 环境,它允许重新运行一个周期。 这表明它是 Pycharm/环境问题?
我该如何解决这个问题?
在 import modin
之前尝试init
ing ray
:
import os
os.environ["MODIN_ENGINE"] = "ray"
import ray
ray.init()
import modin.pandas as pd
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.