
pyspark toPandas Error?

Original post: 2017-06-06 11:11:17 9 2 python/ python-2.7/ pyspark/ spark-dataframe

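I have a messy and very large dataset containing Chinese characters, numbers, strings, dates, and so on. After I did some cleaning with pyspark and converted the result to pandas, it raised the following error: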
IOPub data rate exceeded. The notebook server will temporarily stop sending output to the client in order to avoid crashing it. To change this limit, set the config variable --NotebookApp.iopub_data_rate_limit.
17/06/06 18:48:54 WARN TaskSetManager: Lost task 8.0 in stage 13.0 (TID 393, localhost): TaskKilled (killed intentionally)

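Above the error, it printed some of my original data, which is very long, so I have posted only part of it.

I have checked the cleaned data. All column types are int or double. Why does it still output my old data?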

2 Answers

Try starting Jupyter Notebook with 'iopub_data_rate_limit' increased:

jupyter notebook --NotebookApp.iopub_data_rate_limit=10000000000

Source: https://github.com/jupyter/notebook/issues/2287
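
If you would rather not pass the flag on every launch, the same option can probably be set in the Notebook config file instead. This is only a sketch, assuming the classic Notebook server and a config file at ~/.jupyter/jupyter_notebook_config.py (one can be generated with `jupyter notebook --generate-config`); the value is simply copied from the command above, not a tuned recommendation.

```python
# ~/.jupyter/jupyter_notebook_config.py  (assumed location, adjust to your setup)
# Jupyter injects get_config() when it loads this file; it is not defined
# if you run the file directly with the Python interpreter.
c = get_config()

# Same limit as in the command-line flag above, in bytes per second.
c.NotebookApp.iopub_data_rate_limit = 10000000000
```

Restart the notebook server after editing the file so the new limit takes effect.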

The best way is to put this in your jupyterhub_config.py file:

c.Spawner.args = ['--NotebookApp.iopub_data_rate_limit=1000000000']

