
How to import a PySpark dataframe from one Jupyter Notebook to another without converting it to csv?

Let's say that I have a dataframe called spark_df in a notebook called Notebook1 and I want to transfer it to a notebook called Notebook2. Obviously I can't do "from Notebook1.ipynb import spark_df", and I can't convert it to CSV because 1) it's too big and 2) I need a more direct approach.

I need to import it into another notebook because after I finish processing it and try to do anything else, the kernel dies. So how can I import spark_df into Notebook2 without converting it to CSV?

Since the data is too big to move in and out of disk as CSV, you can stream it from one Spark job to another. See the Structured Streaming Programming Guide.
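If both notebooks can reach the same filesystem (or an HDFS/S3 path), one way to set this up is to have Notebook1 write the DataFrame out as Parquet and have Notebook2 read that directory back as a stream, so the data is processed in micro-batches instead of being loaded into memory all at once. Below is a minimal sketch, assuming a shared path /tmp/shared/spark_df and a placeholder two-column schema; adjust both to your data.

# --- Notebook1: write the processed DataFrame to a shared location (Parquet, not CSV) ---
spark_df.write.mode("overwrite").parquet("/tmp/shared/spark_df")   # hypothetical shared path

# --- Notebook2: read that directory back as a stream and process it incrementally ---
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("Notebook2").getOrCreate()

# File-based streaming sources need an explicit schema; these fields are placeholders.
schema = StructType([
    StructField("id", StringType()),
    StructField("value", DoubleType()),
])

streaming_df = (
    spark.readStream
         .schema(schema)
         .parquet("/tmp/shared/spark_df")    # same shared path Notebook1 wrote to
)

# ... apply whatever transformations you need on streaming_df here ...

# Write the results out micro-batch by micro-batch instead of collecting everything,
# which is what tends to kill the kernel with one huge in-memory DataFrame.
query = (
    streaming_df.writeStream
                .format("parquet")
                .option("path", "/tmp/shared/processed")           # hypothetical output path
                .option("checkpointLocation", "/tmp/shared/_chk")  # required for file sinks
                .outputMode("append")
                .trigger(once=True)   # process what is currently there, then stop
                .start()
)
query.awaitTermination()

If you don't actually need the streaming semantics, a plain spark.read.parquet("/tmp/shared/spark_df") in Notebook2 achieves the same hand-off without ever going through CSV; the streaming read just keeps each batch small so the kernel isn't holding the whole dataset at once.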
