
How to Use DataFrame Created in Scala in Databricks' PySpark

My Databricks notebook is in Python. Some cells in the notebook are written in Scala (using the %scala magic command), and one of them creates a DataFrame.

When I switch back to Python/PySpark (the default mode), how can I access the DataFrame that was created in the Scala cell?

Is it even possible?

Thanks

You can access DataFrames created in one language from another language through temporary views in Spark SQL.

For instance, say you have a DataFrame in Scala called scalaDF. You can create a temporary view of it and make it accessible from a Python cell:

scalaDF.createOrReplaceTempView("my_table")
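
For context, here is a minimal sketch of what the full %scala cell could look like, using a hypothetical toy dataset (in practice, scalaDF is whatever DataFrame your Scala code builds):

%scala
// spark is predefined in Databricks notebooks; the import enables .toDF on Seq
import spark.implicits._

// Hypothetical sample data standing in for your real DataFrame
val scalaDF = Seq(
  (1, "alpha"),
  (2, "beta")
).toDF("id", "label")

// Register a temporary view so cells in other languages can query it
scalaDF.createOrReplaceTempView("my_table")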

Then in a Python cell you can run

pythonDF = spark.sql("select * from my_table")

pythonDF.show()

The same approach works for passing DataFrames between these languages and R; the common construct is a Spark SQL temporary view.
