
pySpark check if dataframe exists

Is there a way to check if a dataframe exists in pySpark?

I know that in R, to check if a dataframe exists:

exists(df_name) && is.data.frame(get(df_name))

How can this be done in pySpark, given that calling exists there throws an error?

It is the same as in plain Python: you can import the DataFrame type and test with isinstance.

 from pyspark.sql import DataFrame

 # sc is an existing SparkContext
 df = sc.parallelize([(1, 2, 3), (4, 5, 7)]).toDF(["a", "b", "c"])

 if df is not None and isinstance(df, DataFrame):
     # <some operation>
     print("dataframe exists")

Try this: df_name is not None and isinstance(df_name, DataFrame)
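As a quick illustration (the variable name here is made up), a bare check like that fails before isinstance is ever evaluated when the name was never assigned, because merely referencing an unbound name raises NameError:

```python
# some_undefined_df was never bound, so referencing it raises NameError
# and the isinstance() part of the check is never reached.
try:
    result = some_undefined_df is not None
except NameError as error:
    result = False
    print("NameError:", error)

print(result)  # False
```

This is why a robust check has to catch NameError (or look the name up in a namespace) rather than evaluate the variable directly.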

I think you want to know whether df_name is defined and points to a DataFrame. None of the answers above handle the case where df_name is not defined at all. This does:

from pyspark.sql import DataFrame

try:
  if df_name is not None and isinstance(df_name, DataFrame):
    print('df_name exists')

except NameError:
  print('df_name is not defined')
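An alternative sketch, not from the original answers: instead of catching NameError, look the name up in a namespace dictionary such as globals(). The helper name dataframe_exists is hypothetical; with Spark available you would additionally check isinstance against pyspark.sql.DataFrame:

```python
# Hypothetical helper: True if `name` is bound in `namespace` (e.g. globals())
# to a non-None object. With Spark you would also verify
# isinstance(namespace[name], DataFrame).
def dataframe_exists(name, namespace):
    return name in namespace and namespace[name] is not None

df = "stand-in for a real DataFrame"  # assumption: any bound, non-None object
print(dataframe_exists("df", globals()))          # True
print(dataframe_exists("missing_df", globals()))  # False
```

Looking the name up avoids exception handling entirely, at the cost of having to pass the right namespace (globals() vs. locals()) explicitly.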
