How to log Spark Dataset printSchema output at info/debug level in a Spark Java project

I am trying to convert my Spark Scala project into a Spark Java project. I have logging in Scala as below:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

    class ClassName{
      val logger  = LoggerFactory.getLogger("ClassName")
      ...
      val dataframe1 = ....///read dataframe from text file.
      ...

      logger.debug("dataframe1.printSchema : \n " + dataframe1.printSchema); //this is working fine.
    }

Now I am trying to write it in Java 1.8 as below:

public class ClassName{

    public static final Logger logger  = LoggerFactory.getLogger("ClassName");
      ...
     Dataset<Row> dataframe1 = ....///read dataframe from text file.
     ...

     logger.debug("dataframe1.printSchema : \n " + dataframe1.printSchema()); //this is not working 

}

I tried several ways, but nothing worked to log the printSchema output at debug/info level.

dataframe1.printSchema() // this actually returns void, so it cannot be appended to a string.

How is logging actually done in production-grade Spark Java projects? What is the best approach to follow for debug logging?

How do I handle the above scenario, i.e. log.debug(dataframe1.printSchema()), in Java?

The printSchema method prints the schema to the console and does not return it in any form. You can simply call the method and redirect the console output somewhere else. Other workarounds exist as well.
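A minimal sketch of the "redirect console output" idea in plain Java follows. The class name StdoutCapture and the capture helper are hypothetical, chosen only for illustration. Two caveats apply. First, printSchema is implemented in Scala, and Scala's Console may hold its own reference to the original output stream, so swapping System.out is not guaranteed to capture it; check this in your own environment. Second, System.out is global, so this is not safe when other threads are printing at the same time.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class StdoutCapture {

    /**
     * Runs the action with System.out temporarily pointed at an in-memory
     * buffer and returns whatever the action printed.
     */
    public static String capture(Runnable action) {
        PrintStream original = System.out;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream redirected = new PrintStream(buffer, true);
        System.setOut(redirected);
        try {
            action.run();
        } finally {
            // Always restore the real console, even if the action throws.
            System.setOut(original);
            redirected.flush();
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        // Stand-in for a void method that only prints to the console.
        String text = capture(() -> System.out.println("root"));
        System.out.print("captured: " + text);
    }
}
```

The returned string can then be passed to logger.debug(...) like any other message.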

You can use df.schema.treeString. It returns a String, whereas df.printSchema returns Unit (the Scala equivalent of void in Java). This is true in Scala, and I believe it is the same in Java, where the call would be written df.schema().treeString(). Let me know if that helps.

scala> val df = Seq(1, 2, 3).toDF()
df: org.apache.spark.sql.DataFrame = [value: int]

scala> val x = df.schema.treeString
x: String =
"root
 |-- value: integer (nullable = false)
"

scala> val y = df.printSchema
root
 |-- value: integer (nullable = false)

y: Unit = ()
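For the Java side of the question, here is a minimal sketch of the same idea, assuming Spark SQL and an SLF4J binding are on the classpath. The class name SchemaLogging, the local master setting, and the small spark.range dataset are illustrative stand-ins that do not come from the original post; substitute your own Dataset<Row>.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class SchemaLogging {

    private static final Logger logger = LoggerFactory.getLogger(SchemaLogging.class);

    public static void main(String[] args) {
        // Local session for illustration only; a real job would get its
        // master from spark-submit or the cluster configuration.
        SparkSession spark = SparkSession.builder()
                .appName("SchemaLogging")
                .master("local[*]")
                .getOrCreate();

        // Stand-in dataset; replace with the DataFrame read from your text file.
        Dataset<Row> dataframe1 = spark.range(3).toDF("value");

        // schema().treeString() returns the same text that printSchema()
        // prints, but as a String that can be handed to the logger.
        if (logger.isDebugEnabled()) {
            logger.debug("dataframe1 schema:\n{}", dataframe1.schema().treeString());
        }

        spark.stop();
    }
}
```

The isDebugEnabled() check is optional; it only avoids building the tree string when debug logging is switched off.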
