
Using case class versus StructType in Spark Scala

When should I use StructType, and when should I use a case class? I am trying to create a Spark Dataset. I have an input CSV file, and I am trying to create a DataFrame first and then convert it to a Dataset using df.as[]. Now, in order to generate the schema, should I use StructType or a case class? Please help.

You don't have to use StructType when reading your CSV file, but:

  • By default all fields are read as strings unless you enable the inferSchema option
  • If the file has no header, you have to name every field yourself, like this (a fuller sketch follows the example below):

    sparkSession.read.csv("my/csv/path.csv").toDF("id","product","customer","time").as[Transaction]
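Here is a minimal sketch contrasting the two approaches. The Transaction field names and types, the file path, and the assumption that the CSV has no header are all illustrative, not taken from the question:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.types._

    // Hypothetical case class matching the four CSV columns.
    case class Transaction(id: Int, product: String, customer: String, time: String)

    object CsvToDataset {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("csv-to-dataset")
          .master("local[*]")
          .getOrCreate()
        import spark.implicits._

        // Option 1: explicit StructType schema. No inference pass over the
        // file; column names and types are pinned down up front.
        val schema = StructType(Seq(
          StructField("id", IntegerType, nullable = false),
          StructField("product", StringType),
          StructField("customer", StringType),
          StructField("time", StringType)
        ))
        val ds1 = spark.read.schema(schema).csv("my/csv/path.csv").as[Transaction]

        // Option 2: let Spark infer the types, then name the columns and
        // convert to a typed Dataset via the case class encoder.
        val ds2 = spark.read
          .option("inferSchema", "true")
          .csv("my/csv/path.csv")
          .toDF("id", "product", "customer", "time")
          .as[Transaction]

        ds1.show()
        ds2.show()
        spark.stop()
      }
    }

Either way, .as[Transaction] only needs the DataFrame's column names and types to line up with the case class fields; the explicit StructType avoids the extra scan of the file that inferSchema performs, which can matter on large inputs.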


 