
Lazy Cassandra load with Spark

I want to know if it is good practice to load a Cassandra table lazily and then apply a where clause to it later.

For example:

lazy val table = sparkContext.cassandraTable[Type](keyspace, tableName)

---other part of the code---

table.where("column = ?", param)

Thanks!

All RDDs are lazy by default: they won't actually do anything until you call an action. So don't add `lazy` — it only delays the creation of the metadata around your RDD and does not affect execution.

Example

val table = sparkContext.cassandraTable[Type](keyspace, tableName)
val tableWithWhere = table.where("x = 5")
val tableTransformed = table.map( (x: Type) => turnXIntoY(x) )
// nothing has happened in C* or on the Spark executors yet
tableTransformed.collect // this action causes Spark to start doing work
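The transformation/action split above can be sketched in plain Scala without Spark at all — a toy illustration only, where `Pipeline` and its methods are invented for this sketch and are not Spark or connector API:

```scala
// Toy sketch of Spark-style lazy evaluation: transformations only record
// work; an action (collect) triggers execution. Plain Scala, no Spark.
var executed = false

final case class Pipeline[A](run: () => Seq[A]) {
  def map[B](f: A => B): Pipeline[B] = Pipeline(() => run().map(f))        // transformation: records work
  def filter(p: A => Boolean): Pipeline[A] = Pipeline(() => run().filter(p)) // transformation: records work
  def collect(): Seq[A] = run()                                             // action: runs the pipeline
}

val source = Pipeline { () => executed = true; Seq(1, 2, 3, 4) }
val transformed = source.filter(_ % 2 == 0).map(_ * 10)
assert(!executed)                 // nothing has run yet, like an un-collected RDD
val result = transformed.collect()
assert(executed)                  // the action triggered the source read
println(result)
```

The same reasoning is why wrapping the RDD in `lazy val` buys nothing: the RDD itself already behaves like `transformed` above, deferring all work until an action is called.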

