
Spark-Scala with Cassandra

I am a beginner with Spark, Scala, and Cassandra, and I work on ETL programming. My current ETL proof of concept (POC) requires Spark, Scala, and Cassandra. I installed Cassandra on my Ubuntu system under /usr/local/Cassandra/*, and then installed Spark and Scala. I am now using the Scala editor for my work. So far I have simply loaded a file into the landing location, but when I try to connect to Cassandra from Scala I cannot find any help on how to connect and process the data into the destination database.

Can anyone tell me whether this is the correct approach, or where I am going wrong? Please help me understand how to achieve this with the combination above.

Thanks in advance!

You can do this easily with spark-cassandra-connector.

First add spark-cassandra-connector to your pom.xml or build.sbt, following the connector's instructions.
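For sbt, the dependency declaration might look like the sketch below. The version numbers are assumptions chosen to match Spark 2.2 on Scala 2.11; check the connector's compatibility table for the versions that fit your cluster.

```scala
// build.sbt (sketch; all version numbers here are assumptions, not taken from the question)
scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  // Spark SQL provides SparkSession and the DataFrame API
  "org.apache.spark" %% "spark-sql" % "2.2.0",
  // DataStax connector that adds cassandraFormat to DataFrame readers and writers
  "com.datastax.spark" %% "spark-cassandra-connector" % "2.0.5"
)
```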

Then import the following in your file:

import org.apache.spark.sql.SparkSession
import org.apache.spark.SparkConf
import org.apache.spark.sql.cassandra._

The Spark Scala file:

object SparkCassandraConnector {
  def main(args: Array[String]): Unit = {

    val conf = new SparkConf(true)
      .setAppName("UpdateCassandra")
      .setMaster("spark://spark:7077") // Spark master URL
      .set("spark.cassandra.input.split.size_in_mb", "64") // value is in MB, not bytes
      .set("spark.cassandra.connection.host", "192.168.3.167") // Cassandra host
      .set("spark.cassandra.auth.username", "cassandra")
      .set("spark.cassandra.auth.password", "cassandra")

    // Create the Spark session used for Cassandra reads and SQL queries
    val spark = SparkSession.builder()
      .config(conf)
      .getOrCreate()

    // Load data from a Cassandra table into a DataFrame
    val df = spark
      .read
      .cassandraFormat("table_name", "keyspace_name")
      .load()

    // Print the first rows to confirm the connection works
    df.show()
  }
}

This works with Spark 2.2 and Cassandra 2.
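The question also asks about getting the landed file into the destination database. A minimal sketch of that direction follows, assuming the file is a CSV with a header row whose names match the columns of an existing Cassandra table. The path, keyspace, and table names are hypothetical placeholders, and the master URL is expected to come from spark-submit rather than from the code.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}
import org.apache.spark.sql.cassandra._

object LoadFileToCassandra {
  // Hypothetical placeholders: adjust to your landing location and schema
  val landingPath = "/data/landing/input.csv"
  val keyspace = "keyspace_name"
  val table = "table_name"

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("LoadFileToCassandra")
      .config("spark.cassandra.connection.host", "192.168.3.167") // Cassandra host
      .getOrCreate()

    // Read the landed file; assumes CSV with a header row
    val input: DataFrame = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv(landingPath)

    // Append the rows to a Cassandra table that already exists
    input.write
      .cassandraFormat(table, keyspace)
      .mode(SaveMode.Append)
      .save()

    spark.stop()
  }
}
```

The target table must already exist with matching column names; any transformation step of the ETL can be applied to `input` before the write.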
