
Insert timestamp to Cassandra using Spark Scala

I am trying to read a file containing names and insert each name, together with a timestamp, into a Cassandra table using Spark and Scala. My code is below.

case class Names(name:String, auditDate:DateTime )

def main(args: Array[String]): Unit = {
    System.setProperty("hadoop.home.dir", "D:\\backup\\lib\\winutils");
    val conf = new SparkConf()
      .set("spark.cassandra.connection.host", "172.16.109.202")
      //.set("spark.cassandra.connection.host", "192.168.1.17")
      .setAppName("CassandraLoader")
      .setMaster("local")
    var context = new SparkContext(conf)

    var namesFile = context.textFile("src/main/resources/names.txt")

    namesFile.map(x=>Names(x,DateTime.now()))
      .saveToCassandra("practice","names",SomeColumns("name", "insert_date"))

}

The Cassandra table is defined as follows:

CREATE TABLE practice.names (
    name text PRIMARY KEY,
    insert_date timestamp
)

When I run the code, I get the following error:

Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: Columns not found in com.sample.practice.Names: [insert_date]
    at scala.Predef$.require(Predef.scala:233)
    at com.datastax.spark.connector.mapper.DefaultColumnMapper.columnMapForWriting(DefaultColumnMapper.scala:108)
    at com.datastax.spark.connector.writer.MappedToGettableDataConverter$$anon$1.<init>(MappedToGettableDataConverter.scala:29)
    at com.datastax.spark.connector.writer.MappedToGettableDataConverter$.apply(MappedToGettableDataConverter.scala:20)
    at com.datastax.spark.connector.writer.DefaultRowWriter.<init>(DefaultRowWriter.scala:17)
    at com.datastax.spark.connector.writer.DefaultRowWriter$$anon$1.rowWriter(DefaultRowWriter.scala:31)
    at com.datastax.spark.connector.writer.DefaultRowWriter$$anon$1.rowWriter(DefaultRowWriter.scala:29)
    at com.datastax.spark.connector.writer.TableWriter$.apply(TableWriter.scala:271)
    at com.datastax.spark.connector.RDDFunctions.saveToCassandra(RDDFunctions.scala:36)
    at com.sample.practice.CqlInsertDate$.main(CqlInsertDate.scala:30)
    at com.sample.practice.CqlInsertDate.main(CqlInsertDate.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)

When I print the RDD instead of saving it to Cassandra, I get the following output:

Names(Frank,2017-01-30T14:03:16.911+05:30)
Names(Jean,2017-01-30T14:03:17.115+05:30)
Names(Joe,2017-01-30T14:03:17.116+05:30)

These are the details of my SBT file:

version := "1.0"

scalaVersion := "2.10.6"

libraryDependencies += "com.datastax.spark" % "spark-cassandra-connector_2.10" % "2.0.0-M3"

libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "2.0.2"

libraryDependencies += "org.apache.spark" % "spark-sql_2.10" % "2.0.2"

libraryDependencies += "org.apache.spark" % "spark-hive_2.10" % "2.0.2"

I am using Cassandra 2.1. Please help. Thanks in advance.

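Try renaming your class field to insert_date or, the other way round, renaming the table column to auditDate (and updating SomeColumns to match). The error says that no field in Names corresponds to the insert_date column: the case class field is called auditDate, so the connector cannot map one to the other.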
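A minimal sketch of the first option, keeping the table from the question unchanged. It assumes that `DateTime` is `org.joda.time.DateTime` (the question does not show its imports) and that the enclosing object is `CqlInsertDate`, as the stack trace suggests. The host and file path are copied from the question, and it needs a reachable Cassandra node to run.

```scala
import com.datastax.spark.connector._
import org.apache.spark.{SparkConf, SparkContext}
import org.joda.time.DateTime // assumption: the question's DateTime is Joda-Time

// The field is renamed from auditDate so that it matches the
// insert_date column of practice.names.
case class Names(name: String, insert_date: DateTime)

object CqlInsertDate {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .set("spark.cassandra.connection.host", "172.16.109.202")
      .setAppName("CassandraLoader")
      .setMaster("local")
    val context = new SparkContext(conf)

    // One row per line of the input file, stamped with the current time.
    context.textFile("src/main/resources/names.txt")
      .map(name => Names(name, DateTime.now()))
      .saveToCassandra("practice", "names", SomeColumns("name", "insert_date"))

    context.stop()
  }
}
```

The connector documentation also describes matching camelCase field names to underscore column names, so a field called `insertDate` should work as well, and it describes an alias form, `SomeColumns("name", "insert_date" as "auditDate")`, for cases where the field cannot be renamed. Neither variant was tested here, so check both against your connector version.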
