
Scala Apache Spark Cassandra table list

I want to list the tables of a keyspace in a Cassandra database using Apache Spark. I can access any Cassandra table with sc.cassandraTable("keyspace", "table"), but I can't list all the tables in a keyspace, and I want to loop over them. This is my code:

import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._  // adds cassandraTable to SparkContext

val conf = new SparkConf(true)
  .setAppName("Backup app").setMaster("local[4]")
  .set("spark.cassandra.connection.host", "XXXXX")
  .set("spark.cassandra.auth.username", "XXXX")
  .set("spark.cassandra.auth.password", "XXXXX")
  .setJars(Array("./lib/spark-cassandra-connector-assembly-2.0.2-39-g24f392d.jar"))

val sc = new SparkContext(conf)

// Print the "salt" column of every row whose role is "user".
sc.cassandraTable("keyspace", "userstable").select("salt").where("role = ?", "user").collect().toList.foreach {
  userkeyspace => println(userkeyspace)
}

How can I do it?

I found the solution. Here is the code that works for me (I have a table stb.users where I store, among other things, each user's keyspace in the "salt" column):

import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._
import com.datastax.spark.connector.cql.CassandraConnector

val conf = new SparkConf(true)
  .setAppName("Backup app").setMaster("local[4]")
  .set("spark.cassandra.connection.host", "XXXX")
  .set("spark.cassandra.auth.username", "XXXX")
  .set("spark.cassandra.auth.password", "XXXX")
  .setJars(Array("./lib/spark-cassandra-connector-assembly-2.0.2-39-g24f392d.jar"))

val sc = new SparkContext(conf)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)

// Open a driver session so the cluster metadata can be read.
CassandraConnector(conf).withSessionDo { session =>
  sc.cassandraTable("stb", "users").select("salt").where("role = ?", "user").collect().toList.foreach {
    user =>
      // The "salt" column holds the name of the user's keyspace.
      val userSalt = user.getString("salt")
      val iterator = session.getCluster.getMetadata.getKeyspace(userSalt).getTables().iterator()
      while (iterator.hasNext) {
        val tableName = iterator.next().getName
        println(userSalt + " " + tableName)
      }
  }
}

You can use the system.schema_columnfamilies table; it contains the list of tables in each keyspace.

SELECT keyspace_name, columnfamily_name FROM system.schema_columnfamilies;

or

sc.cassandraTable("system", "schema_columnfamilies").select("columnfamily_name").where("keyspace_name = ?", "the_keyspace")
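
One caveat worth sketching: system.schema_columnfamilies exists on Cassandra 2.x, while Cassandra 3.0 and later keep the same information in system_schema.tables under the column table_name. The helper below is only an illustration of picking the right source; the names SchemaTables, Source, sourceFor and listTablesCql are hypothetical and not part of the connector, and it assumes you already know the server's major version.

```scala
object SchemaTables {
  // Where a given Cassandra version keeps its per-keyspace table list.
  final case class Source(keyspace: String, table: String, nameColumn: String)

  // Assumption: the caller already knows the server's major version.
  def sourceFor(cassandraMajorVersion: Int): Source =
    if (cassandraMajorVersion >= 3) Source("system_schema", "tables", "table_name")
    else Source("system", "schema_columnfamilies", "columnfamily_name")

  // CQL text for the same lookup, handy for checking the result in cqlsh.
  def listTablesCql(src: Source, keyspace: String): String =
    s"SELECT ${src.nameColumn} FROM ${src.keyspace}.${src.table} " +
      s"WHERE keyspace_name = '$keyspace';"
}
```

With src = SchemaTables.sourceFor(3), the connector call from the answer would presumably become sc.cassandraTable(src.keyspace, src.table).select(src.nameColumn).where("keyspace_name = ?", "the_keyspace"), and calling getString(src.nameColumn) on each collected row gives the table names to loop over.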
