Using Spark to read specific columns from HBase
I have a table in HBase named "orders" with column family 'o' and columns {id, fname, lname, email}, with id as the row key. I am trying to fetch only the values of fname and email from HBase using Spark.
Currently what I am doing is given below:
    override def put(params: scala.collection.Map[String, Any]): Boolean = {
      val sparkConfig = new SparkConf().setAppName("Connector")
      val sc: SparkContext = new SparkContext(sparkConfig)
      val hbaseConfig = HBaseConfiguration.create()
      hbaseConfig.set("hbase.zookeeper.quorum", ZookeeperQourum)
      hbaseConfig.set("hbase.zookeeper.property.clientPort", zookeeperPort)
      hbaseConfig.set(TableInputFormat.INPUT_TABLE, schemdto.tableName)
      hbaseConfig.set(TableInputFormat.SCAN_COLUMNS, "o:fname,o:email")
      val hBaseRDD = sc.newAPIHadoopRDD(hbaseConfig, classOf[TableInputFormat],
        classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
        classOf[org.apache.hadoop.hbase.client.Result])
      try {
        hBaseRDD.map(tuple => tuple._2)
          .map(result => result.raw())
          .map(f => KeyValueToString(f))
          .saveAsTextFile(sink)
        true
      } catch {
        case _: Exception => false
      }
    }
    def KeyValueToString(keyValues: Array[KeyValue]): String = {
      val it = keyValues.iterator
      val res = new StringBuilder
      while (it.hasNext) {
        res.append(Bytes.toString(it.next.getValue()) + ",")
      }
      res.substring(0, res.length - 1)
    }
But nothing is returned, and if I try to fetch only one column, such as
    hbaseConfig.set(TableInputFormat.SCAN_COLUMNS, "o:fname")
then it returns all the values of the fname column.
So my question is: how do I get multiple columns from HBase using Spark?
Any help will be appreciated.
According to the documentation, the list of columns to scan needs to be space-delimited:
    hbaseConfig.set(TableInputFormat.SCAN_COLUMNS, "o:fname o:email")
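As a minimal sketch of the fix, the space-delimited string can also be built from a list of qualifiers, which avoids the comma mistake when the column set grows (the family `o` and qualifiers come from the question; the `hbaseConfig.set` line at the end is shown commented because it needs the Hadoop/HBase classes from the code above):

```scala
// TableInputFormat.SCAN_COLUMNS expects one space-delimited string of
// "family:qualifier" entries, not a comma-delimited one.
val columnFamily = "o"
val wanted = Seq("fname", "email")
val scanColumns = wanted.map(c => s"$columnFamily:$c").mkString(" ")
println(scanColumns) // prints "o:fname o:email"
// hbaseConfig.set(TableInputFormat.SCAN_COLUMNS, scanColumns)
```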