Problems using Hive + Cassandra community
I am trying to use Hive 0.13 to access a Cassandra 2.0.8 column family created with CQL3.
This is how I created the column family:
CREATE KEYSPACE IF NOT EXISTS Identification
WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy',
'DC1' : 2 };
USE Identification;
CREATE TABLE IF NOT EXISTS entitylookup (
name varchar,
value varchar,
entity_id uuid,
PRIMARY KEY ((name, value), entity_id))
WITH
caching=all
;
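I followed the instructions in the project's README: https://github.com/tuplejump/cash/tree/master/cassandra-handler
I built hive-cassandra-1.2.6.jar and copied it, together with cassandra-all-1.2.6.jar and cassandra-thrift-1.2.6.jar, into the Hive lib folder.
Then I started Hive and tried the following: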
CREATE EXTERNAL TABLE identification.entitylookup(name string, value string, entity_id binary)
STORED BY 'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler' WITH SERDEPROPERTIES("cql.primarykey" = "name, value", "cassandra.host" = "localhost", "cassandra.port "= "9160")
TBLPROPERTIES ("cassandra.ks.name" = "identification", "cassandra.ks.stratOptions"="'DC1':2", "cassandra.ks.strategy"="NetworkTopologyStrategy");
This is the output:
hive> mvalle@mvalle:~/hadoop$ hive
14/05/30 12:02:02 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/05/30 12:02:02 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/05/30 12:02:02 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
14/05/30 12:02:02 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
14/05/30 12:02:02 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/05/30 12:02:02 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
14/05/30 12:02:02 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/05/30 12:02:02 INFO Configuration.deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
Logging initialized using configuration in jar:file:/home/mvalle/hadoop/apache-hive-0.13.0-bin/lib/hive-common-0.13.0.jar!/hive-log4j.properties
OpenJDK 64-Bit Server VM warning: You have loaded library /home/mvalle/hadoop/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
hive> CREATE EXTERNAL TABLE identification.entitylookup(name string, value string, entity_id binary)
> STORED BY 'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler' WITH SERDEPROPERTIES("cql.primarykey" = "name, value", "cassandra.host" = "ident.s1mbi0se.com", "cassandra.port "= "9160")
> TBLPROPERTIES ("cassandra.ks.name" = "identification", "cassandra.ks.stratOptions"="'DC1':2", "cassandra.ks.strategy"="NetworkTopologyStrategy");
FAILED: SemanticException [Error 10072]: Database does not exist: identification
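Question: how can I get more information about what is going wrong? I tried the same Hive command with "Identification" (capital I) and got the same result. Is it possible to access CQL3 column families from Hive with Cassandra community? It seems the keyspace has not been mapped, but I don't know how to map it. In DSE, keyspaces are mapped automatically...
EDIT:
To clarify further, if I create an empty database and then try to create the external table, this is what I get: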
hive> create database identification;
OK
Time taken: 0.154 seconds
hive> CREATE EXTERNAL TABLE identification.entity_lookup(name string, value string, entity_id binary)
> STORED BY 'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler' WITH SERDEPROPERTIES("cql.primarykey" = "name, value", "cassandra.host" = "localhost", "cassandra.port "= "9160")
> TBLPROPERTIES ("cassandra.ks.name" = "identification", "cassandra.ks.stratOptions"="'DC1':3", "cassandra.ks.strategy"="NetworkTopologyStrategy");
OK
Time taken: 3.58 seconds
hive> select * from identification.entity_lookup limit 10;
OK
Exception in thread "main" java.lang.InstantiationError: org.apache.hadoop.mapreduce.JobContext
at org.apache.hadoop.hive.cassandra.input.cql.HiveCqlInputFormat.getSplits(HiveCqlInputFormat.java:166)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:418)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:561)
at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:534)
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:137)
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1488)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:285)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
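The error does not occur because Cash cannot map the keyspace; it occurs because the database does not exist in Hive.
Just create the database in Hive: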
CREATE DATABASE identification;
That should make it work.
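A minimal sketch of that step as a script, assuming the `hive` CLI is on the PATH. The file name `create_identification.hql` and the `HQL_FILE` variable are illustrative and do not come from the original post. The script writes the HiveQL to a file and, if Hive is installed, runs it with DEBUG logging sent to the console, which is one way to get more detail when a statement fails.

```shell
#!/bin/sh
# Hypothetical file name, chosen for this example only.
HQL_FILE="${HQL_FILE:-create_identification.hql}"

# Per the answer above, the database must exist in Hive before the
# external table can be created inside it.
cat > "$HQL_FILE" <<'EOF'
CREATE DATABASE IF NOT EXISTS identification;
SHOW DATABASES;
EOF

# Run the script only when the hive CLI is installed.
# hive.root.logger=DEBUG,console prints detailed logs to the terminal.
if command -v hive >/dev/null 2>&1; then
  hive --hiveconf hive.root.logger=DEBUG,console -f "$HQL_FILE"
fi
```

Once the database shows up in the `SHOW DATABASES` output, the `CREATE EXTERNAL TABLE identification.entitylookup ...` statement from the question can be run in the same way.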