WSO2BAM REST stream input to BAM/Cassandra; can't get to the EVENT_KS data using hive query?

The background for this question is an article by Sachini Jayasekara @ WSO2 called Using Different Reporting Frameworks with WSO2 Business Activity Monitor. I am doing more or less exactly the same, except that I use the REST API to define a data stream and to push data into BAM, and then use Hive queries to get at the data. However, it seems that I have missed something, as the attribute data is not shown. Hence the question.

I am currently using the REST API, invoked from a Perl-based daemon, with the following stream definition and payload:

{
  "name":"currentcostRealtime2.stream",
  "version": "1.0.6",
  "nickName": "Currentcost Realtime",
  "description": "This is the Currentcost realtime stream",
  "payloadData":[
    {
      "name":"sensor",
      "type":"INT"
    },
    {
      "name":"temp",
      "type":"FLOAT"
    },
    {
      "name":"timestamp",
      "type":"STRING"
    },
    {
      "name":"watt",
      "type":"INT"
    }
  ]
}

... and the payload template:

[
 {
   "payloadData" : [SENSOR, TEMP, "TIMESTAMP", WATT] ,
 }
]

I should note that the placeholders in the payload are replaced by string substitution before it is submitted; e.g. the actual payload that is sent looks like:

[
 {
   "payloadData" : [1, 18.7, "2014-06-15 16:15:56", 1] ,
 }
]
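
The placeholder substitution described above could be sketched roughly as follows. This is only an illustration in Python rather than the Perl daemon actually used; the function name `build_payload` is hypothetical, and the field order simply follows the stream definition. Serialising a list with `json.dumps` instead of editing a template string yields strict JSON, with no trailing comma after the array.

```python
import json


def build_payload(sensor, temp, timestamp, watt):
    """Return the JSON body for one event (hypothetical helper).

    Values are ordered as in the stream definition:
    sensor (INT), temp (FLOAT), timestamp (STRING), watt (INT).
    """
    event = {"payloadData": [int(sensor), float(temp), str(timestamp), int(watt)]}
    # The REST payload shown in the question is a JSON array of event objects.
    return json.dumps([event])


print(build_payload(1, 18.7, "2014-06-15 16:15:56", 1))
```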

The REST requests execute with no apparent problem, but I now have an issue with the Hive query in BAM, which returns rows but not the values. E.g. executing the following Hive query:

CREATE TABLE IF NOT EXISTS CurrentCostDataTemp ( sensor INT, temp FLOAT, ts TIMESTAMP, watt INT ) 
STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
WITH SERDEPROPERTIES ( "cassandra.host" = "127.0.0.1",
    "cassandra.port" = "9160",
    "cassandra.ks.name" = "EVENT_KS",
    "cassandra.ks.username" = "admin",
    "cassandra.ks.password" = "admin",
    "cassandra.cf.name" = "currentcostRealtime2_stream",
    "cassandra.columns.mapping" = "payload_sensor, payload_temp, payload_timestamp, payload_watt" );

select * from CurrentCostDataTemp;                                  

... but this gives only the following (see the output below), i.e. NO attribute-level data is shown. However, there evidently are entries in EVENT_KS, given that it outputs 4 rows. So the question is: how do I reference the data to extract the values, or is there something else going on here that I am not aware of?

key sensor  temp    ts  watt
1402816273765::192.168.1.106::9443::52              
1402815283659::192.168.1.106::9443::51              
1402815238323::192.168.1.106::9443::49              
1402815280532::192.168.1.106::9443::50              

I have verified that the data is in Cassandra by checking with cqlsh:

cqlsh:EVENT_KS> select * from "currentcostRealtime_stream";

 key                                    | Description                             | Name                       | Nick_Name            | StreamId                         | Timestamp     | Version | meta_ipAdd | payload_sensor | payload_temp | payload_timestamp   | payload_watt
----------------------------------------+-----------------------------------------+----------------------------+----------------------+----------------------------------+---------------+---------+------------+----------------+--------------+---------------------+--------------
 1402815283659::192.168.1.106::9443::51 | This is the Currentcost realtime stream | currentcostRealtime.stream | Currentcost Realtime | currentcostRealtime.stream:1.0.5 | 1402815283659 |   1.0.5 |       null |              1 |         18.7 | 2014-06-15 14:54:43 |            1
 1402815238323::192.168.1.106::9443::49 | This is the Currentcost realtime stream | currentcostRealtime.stream | Currentcost Realtime | currentcostRealtime.stream:1.0.5 | 1402815238323 |   1.0.5 |       null |              1 |         18.7 | 2014-06-15 14:53:58 |            1
 1402815280532::192.168.1.106::9443::50 | This is the Currentcost realtime stream | currentcostRealtime.stream | Currentcost Realtime | currentcostRealtime.stream:1.0.5 | 1402815280532 |   1.0.5 |       null |              1 |         18.7 | 2014-06-15 14:54:40 |            1
 1402816273765::192.168.1.106::9443::52 | This is the Currentcost realtime stream | currentcostRealtime.stream | Currentcost Realtime | currentcostRealtime.stream:1.0.5 | 1402816273765 |   1.0.5 |       null |              1 |         18.7 | 2014-06-15 15:11:13 |            1

(4 rows)

cqlsh:EVENT_KS>

Most likely this is only a minor issue that I have overlooked, but it would be great if someone who has seen this before could respond.

When I add a remote table definition pointing to an external MySQL DB, the tables and all are created. The problem seems to be getting at the attribute data in the EVENT_KS table itself, and having it read and accessed through the Hive script.

Thanks in advance!

/Jorgen

[UPDATE - Thursday 19th - SOLVED] Got it working with a few hints from the answers to this question. The following code works fine now, which is great. Thanks a lot for taking the time to respond.

drop table CurrentCostDataTemp10;
drop table CurrentCostDataTemp_Summary10;

CREATE EXTERNAL TABLE IF NOT EXISTS CurrentCostDataTemp10 ( messageRowID STRING, payload_sensor INT, messageTimestamp BIGINT, payload_temp FLOAT, payload_timestamp BIGINT, payload_timestampmysql STRING, payload_watt INT ) 
STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
WITH SERDEPROPERTIES ( "cassandra.host" = "127.0.0.1",
  "cassandra.port" = "9160",
  "cassandra.ks.name" = "EVENT_KS",
  "cassandra.ks.username" = "<USER>",
  "cassandra.ks.password" = "<PASSWORD>",
  "cassandra.cf.name" = "currentcostsimple5_stream",
  "cassandra.columns.mapping" = ":key, payload_sensor, Timestamp, payload_temp, payload_timestamp, payload_timestampmysql, payload_watt" );

CREATE EXTERNAL TABLE IF NOT EXISTS CurrentCostDataTemp_Summary10 ( messageRowID STRING, payload_sensor INT, messageTimestamp BIGINT, payload_temp FLOAT, payload_timestamp BIGINT, payload_timestampmysql STRING, payload_watt INT ) 
STORED BY 'org.wso2.carbon.hadoop.hive.jdbc.storage.JDBCStorageHandler'
TBLPROPERTIES (
  'mapred.jdbc.driver.class' = 'com.mysql.jdbc.Driver',
  'mapred.jdbc.url' = 'jdbc:mysql://127.0.0.1:8889/currentcost' ,
  'mapred.jdbc.username' = '<USER>',
  'mapred.jdbc.password' = '<PASSWORD>',
  'hive.jdbc.update.on.duplicate'= 'true',
  'hive.jdbc.primary.key.fields' = 'messageRowID',
  'hive.jdbc.table.create.query' = 'CREATE TABLE CurrentCostDataTemp1 ( messageRowID VARCHAR(100) NOT NULL PRIMARY KEY, payload_sensor TINYINT(4), messageTimestamp BIGINT, payload_temp FLOAT, payload_timestamp BIGINT, payload_timestampmysql DATETIME, payload_watt INT ) ');

insert overwrite table CurrentCostDataTemp_Summary10 select messageRowID, payload_sensor, messageTimestamp, payload_temp, payload_timestamp, payload_timestampmysql, payload_watt FROM CurrentCostDataTemp10;
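
The working script carries the event time twice: payload_timestamp as a BIGINT and payload_timestampmysql as a string that ends up in a MySQL DATETIME column. How the sender fills these two fields is not shown here. One possible way, assuming the BIGINT is meant to hold epoch milliseconds and the timestamp string is read as UTC (both are assumptions), is sketched below in Python; `timestamp_fields` is a hypothetical helper name.

```python
import calendar
import time


def timestamp_fields(ts_text):
    """Return (epoch_millis, mysql_datetime) for a 'YYYY-MM-DD HH:MM:SS' string.

    Assumption: the string is interpreted as UTC. Adjust if the sensor
    reports local time.
    """
    parsed = time.strptime(ts_text, "%Y-%m-%d %H:%M:%S")
    # calendar.timegm treats the parsed struct_time as UTC and returns seconds.
    epoch_millis = calendar.timegm(parsed) * 1000
    # The same text is already in the format MySQL accepts for DATETIME.
    return epoch_millis, ts_text


print(timestamp_fields("2014-06-15 16:15:56"))
```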

Reference: Using Different Reporting Frameworks with WSO2 Business Activity Monitor, by Sachini Jayasekara.

Try changing the first line of the script as follows.

CREATE EXTERNAL TABLE IF NOT EXISTS CurrentCostDataTemp ( key STRING, sensor INT, temp FLOAT, ts TIMESTAMP, watt INT )

(Remove the key STRING part if it gives errors.)

Note: you may have to run DROP TABLE CurrentCostDataTemp before running the above, in case the table was already created by an earlier run.

I have amended your query as follows. Please try that.

CREATE external TABLE IF NOT EXISTS CurrentCostDataTemp ( key string, sensor INT, temp FLOAT, ts TIMESTAMP, watt INT ) 
STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
WITH SERDEPROPERTIES ( "cassandra.host" = "127.0.0.1",
    "cassandra.port" = "9160",
    "cassandra.ks.name" = "EVENT_KS",
    "cassandra.ks.username" = "admin",
    "cassandra.ks.password" = "admin",
    "cassandra.cf.name" = "currentcostRealtime2_stream",
    "cassandra.columns.mapping" = ":key,payload_sensor, payload_temp, payload_timestamp, payload_watt" );

select * from CurrentCostDataTemp;  
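
The amendment comes down to the column mapping: the Hive table gets a leading key column and the mapping gets a matching `:key` entry, so that, as far as I understand the storage handler, each Hive column lines up with one mapping entry in the same order. As a rough sketch (in Python, with a hypothetical helper `columns_mapping`; the `payload_` prefix is taken from the column names in the cqlsh output in the question), the mapping string can be generated from the stream's field names, which makes it harder to drop the key or misorder the entries:

```python
def columns_mapping(payload_fields, include_key=True):
    """Build the value for "cassandra.columns.mapping" (hypothetical helper)."""
    entries = [":key"] if include_key else []
    # BAM stores each payload attribute in a column named payload_<field>.
    entries += ["payload_" + name for name in payload_fields]
    return ", ".join(entries)


hive_columns = ["key", "sensor", "temp", "ts", "watt"]
mapping = columns_mapping(["sensor", "temp", "timestamp", "watt"])

# One mapping entry per Hive column, in the same order.
assert len(mapping.split(",")) == len(hive_columns)
print(mapping)  # :key, payload_sensor, payload_temp, payload_timestamp, payload_watt
```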
