简体   繁体   中英

KSQL create stream from JSON fields with periods (`.` dot notation)

I have a json like {"destination.port":"443","network.packets":"4464","event.end":"2019-07-19T07:47:22.000Z","source.address":"1.2.2.3","message":"OK","server.address":"ip-1-2-2-3.ec2.internal","event.action":"ACCEPT","event.module":"S3bucket","source.port":"56448","network.protocol":"6","cloud.account.id":"512889038796","event.type":"AWS_VPC_log","organization.id":"DeloitteFusion","destination.address":"1.2.2.3","network.bytes":"178584","event.start":"2019-07-19T07:46:22.000Z","event.kind":"2","host.id":"eni-0c5e3a6282a912997","timestamp":"2019-07-19T07:51:52.584Z","srckey_val":"167772160_184549375","srckey_rev":"15072019_1541"}

Creating KSQL stream as create stream vpc_log ("destination.port" integer, "network.packets" integer, "event.end" varchar, "source.address" varchar, message varchar, "server.address" varchar, "event.action" varchar, "event.module" varchar, "source.port" integer, "network.protocol" integer, "cloud.account.id" bigint, "event.type" varchar, "organization.id" varchar, "destination.address" varchar, "network.bytes" integer, "event.start" varchar, "event.kind" integer, "host.id" varchar, timestamp varchar, srckey_val varchar, srckey_rev varchar) WITH (KAFKA_TOPIC='client_data_parsed', VALUE_FORMAT='JSON');

on running select * from vpc_log; threw the following error Caused by: Cannot create field because of field name duplication address .

So have modified the stream query as create stream vpc_log ("destination.port2" integer, "network.packets" integer, "event.end" varchar, "source.addres" varchar, message varchar, "server.adress" varchar, "event.action" varchar, "event.module" varchar, "source.port1" integer, "network.protocol" integer, "cloud.account.id2" bigint, "event.type" varchar, "organization.id" varchar, "destination.adres" varchar, "network.bytes" integer, "event.start" varchar, "event.kind" integer, "host.id1" varchar, timestamp varchar, srckey_val varchar, srckey_rev varchar) WITH (KAFKA_TOPIC='client_data_parsed', VALUE_FORMAT='JSON');

the output on select * is 1563522879847 | null | null | null | null | null | OK | null | null | null | null | null | null | null | null | null | null | null | null | null | 2019-07-19T07:51:52.584Z | 167772160_184549375 | 15072019_1541 1563522879847 | null | null | null | null | null | OK | null | null | null | null | null | null | null | null | null | null | null | null | null | 2019-07-19T07:51:52.584Z | 167772160_184549375 | 15072019_1541

Values for all the dotted(.) fields/keys is null . Im referring to link with no solution provided. Help me understand.

Tried the escape the dot as:

create stream vpc_log ("DESTINATION\.PORT1" INTEGER, "NETWORK\.PACKETS" INTEGER, "EVENT\.END" VARCHAR, "SOURCE\.ADDRESS1" VARCHAR, MESSAGE VARCHAR, "SERVER\.ADDRESS2" VARCHAR, "EVENT\.ACTION" VARCHAR, "EVENT\.MODULE" VARCHAR, "SOURCE\.PORT2" INTEGER,"NETWORK\.PROTOCOL" INTEGER,"CLOUD\.ACCOUNT\.ID" BIGINT,"EVENT\.TYPE" VARCHAR,"ORGANIZATION\.ID1" VARCHAR,"DESTINATION\.ADDRESS3" VARCHAR,"NETWORK\.BYTES" INTEGER,"EVENT\.START" VARCHAR,"EVENT\.KIND" INTEGER,"HOST\.ID2" VARCHAR,TIMESTAMP VARCHAR,SRCKEY_VAL VARCHAR,SRCKEY_REV VARCHAR)WITH (KAFKA_TOPIC='client_data_parsed',VALUE_FORMAT='JSON');

Still the same issue.

I could go with the query with no dot in the key as a quick work around. But, would like to know what is it with KSQL to have dot?

I've got this answered in slack community by ksql developers. Might help someone. KSQL doesn't have official support for the same but a workaround is to escape the period. With ksql v5.3.0 installed from here .

create stream vpc_log ("DESTINATION\.PORT" INTEGER,"NETWORK\.PACKETS" INTEGER,"EVENT\.END" VARCHAR,"SOURCE\.ADDRESS" VARCHAR,MESSAGE VARCHAR,"SERVER\.ADDRESS" VARCHAR,"EVENT\.ACTION" VARCHAR,"EVENT\.MODULE" VARCHAR,"SOURCE\.PORT" INTEGER,"NETWORK\.PROTOCOL" INTEGER,"CLOUD\.ACCOUNT\.ID" BIGINT,"EVENT\.TYPE" VARCHAR,"ORGANIZATION\.ID" VARCHAR,"DESTINATION\.ADDRESS" VARCHAR,"NETWORK\.BYTES" INTEGER,"EVENT\.START" VARCHAR,"EVENT\.KIND" INTEGER,"HOST\.ID" VARCHAR,TIMESTAMP VARCHAR,SRCKEY_VAL VARCHAR,SRCKEY_REV VARCHAR) WITH (KAFKA_TOPIC='client_data_parsed',

VALUE_FORMAT='JSON');

Thanks Robin Moffatt.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM