
KSQL Event Merging - Combining events from a single stream based on timestamp

I'm trying to combine multiple events from a single input stream into a single output event, grouped by timestamp, using KSQL. I would also like the output event to contain an average of the input values, although this isn't strictly necessary and is more of a nice-to-have.

Input Stream: Temperature

event1: {location: "hallway", value: 23, property_Id: "123", timestamp: "1551645625878"} 
event2: {location: "bedroom", value: 21, property_Id: "123", timestamp: "1551645625878"}
event3: {location: "kitchen", value: 20, property_Id: "123", timestamp: "1551645625878"}
event4: {location: "hallway", value: 19, property_Id: "123", timestamp: "9991645925878"} 
event5: {location: "bedroom", value: 18, property_Id: "123", timestamp: "9991645925878"}
event6: {location: "kitchen", value: 18, property_Id: "123", timestamp: "9991645925878"}

(desired) Output Stream:

event1:
{
    "property_id": "123",
    "timestamp": "1551645625878",
    "average_temperature": 21,   
    "temperature": [
        {
            "location": "hallway",
            "value": 23
        },
        {
            "location": "bedroom",
            "value": 21
        },
        {
            "location": "kitchen",
            "value": 20
        }
    ]
}

event2:
{
    "property_id": "123",
    "timestamp": "9991645925878",
    "average_temperature": 18,   
    "temperature": [
        {
            "location": "hallway",
            "value": 19
        },
        {
            "location": "bedroom",
            "value": 18
        },
        {
            "location": "kitchen",
            "value": 18
        }
    ]
}

As far as I can tell, this just isn't possible using KSQL. Can anyone confirm?

Correct, you cannot do this in KSQL currently. As of v5.1 / March 2019 KSQL can read, but not build, nested objects: https://github.com/confluentinc/ksql/issues/2147 (please upvote/comment if you need this)

You could do the average calculation though with something like:

SELECT timestamp, SUM(value)/COUNT(*) AS avg_temp \
  FROM input_stream \
  GROUP BY timestamp;
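
To make the target shape concrete, here is a minimal sketch of the grouping the question describes, written in plain Python rather than KSQL. It only illustrates the transformation; the function name `merge_events` is hypothetical, the events are the six from the question, and the integer division is an assumption chosen to reproduce the whole-number averages in the desired output.

```python
from collections import defaultdict


def merge_events(events):
    """Group flat readings by (property_Id, timestamp) into nested events."""
    groups = defaultdict(list)
    for event in events:
        groups[(event["property_Id"], event["timestamp"])].append(event)

    merged = []
    # Dicts keep insertion order, so groups come out in first-seen order.
    for (property_id, timestamp), readings in groups.items():
        values = [r["value"] for r in readings]
        merged.append({
            "property_id": property_id,
            "timestamp": timestamp,
            # Integer division (assumption) to match the averages in the question.
            "average_temperature": sum(values) // len(values),
            "temperature": [
                {"location": r["location"], "value": r["value"]}
                for r in readings
            ],
        })
    return merged


events = [
    {"location": "hallway", "value": 23, "property_Id": "123", "timestamp": "1551645625878"},
    {"location": "bedroom", "value": 21, "property_Id": "123", "timestamp": "1551645625878"},
    {"location": "kitchen", "value": 20, "property_Id": "123", "timestamp": "1551645625878"},
    {"location": "hallway", "value": 19, "property_Id": "123", "timestamp": "9991645925878"},
    {"location": "bedroom", "value": 18, "property_Id": "123", "timestamp": "9991645925878"},
    {"location": "kitchen", "value": 18, "property_Id": "123", "timestamp": "9991645925878"},
]

merged = merge_events(events)
```

Here `merged` holds two nested events, one per timestamp. Unlike a stream processor, this sketch sees all events at once and does not deal with windowing or late arrivals.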
