
Event Data Aggregation Using Kafka Streaming

public class UserEvent {
    int userId;
    long loginTime;
    long jobId;
    long jobAttachTime;
    long jobdetachTime;
    long workTime;
    long logoutTime;
    long activeTime;
    EventType eventType;
}

I have an application where events are sent to a Kafka topic based on user actions such as login, job attach, job detach and logout. Each event carries some information in a UserEvent object along with userId and eventType; for example, a Login event has loginTime, a Job Attach event has jobId and jobAttachTime, and a Logout event has logoutTime. My requirement is to aggregate the information from all these events into one object once the Logout event for a user has been received, so that after logout the UserEvent object holds loginTime, logoutTime, the calculated workTime, activeTime, and so on. How can this be achieved using Kafka KStreams and/or KTables?

To aggregate the UserEvents, you need a key (for example, a session ID) that is common to all events of one session. Let's say a sessionId is attached to each user event: it is unique per session but the same for all user events that occurred during that session.

This can be achieved with groupBy().aggregate() as follows (assuming you have a session-ID-like attribute that is unique enough to serve as the key):

// Assumes userEvents is a KStream<?, UserEvent>, that UserEvent exposes
// getSessionId() plus the usual getters/setters, and that default serdes are configured.
KTable<String, UserEvent> userEventSummary = userEvents
        // Re-key every event by its session so one session folds into one record.
        .groupBy((key, event) -> event.getSessionId())
        .aggregate(
                UserEvent::new, // initializer: empty summary for a new session
                (sessionId, userEvent, summary) -> {
                    if (userEvent.getEventType() == EventType.LOGIN) {
                        summary.setLoginTime(userEvent.getLoginTime());
                    } else if (userEvent.getEventType() == EventType.LOGOUT) {
                        summary.setLogoutTime(userEvent.getLogoutTime());
                        long workTime = Math.abs(userEvent.getLogoutTime() - summary.getLoginTime());
                        summary.setWorkTime(workTime);
                        summary.setActiveTime(workTime);
                    }
                    return summary;
                });

// If the default value of logoutTime is 0, drop the summaries that have no logout time yet.
KTable<String, UserEvent> loggedOutEventSummary =
        userEventSummary.filter((sessionId, summary) -> summary.getLogoutTime() != 0);

This returns the aggregated state for each session, restricted to the sessions for which a Logout event has been received.
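The fold that runs inside aggregate() can be tried out without a Kafka cluster. The following is a minimal plain-Java sketch of it, not the answer's exact code: the class name SessionAggregation, the trimmed-down UserEvent with public fields, the factory method of(), and the enum constants JOB_ATTACH and JOB_DETACH are all hypothetical. It also assumes that activeTime is the sum of the attach-to-detach intervals, which the question does not define.

```java
import java.util.List;

/** Plain-Java sketch of the per-session fold; no Kafka dependency. */
final class SessionAggregation {

    enum EventType { LOGIN, JOB_ATTACH, JOB_DETACH, LOGOUT }

    /** Trimmed-down stand-in for the question's UserEvent (public fields for brevity). */
    static final class UserEvent {
        EventType eventType;
        long loginTime, jobAttachTime, jobDetachTime, logoutTime, workTime, activeTime;

        /** Hypothetical helper: builds an event whose type-specific timestamp is set. */
        static UserEvent of(EventType type, long timestamp) {
            UserEvent e = new UserEvent();
            e.eventType = type;
            switch (type) {
                case LOGIN:      e.loginTime = timestamp; break;
                case JOB_ATTACH: e.jobAttachTime = timestamp; break;
                case JOB_DETACH: e.jobDetachTime = timestamp; break;
                case LOGOUT:     e.logoutTime = timestamp; break;
            }
            return e;
        }
    }

    /** Folds one incoming event into the running summary and returns the summary. */
    static UserEvent aggregate(UserEvent event, UserEvent summary) {
        switch (event.eventType) {
            case LOGIN:
                summary.loginTime = event.loginTime;
                break;
            case JOB_ATTACH:
                summary.jobAttachTime = event.jobAttachTime;
                break;
            case JOB_DETACH:
                // Assumption: active time accumulates over attach..detach intervals.
                summary.jobDetachTime = event.jobDetachTime;
                summary.activeTime += event.jobDetachTime - summary.jobAttachTime;
                break;
            case LOGOUT:
                summary.logoutTime = event.logoutTime;
                summary.workTime = event.logoutTime - summary.loginTime;
                break;
        }
        summary.eventType = event.eventType;
        return summary;
    }

    public static void main(String[] args) {
        List<UserEvent> session = List.of(
                UserEvent.of(EventType.LOGIN, 1_000L),
                UserEvent.of(EventType.JOB_ATTACH, 2_000L),
                UserEvent.of(EventType.JOB_DETACH, 5_000L),
                UserEvent.of(EventType.LOGOUT, 9_000L));
        UserEvent summary = new UserEvent();
        for (UserEvent e : session) {
            summary = aggregate(e, summary);
        }
        // prints: workTime=8000, activeTime=3000
        System.out.println("workTime=" + summary.workTime + ", activeTime=" + summary.activeTime);
    }
}
```

The aggregate method takes the incoming event first and the running summary second, which mirrors the (value, aggregate) part of the callback passed to aggregate(), so the body could be moved into a topology largely unchanged.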

final KStream<Integer, UserEvent> kStream = builder.stream("test-topic");
kStream.groupByKey().aggregate(() -> new UserEvent(),
                (Integer userId, UserEvent userEvent, UserEvent userEventSummary) -> {
                    if (userEvent.getEventType().equals(EventType.LOGIN)) {
                        // Event Processing logic for Login event
                    } else if (userEvent.getEventType().equals(EventType.LOGOUT)) {
                        // Event Processing logic for Logout event
                    }
                    return userEventSummary;
                });
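To make the mechanics of groupByKey().aggregate() concrete, here is a rough in-memory imitation in plain Java. It is only a sketch of the semantics, not Kafka Streams itself: KeyedAggregateDemo, RunningAggregate and the Aggregator interface are hypothetical names, and a HashMap stands in for the state store. The initializer supplies the starting aggregate the first time a key is seen, and every incoming record produces an updated aggregate for its key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

final class KeyedAggregateDemo {

    /** Same shape as a (key, value, aggregate) -> aggregate callback. */
    interface Aggregator<K, V, A> {
        A apply(K key, V value, A aggregate);
    }

    /** In-memory stand-in for a state store plus its stream of updates. */
    static final class RunningAggregate<K, V, A> {
        private final Supplier<A> initializer;
        private final Aggregator<K, V, A> aggregator;
        private final Map<K, A> store = new HashMap<>();
        private final List<String> updates = new ArrayList<>();

        RunningAggregate(Supplier<A> initializer, Aggregator<K, V, A> aggregator) {
            this.initializer = initializer;
            this.aggregator = aggregator;
        }

        /** Applies one record: look up (or initialize) the aggregate, fold, store, emit. */
        void process(K key, V value) {
            A current = store.containsKey(key) ? store.get(key) : initializer.get();
            A next = aggregator.apply(key, value, current);
            store.put(key, next);
            updates.add(key + "=" + next);
        }

        A get(K key) {
            return store.get(key);
        }

        List<String> updates() {
            return updates;
        }
    }

    public static void main(String[] args) {
        // Count events per user id as the simplest possible aggregate.
        RunningAggregate<Integer, String, Integer> counts =
                new RunningAggregate<>(() -> 0, (userId, event, count) -> count + 1);
        counts.process(7, "LOGIN");
        counts.process(8, "LOGIN");
        counts.process(7, "LOGOUT");
        System.out.println(counts.updates()); // prints: [7=1, 8=1, 7=2]
    }
}
```

Because an update is produced per record rather than once per session, a downstream consumer sees intermediate summaries unless they are filtered out, for example by checking that the logout time has been set. In a real topology, record caching may coalesce some of these intermediate updates.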
