
Group log entries if their time spans overlap or are less than 30 minutes apart?

I have log entries for a single user in the following format:

[unique id],[start time],[end time]

So, given the following example entries:

1,1100,1200
2,1030,1130
3,1420,1500
4,1519,1700

I want to find sessions, i.e., group the log entries into "sessions". The conditions that determine a session are:

  1. If the time spans of two entries overlap, they belong to the same session.
  2. If they do not overlap but the gap between them is less than 30 minutes, they also belong to the same session.

Example: the output should look like this:

Session 1: 1, 2
Session 2: 3, 4

The logic I am thinking of is:

  • Parse each string and load it into a "LogEntries" object.
  • Sort the "entries" collection by "startTime". My "LogEntries" class implements the "Comparable" interface.
  • Iterate over the "entries" collection and build the required output. The output will be a list of strings, where each string is a comma-separated list of entry ids.

I came up with the code below, but I am confused about how to implement the third step above.

  private static List<String> groupSessions(List<String> inputs) {
    List<String> output = new ArrayList<>();
    List<LogEntries> entries = new ArrayList<>();
    for (String input : inputs) {
      String[] arr = input.split(",");
      LogEntries entry =
          new LogEntries(Integer.parseInt(arr[0]), Integer.parseInt(arr[1]),
              Integer.parseInt(arr[2]));
      entries.add(entry);
    }

    // sort by startTime
    Collections.sort(entries);

    // now iterate over the entries list - this is where I am confused
    for (int i = 0; i < entries.size(); i++) {
      // do some stuff
    }

    return output;
  }
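
The "LogEntries" class itself is not shown in the question. As a rough sketch only, it might look something like the following; the field names and the HHMM interpretation of the times are assumptions, chosen so that the constructor call and `Collections.sort(entries)` above would compile:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical shape of the LogEntries class; the question does not show it.
class LogEntries implements Comparable<LogEntries> {
  final int id;
  final int startTime; // assumed to be HHMM, e.g. 1030 for 10:30
  final int endTime;   // assumed to be HHMM

  LogEntries(int id, int startTime, int endTime) {
    this.id = id;
    this.startTime = startTime;
    this.endTime = endTime;
  }

  // Natural ordering by start time, so Collections.sort(entries) sorts by startTime.
  @Override
  public int compareTo(LogEntries other) {
    return Integer.compare(this.startTime, other.startTime);
  }
}
```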

Some thoughts:

  • You are representing your timestamps as int/Integer values. That allows for simple sorting, but it makes later computations harder (like getting the delta between two timestamps). For example, 1100 - 1045 is 55 as plain integers, while the actual gap is 15 minutes. You could consider creating a distinct class to represent these hour:minute values.
  • To solve your task, start by doing it on a piece of paper. Take your input example and sort that list by start time.
  • Looking at the sorted entries, take the first one. Obviously, that must be the beginning of a session. Now look at the end time of that first entry and the start time of the subsequent entry. Overlap? Then session one continues to the later of the two end times. No overlap? Then compute "start time (second) - end time (first)". Smaller than 30 minutes? The session continues, so you compare the next entry against the current end time again. Otherwise, the session has ended, and the next entry is the beginning of the next session. Repeat.
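
On the first point, one possible way to get correct deltas is to convert the values to `java.time.LocalTime` and let `Duration` do the arithmetic. This is only a sketch: the class name `HhmmTimes` and its methods are made up for illustration, and it assumes every value is a valid HHMM time within a single day (a value such as 2400 would make `LocalTime.of` throw).

```java
import java.time.Duration;
import java.time.LocalTime;

// Hypothetical helper; assumes times are HHMM-encoded ints within one day.
class HhmmTimes {
  // Converts an HHMM-encoded int (e.g. 1519) into a LocalTime (15:19).
  static LocalTime parse(int hhmm) {
    return LocalTime.of(hhmm / 100, hhmm % 100);
  }

  // Minutes from one entry's end to the next entry's start.
  // The result is negative when the next entry starts before the end time (overlap).
  static long gapMinutes(int endHhmm, int nextStartHhmm) {
    return Duration.between(parse(endHhmm), parse(nextStartHhmm)).toMinutes();
  }
}
```

With this, `HhmmTimes.gapMinutes(1045, 1100)` gives 15 rather than the 55 that plain integer subtraction would produce.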

Long story short: you first have to develop the algorithm that tells you how to determine sessions. Then you turn that sequence of instructions into code. The key is to first conceptually dissect the big problem into its smallest parts and then see how to bring them together.
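
As an illustration of that last step, the procedure described above could be turned into code roughly as follows. This is a sketch under assumptions, not a definitive implementation: the class name `SessionGrouper` and the helper `toMinutes` are invented here, the times are assumed to be HHMM values within a single day, and entries are kept as plain `int[]` triples instead of "LogEntries" objects to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical class name; a sketch of the sort-then-sweep approach.
class SessionGrouper {
  static final int MAX_GAP_MINUTES = 30; // from condition 2 in the question

  // Converts an HHMM-encoded int (e.g. 1519) to minutes since midnight.
  static int toMinutes(int hhmm) {
    return (hhmm / 100) * 60 + hhmm % 100;
  }

  // Each input is "id,start,end"; each output string lists the ids of one session.
  static List<String> groupSessions(List<String> inputs) {
    List<int[]> entries = new ArrayList<>(); // {id, startMinutes, endMinutes}
    for (String input : inputs) {
      String[] arr = input.split(",");
      entries.add(new int[] {
          Integer.parseInt(arr[0].trim()),
          toMinutes(Integer.parseInt(arr[1].trim())),
          toMinutes(Integer.parseInt(arr[2].trim()))});
    }
    // Sort by start time so each entry only needs comparing with the running session.
    entries.sort(Comparator.comparingInt((int[] e) -> e[1]));

    List<String> output = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    int sessionEnd = 0; // latest end time seen in the current session
    for (int[] e : entries) {
      // A negative gap means overlap; both cases satisfy "gap < 30".
      boolean sameSession = current.length() > 0 && e[1] - sessionEnd < MAX_GAP_MINUTES;
      if (sameSession) {
        current.append(",").append(e[0]);
        sessionEnd = Math.max(sessionEnd, e[2]);
      } else {
        if (current.length() > 0) output.add(current.toString());
        current = new StringBuilder().append(e[0]);
        sessionEnd = e[2];
      }
    }
    if (current.length() > 0) output.add(current.toString());
    return output;
  }
}
```

For the four sample entries from the question, `groupSessions` returns the two strings "2,1" and "3,4"; ids appear in start-time order, so entry 2 (which starts at 1030) is listed before entry 1. Tracking the maximum end time, rather than the end time of the most recent entry, matters when a short entry lies entirely inside a longer one.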
