
SQL to retrieve the names of users making transactions within a certain time

I am trying to create features for my ML work on grocery customer data.

The data contains the transactions that users make when buying groceries.

I am trying to find the names of the users who have made consecutive transactions within a 30-second time frame. This is important for building a profile of such users.

For example, if the data looks like this:

    User    Datetime              Amount
1   Mary    2020-11-30 10:10:20   24
2   Jacob   2020-11-30 12:12:12   43.2
3   Alice   2020-11-30 11:11:11   75.29
4   Mary    2020-11-30 10:10:45   34
5   Mary    2020-11-30 10:11:15   21
6   Alice   2020-11-30 11:11:41   100

The correct answer would be Alice, as only Alice had more than one transaction within the 30-second time frame.

Mary might appear to be a probable answer, but not all of her consecutive transactions had a 30-second gap: her gaps were 25 and 30 seconds. So the correct answer we need is Alice.

One method is to use lag() to get the time of the previous transaction. The following returns the transactions that are within 30 seconds of the same user's previous transaction:

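select t.*
from (select t.*,
             -- time of the same user's previous transaction (NULL for the first one)
             lag(datetime) over (partition by user order by datetime) as prev_datetime
      from t
     ) t
-- strictly less than 30 seconds since the previous transaction;
-- use >= instead of > to also keep a gap of exactly 30 seconds
where prev_datetime > datetime - interval '30 second';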

This syntax uses standard SQL; date/time functions vary among databases, so the exact syntax depends on the database you are using.

It is unclear how you want to summarize this to get Alice but not Mary.

If you need all consecutive transactions to be exactly 30 seconds apart, you can use:
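
Since the date arithmetic differs between databases, here is a minimal sketch of the same lag() idea in SQLite, run through Python's sqlite3 module so it can be tried as-is on the sample data from the question. The table name t and the column names follow the query above; the choice of SQLite, the strftime('%s', ...) conversion to epoch seconds, and the gap_seconds alias are assumptions made for illustration, not part of the original answer. Window functions need SQLite 3.25 or later.

```python
import sqlite3

# Sample rows from the question: (user, datetime, amount)
rows = [
    ("Mary", "2020-11-30 10:10:20", 24),
    ("Jacob", "2020-11-30 12:12:12", 43.2),
    ("Alice", "2020-11-30 11:11:11", 75.29),
    ("Mary", "2020-11-30 10:10:45", 34),
    ("Mary", "2020-11-30 10:11:15", 21),
    ("Alice", "2020-11-30 11:11:41", 100),
]

conn = sqlite3.connect(":memory:")
# "user" is quoted because it is a reserved word in several databases
conn.execute('CREATE TABLE t ("user" TEXT, datetime TEXT, amount REAL)')
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

# lag() fetches each user's previous transaction time; the outer query
# turns both timestamps into epoch seconds and subtracts them.
query = """
SELECT "user", datetime,
       CAST(strftime('%s', datetime) AS INTEGER)
       - CAST(strftime('%s', prev_datetime) AS INTEGER) AS gap_seconds
FROM (SELECT t.*,
             lag(datetime) OVER (PARTITION BY "user" ORDER BY datetime) AS prev_datetime
      FROM t) AS s
WHERE prev_datetime IS NOT NULL
ORDER BY "user", datetime
"""
gaps = conn.execute(query).fetchall()
for row in gaps:
    print(row)
```

On the sample data this lists one row per non-first transaction: a gap of 30 seconds for Alice, and gaps of 25 and 30 seconds for Mary. Jacob does not appear because he has only one transaction.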

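select user
from (select t.*,
             lag(datetime) over (partition by user order by datetime) as prev_datetime
      from t
     ) t
group by user
-- MySQL-style syntax: the comparison evaluates to 0 or 1, so the sum counts
-- the gaps that are not exactly 30 seconds; users with a single transaction
-- have only NULL gaps and are filtered out
having sum(prev_datetime <> datetime - interval 30 second) = 0;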
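
The query above appears to rely on MySQL-style syntax (summing a boolean, interval 30 second). As a hedged sketch of the same "every gap is exactly 30 seconds" check, here is a SQLite version run from Python's sqlite3 module against the sample data in the question. The table name t and the column names come from the query above; using SQLite and converting timestamps with strftime('%s', ...) are assumptions for illustration only. Window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "user" is quoted because it is a reserved word in several databases
conn.execute('CREATE TABLE t ("user" TEXT, datetime TEXT, amount REAL)')
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("Mary", "2020-11-30 10:10:20", 24),
        ("Jacob", "2020-11-30 12:12:12", 43.2),
        ("Alice", "2020-11-30 11:11:11", 75.29),
        ("Mary", "2020-11-30 10:10:45", 34),
        ("Mary", "2020-11-30 10:11:15", 21),
        ("Alice", "2020-11-30 11:11:41", 100),
    ],
)

# For each user, count the gaps that differ from 30 seconds and keep the
# users where that count is 0. A user's first transaction has a NULL gap,
# which sum() ignores; a user with only NULL gaps yields NULL, not 0,
# and is therefore dropped by the HAVING clause.
query = """
SELECT "user"
FROM (SELECT t.*,
             lag(datetime) OVER (PARTITION BY "user" ORDER BY datetime) AS prev_datetime
      FROM t) AS s
GROUP BY "user"
HAVING sum((CAST(strftime('%s', datetime) AS INTEGER)
            - CAST(strftime('%s', prev_datetime) AS INTEGER)) <> 30) = 0
"""
result = conn.execute(query).fetchall()
print(result)
```

With the sample rows only Alice is returned: Mary is excluded by her 25-second gap, and Jacob by having no previous transaction to compare against.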

