How do I separate a SQL partition by category: “row_number() over (partition by…”?
I am working with some app events data and want to group the event sets of a specific action together so that I can grab the most recent event set. The customer (customer_id) starts the event set with 'step 1' (EventStep) and can go all the way through step 4, or can drop out at any step along the way. The event set can be triggered by a few actions (EventTrigger).
Goal: Grab all the steps of the most recent event set, and identify the date (based on Timestamp) and the EventTrigger.
There should be only 1 EventTrigger for each event set, but the way my code is written, it combines event steps from different EventTriggers (if the customer advanced further in previous attempts than in the most recent attempt). How do I ensure the event steps are grouped by the EventTrigger?
SELECT * FROM (
SELECT customer_id
, EventStep
, Timestamp
, EventTrigger
, ROW_NUMBER() OVER (PARTITION BY customer_id, EventStep ORDER BY Timestamp DESC) AS row_num
FROM xxx_table
) xxx
WHERE row_num = 1
Image 1:
Image 2:
The ID field is something I created that labels the events in the order that they happened, so that you can better visualize what I'm looking for.
I think you want:
SELECT xxx.*
FROM (SELECT xxx.*,
ROW_NUMBER() OVER (PARTITION BY customer_id, EventStep ORDER BY Timestamp DESC) AS seqnum
FROM xxx_table xxx
) xxx
WHERE seqnum = 1;
Nothing in your question suggests that aggregation is necessary.
EDIT:
Are you just looking for dense_rank()?
select xxx.*,
dense_rank() over (partition by customer_id order by Timestamp) as seqnum
from xxx_table xxx;
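To see what this numbering produces, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module. It assumes SQLite 3.25 or later (needed for window functions), and the rows are made-up sample data; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical sample data: one customer with two event sets.
rows = [
    (1, "Step 1", "2020-01-01 10:00:00", "A"),
    (1, "Step 2", "2020-01-01 10:05:00", "A"),
    (1, "Step 3", "2020-01-01 10:10:00", "A"),
    (1, "Step 1", "2020-01-02 09:00:00", "B"),
    (1, "Step 2", "2020-01-02 09:05:00", "B"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE xxx_table ("
    "customer_id INTEGER, EventStep TEXT, Timestamp TEXT, EventTrigger TEXT)"
)
conn.executemany("INSERT INTO xxx_table VALUES (?, ?, ?, ?)", rows)

# Number each customer's events in chronological order.
query = """
SELECT xxx.*,
       DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY Timestamp) AS seqnum
FROM xxx_table xxx
ORDER BY customer_id, seqnum
"""
ranked = conn.execute(query).fetchall()
for row in ranked:
    print(row)
```

With this data, seqnum simply counts the customer's events from oldest to newest, much like the ID column described in the question. It does not by itself say which event set a row belongs to.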
To paraphrase, you want the most recent events for each customer, stopping at the most recent Step 1?
SELECT
    xxx_table.*
FROM
    xxx_table
INNER JOIN
(
    -- Timestamp of the most recent 'Step 1' for each customer
    SELECT customer_id, MAX(timestamp) AS timestamp
    FROM xxx_table
    WHERE EventStep = 'Step 1'
    GROUP BY customer_id
)
    AS cust_endpoint
    ON  cust_endpoint.customer_id = xxx_table.customer_id
    -- Keep only events at or after that 'Step 1'
    AND cust_endpoint.timestamp <= xxx_table.timestamp
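Another way to keep the steps of different attempts apart is to give every event set its own number. The sketch below is only an illustration under stated assumptions: it assumes each event set begins with a 'Step 1' row, uses made-up sample rows, and runs in Python's built-in sqlite3 module (SQLite 3.25 or later for window functions). The names labeled, ranked, set_id and latest_set_id are hypothetical and not from the question.

```python
import sqlite3

# Hypothetical sample data.
rows = [
    # Customer 1, earlier attempt (trigger A) reaches Step 3.
    (1, "Step 1", "2020-01-01 10:00:00", "A"),
    (1, "Step 2", "2020-01-01 10:05:00", "A"),
    (1, "Step 3", "2020-01-01 10:10:00", "A"),
    # Customer 1, most recent attempt (trigger B) stops at Step 2.
    (1, "Step 1", "2020-01-02 09:00:00", "B"),
    (1, "Step 2", "2020-01-02 09:05:00", "B"),
    # Customer 2, a single attempt.
    (2, "Step 1", "2020-01-03 08:00:00", "A"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE xxx_table ("
    "customer_id INTEGER, EventStep TEXT, Timestamp TEXT, EventTrigger TEXT)"
)
conn.executemany("INSERT INTO xxx_table VALUES (?, ?, ?, ?)", rows)

query = """
WITH labeled AS (
    -- A running count of 'Step 1' rows numbers each customer's event sets.
    SELECT t.*,
           SUM(CASE WHEN EventStep = 'Step 1' THEN 1 ELSE 0 END)
               OVER (PARTITION BY customer_id ORDER BY Timestamp) AS set_id
    FROM xxx_table t
),
ranked AS (
    -- The highest set number per customer is the most recent event set.
    SELECT labeled.*,
           MAX(set_id) OVER (PARTITION BY customer_id) AS latest_set_id
    FROM labeled
)
SELECT customer_id, EventStep, Timestamp, EventTrigger
FROM ranked
WHERE set_id = latest_set_id
ORDER BY customer_id, Timestamp
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```

For customer 1 this keeps only the two trigger B rows, so the Step 3 row from the earlier trigger A attempt no longer leaks into the result.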