I have two tables containing user IDs and emails. A user can change their email but keep the same user ID (rows 2 and 5 of the USER_PLAYS table). A user can also create a new user ID with an existing email (row 3 of the USER_PLAYS table). I want to sum the total plays for such a user into a single row. There is also a second table with sales values, from which I would like the total sales for the same user. I'm thinking of creating a unique ID that is shared across all of these rows, but I'm not sure how to implement it.
Note that I've only shown one actual person here, but the real tables contain many more distinct people.
I am using Snowflake as that is where the data is.
USER_PLAYS table:
|ROW|USER_ID | EMAIL |VIDEO_PLAYS|
|---|-----------|--------------------|-----------|
|1 | 1 | ab@gmail.com | 2 |
|2 | 1 | cd@gmail.com | 3 |
|3 | 3 | cd@gmail.com | 4 |
|4 | 4 | cd@gmail.com | 2 |
|5 | 4 | ef@gmail.com | 3 |
Sales Table:
|NET_SALE | EMAIL |
|-----------|-------------|
|5 | cd@gmail.com|
|10 | ef@gmail.com|
Desired Output:
|UNIQUE_ID | PLAYS |NET_SALE|
|-----------|-------|--------|
| 1 | 14 | 15 |
This may have room for further efficiencies, but I think this process gets you a unique identifier across your user_id / email combinations.
For this process I added a column called COMMON_ID to the user_plays table. Joining user_plays to the sales table on email then lets you aggregate the sales by COMMON_ID (see results below):
-- Create the test case
create
or replace table user_plays (
user_id integer not null, -- integer so min(user_id) and common_id = user_id compare numerically
email varchar not null,
video_plays integer not null,
common_id integer default NULL
);
insert into
user_plays
values
(1, 'ab@gmail.com', 2, null),
(1, 'cd@gmail.com', 3, null),
(3, 'cd@gmail.com', 4, null),
(4, 'cd@gmail.com', 2, null),
(4, 'ef@gmail.com', 3, null),
(5, 'jd@gmail.com', 10, null),
(6, 'lk@gmail.com', 1, null),
(6, 'zz@gmail.com', 2, null),
(7, 'zz@gmail.com', 3, null);
create
or replace table sales (net_sale integer, email varchar);
insert into
sales
values
(5, 'cd@gmail.com'),(10, 'ef@gmail.com');
-- Test run
-- Create view for User IDs with multiple emails
create
or replace view grp1 as (
select
user_id,
count(*) as mult
from
user_plays
group by
user_id
having
count(*) > 1
);
-- Create view for Emails with multiple user IDs
create
or replace view grp2 as (
select
email,
count(*) as mult
from
user_plays x
group by
email
having
count(*) > 1
);
EXECUTE IMMEDIATE $$
declare new_common_id integer;
counter integer;
Begin
counter := 0;
new_common_id := 0;
-- Baseline common_id to NULL
update
user_plays
set
common_id = NULL;
-- Mark all unique entries with a common_id = user_id
update
user_plays
set
common_id = user_id
where
email not in (
select
distinct email
from
grp2
)
and user_id not in (
select
distinct user_id
from
grp1
);
-- Repeatedly take the lowest unassigned user_id and give it as common_id to every unassigned row that shares that user_id or one of its emails
LOOP
select count(*) into :counter
from
user_plays
where
common_id is null;
if (counter = 0) then BREAK;
end if;
select
min(user_id) into :new_common_id
from
user_plays
where
common_id is null;
-- first pass
update
user_plays
set
common_id = :new_common_id
where
common_id is null and
(user_id = :new_common_id
or email in (
select
email
from
user_plays
where
user_id = :new_common_id
));
END LOOP;
-- Update the chain where an account using a changed email created a new user_id to match up with prior group.
UPDATE user_plays vp
set vp.common_id = vp2.common_id
from (select user_id, min(common_id) as common_id from user_plays group by user_id) vp2
where vp.user_id = vp2.user_id;
END;
$$;
-- See results
select
*
from
user_plays;
-- Total plays and sales per common_id (inner joins: only IDs with sales are returned)
select
vps.common_id,
vps.video_plays,
sum(s.net_sale) as net_sale
from
(
select
common_id,
sum(video_plays) as video_plays
from
user_plays
group by
common_id
) vps
join (
-- One row per email, so repeated sales rows are neither collapsed nor multiplied
select
email,
max(common_id) as common_id
from
user_plays
group by
email
) e on e.common_id = vps.common_id
join sales s on s.email = e.email
group by
vps.common_id,
vps.video_plays;
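Common ID assignment results:
|USER_ID| EMAIL |VIDEO_PLAYS|COMMON_ID|
|-------|--------------|-----------|---------|
| 1 | ab@gmail.com | 2 | 1 |
| 1 | cd@gmail.com | 3 | 1 |
| 3 | cd@gmail.com | 4 | 1 |
| 4 | cd@gmail.com | 2 | 1 |
| 4 | ef@gmail.com | 3 | 1 |
| 5 | jd@gmail.com | 10 | 5 |
| 6 | lk@gmail.com | 1 | 6 |
| 6 | zz@gmail.com | 2 | 6 |
| 7 | zz@gmail.com | 3 | 6 |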
Final results:
|COMMON_ID|VIDEO_PLAYS|NET_SALE|
|---------|-----------|--------|
| 1 | 14 | 15 |
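If you want to sanity-check the COMMON_ID assignment outside Snowflake, the grouping is essentially a connected-components problem: user IDs and emails are nodes, and each user_plays row links one user ID to one email. Below is a minimal Python sketch of that idea using union-find. It is not the method used in the SQL above; the function name `totals` and the in-memory tuple layout are my own assumptions for illustration. It may also be a useful cross-check because, as far as I can tell, the loop above follows links only one hop per pass, so longer chains of changed emails and new user IDs could end up split across two COMMON_IDs.

```python
from collections import defaultdict


def totals(plays, sales):
    """Return {common_id: (video_plays, net_sale)}.

    plays: iterable of (user_id, email, video_plays)  -- assumed layout
    sales: iterable of (net_sale, email)              -- assumed layout
    common_id is the smallest user_id in each connected group.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    # Each row links a user node to an email node.
    for user_id, email, _ in plays:
        root_user, root_email = find(("user", user_id)), find(("email", email))
        if root_user != root_email:
            parent[root_user] = root_email

    # Smallest user_id per group becomes the common_id.
    common = {}
    for user_id, _, _ in plays:
        root = find(("user", user_id))
        common[root] = min(common.get(root, user_id), user_id)

    result = defaultdict(lambda: [0, 0])
    for user_id, _, video_plays in plays:
        result[common[find(("user", user_id))]][0] += video_plays
    for net_sale, email in sales:
        key = ("email", email)
        if key in parent:  # skip sales whose email never appears in plays
            result[common[find(key)]][1] += net_sale
    return {cid: tuple(values) for cid, values in result.items()}


# Same test rows as the SQL above.
PLAYS = [
    (1, "ab@gmail.com", 2), (1, "cd@gmail.com", 3), (3, "cd@gmail.com", 4),
    (4, "cd@gmail.com", 2), (4, "ef@gmail.com", 3), (5, "jd@gmail.com", 10),
    (6, "lk@gmail.com", 1), (6, "zz@gmail.com", 2), (7, "zz@gmail.com", 3),
]
SALES = [(5, "cd@gmail.com"), (10, "ef@gmail.com")]

if __name__ == "__main__":
    for common_id, (video_plays, net_sale) in sorted(totals(PLAYS, SALES).items()):
        print(common_id, video_plays, net_sale)
```

Unlike the final SQL query, this sketch also reports groups that have no sales (with a net sale of 0), which is a deliberate difference rather than something the original output shows.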