I have a data set similar to this:
Customer_id  PART_N  PART_C            TXN_ID
B123         268888  7902/7900         159
B123         12839   82900/8900        1278
B869         12839   8203/890025/7902  17890
B290         268888  62820/12839       179018
I am not sure how to combine PART_N and PART_C and find count(distinct customer_id) for each part. The same part can appear in either PART_N or PART_C, like part number 12839.
I would like to get the following table using Teradata:
Part    COUNT(DISTINCT Customer_id)
268888  2
12839   3
7902    2
7900    1
82900   1
8900    1
8203    1
890025  1
62820   1
If it were just PART_N it would be straightforward, since only one part number is present per row. I am unsure how to combine every part number and find how many distinct customer IDs each one has. If it helps, I have the list of all distinct part numbers in one table, say table2.
I cannot test this code, so treat it as pseudocode and a sketch of the idea.
SELECT numbers, COUNT(DISTINCT Customer_id)
FROM
    (SELECT
        Customer_id,
        REGEXP_SPLIT_TO_TABLE(               -- B
            CONCAT(PART_N, '/', PART_C),     -- A
            '/'
        ) as numbers
    FROM your_table) s
GROUP BY numbers                             -- C
A: Concatenate both columns into one string, separated by the delimiter '/'
B: Split the string by the delimiter, giving one row per part
C: Group by part and count the distinct customers for each part
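To sanity-check the A/B/C idea outside the database, here is a minimal Python sketch that applies the same three steps to the four sample rows from the question. It is only an illustration of the logic, not Teradata code; the function name customers_per_part and the in-memory rows list are made up for this example.

```python
from collections import defaultdict

# Sample rows copied from the question: (customer_id, part_n, part_c, txn_id)
rows = [
    ("B123", 268888, "7902/7900", 159),
    ("B123", 12839, "82900/8900", 1278),
    ("B869", 12839, "8203/890025/7902", 17890),
    ("B290", 268888, "62820/12839", 179018),
]


def customers_per_part(rows, delimiter="/"):
    """Hypothetical helper: count distinct customers for every part number."""
    customers = defaultdict(set)
    for customer_id, part_n, part_c, _txn_id in rows:
        # A: concatenate both columns into one delimited string
        all_parts = f"{part_n}{delimiter}{part_c}"
        # B: split the string, giving one part per iteration
        for part in all_parts.split(delimiter):
            # A set keeps each customer only once per part
            customers[part.strip()].add(customer_id)
    # C: group by part and count the distinct customers
    return {part: len(ids) for part, ids in customers.items()}


counts = customers_per_part(rows)
for part, n in counts.items():
    print(part, n)
```

On the sample data this gives the counts listed in the question's expected table, for example 3 customers for part 12839. Collecting customer IDs in a set is what makes the count distinct; counting the split rows directly would count B123 twice if the same part appeared in two of that customer's transactions.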
http://www.teradatawiki.net/2014/05/regular-expression-functions.html
This is pretty ugly.
First let's split those delimited strings up, using strtok_split_to_table.
create volatile table vt_split as (
    select
        txn_id,
        token as part
    from table
        (strtok_split_to_table(your_table.txn_id, your_table.part_c, '/')
         returns (txn_id integer, tokennum integer, token varchar(10))) t
)
with data
primary index (txn_id)
on commit preserve rows;
That gives you one row per part from part_c, each tagged with its txn_id. Then we can union that with the part_n values.
create volatile table vt_merged as (
    select * from vt_split
    UNION ALL
    select
        txn_id,
        cast(part_n as varchar(10)) as part
    from
        your_table)
with data
primary index (txn_id)
on commit preserve rows;
Finally, we can join that back to your original table to get the counts of customer by part.
select
    vt_merged.part,
    count(distinct your_table.customer_id)
from
    vt_merged
    inner join your_table
        on vt_merged.txn_id = your_table.txn_id
group by 1;
This could probably be done a little more cleanly, but it should get you what you're looking for.
This is @S-Man's pseudocode as a working query:
WITH cte AS
 (
   SELECT Customer_id,
      Trim(PART_N) || '/' || PART_C AS all_parts
   FROM tab
 )
SELECT
   part, -- if part should be numeric: Cast(part AS INT)
   Count(DISTINCT Customer_id)
FROM TABLE (StrTok_Split_To_Table(cte.Customer_id, cte.all_parts, '/')
     RETURNS (Customer_id VARCHAR(10), tokennum INTEGER, part VARCHAR(30))) AS t
GROUP BY 1