Select Top Row in a Group By without any numerical aggregation

Question

I am trying to build an item table in Access. I have item number, mfg name, and product description, and I would like my PK to be item number plus mfg name. However, in about 5,000 cases slight variations in the product description create duplicate rows for the same key. I would like Access to create the table by grouping all items on item number and mfg name and then keeping the first row of each group.

NOTE: the method I have attempted below uses MIN/MAX, but the suggested method does NOT have to. The ultimate goal is to select the top row, or any single row, for each group. So if I have two rows with part number 123 and two different product descriptions for that part number, I just want one of those descriptions to display. It does NOT matter which one.

Example:

Item_Num, MFG_Name, Product_Desc

414001000, AMBU INC., ASCOPE 3,LARGE,5.8/2.8 5EA/BX

414001000, AMBU INC., ASCOPE 3,LARGE,5.8/2.8 5EA/BX

06L21-01, ABBOTT LABORATORIES INC, 07K0040AT HAVAB-M CALB 4ML RX

06L21-01, ABBOTT LABORATORIES INC, ARCHITECT HAVAB-M CALB 4ML RX

Ideally, this is my result:

Item_Num, MFG_Name, Product_Desc

414001000, AMBU INC., ASCOPE 3,LARGE,5.8/2.8 5EA/BX

06L21-01, ABBOTT LABORATORIES INC, 07K0040AT HAVAB-M CALB 4ML RX

My idea so far is to compute the length of each description so there is something numeric to compare, then use MIN/MAX to pick one. My code so far is:

SELECT
    x.distributor_item_number,
    x.mfg_item_number,
    x.mfg_name,
    x.distributor_product_description,
    Min(x.[LENGTH OF DESC])
INTO Product_Table
FROM [Product Table] AS x
INNER JOIN
    (SELECT p.distributor_item_number,
            Max(p.[LENGTH OF DESC]) AS [MAX LENGTH]
     FROM [Product Table] AS p
     GROUP BY p.distributor_item_number) AS y
    ON (y.distributor_item_number = x.distributor_item_number)
   AND (y.[MAX LENGTH] = x.[LENGTH OF DESC])
GROUP BY x.distributor_item_number, x.mfg_item_number, x.mfg_name, x.distributor_product_description;

However, it doesn't seem to be working. I still get duplicates in the data.

Any help would be wonderful.

Answer 1

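I ended up adding a sequence number in the SELECT statement that numbers the rows within each group, then kept only the first row of each sequence. Note that this assumes InoSeq is a unique per-row number in Product_Table (an AutoNumber column, for example). Grouping on p1.InoSeq as well keeps rows with identical descriptions from being merged into one group, which would push their count above 1 and drop them. Code below:

SELECT
    p1.mfg_item_number,
    p1.mfg_name,
    p1.distributor_product_description,
    Count(*) AS Seq
INTO Clean_Product_Table
FROM Product_Table AS p1
INNER JOIN Product_Table AS p2
    ON (p2.mfg_item_number = p1.mfg_item_number)
   AND (p2.mfg_name = p1.mfg_name)
   AND (p2.InoSeq <= p1.InoSeq)
GROUP BY p1.mfg_item_number,
         p1.mfg_name,
         p1.distributor_product_description,
         p1.InoSeq
HAVING Count(*) = 1
ORDER BY p1.mfg_item_number, p1.mfg_name;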
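
As an alternative sketch (not part of the original answer): since the question says any one description per group is acceptable, Access's First() aggregate can pick one without a sequence column. The column names follow the answer's query; the target table name Clean_Product_Table_Alt and the alias product_description are hypothetical, chosen only for illustration. Access SQL has no comment syntax, so the notes stay here rather than inside the query.

```sql
SELECT
    p.mfg_item_number,
    p.mfg_name,
    First(p.distributor_product_description) AS product_description
INTO Clean_Product_Table_Alt
FROM Product_Table AS p
GROUP BY p.mfg_item_number, p.mfg_name;
```

First() returns the value from whichever row Access happens to read first in each group, so the choice is effectively arbitrary. If a repeatable result matters, swapping First() for Min() should work the same way, since Min() also accepts text and returns the alphabetically lowest description.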

solution1
0 ACCEPTED 2019-03-18 15:03:54