简体   繁体   中英

Aggregating SQL Data and concatenating strings / combining records

I am trying to aggregate data from one table into another. I've inherited this project; I did not design this database nor will I be able to change its format.

The [RawData] table will have 1 record per account, per ChannelCodeID. This table (where I currently have data) has the following fields:

[Account] int
[ChannelCodeID] int
[ChannelCode] varchar(10)

The [AggregatedData] table will have 1 record per account. This table (into which I need to insert data) has the following fields:

[Account] int
[Count] int
[Channel1] int
[Channel2] int
[Channel3] int
[Names] varchar(250)

For example, I might have the following records in my [RawData] table:

Account          ChannelCodeID     ChannelCode
12345            2                 ABC
12345            4                 DEF
12345            6                 GHI
54321            2                 ABC
54321            6                 GHI
99999            2                 ABC

And, after aggregating them, I would need to produce the following records in my [AggregatedData] table:

Account     Count    Chanel1   Channel2   Channel3    Names
12345       3        2         4          6           ABC.DEF.GHI
54321       2        2         6          0           ABC.GHI    
99999       1        2         0          0           ABC

As you can see, the count is how many records exist in my [RawData] table, Channel1 is the first ChannelCodeID, Channel2 is the second, and Channel3 is the third. If there are not enough ChannelCodeIDs from my [RawData] table, extra Channel columns get a '0' value. Furthermore, I need to concatenate the 'ChannelCode' column and store it in the 'Names' column of the [AggregatedData] table, but (obviously) if there is only one record, I don't want to add the '.'

I can't figure out how to do this without using a cursor and a bunch of variables - but I'm guessing there HAS to be a better way. This doesn't have to be super-fast since it will only run once a month, but it will have to process at least 10-15,000 records each time.

Thanks in advance...

EDIT:

ChannelCodes and ChannelCodeIDs map directly to each other and are always the same. For example, ChannelCodeID 2 is ALWAYS 'ABC'

Also, in the [AggregatedData] table, Channel1 is ALWAYS the lowest value, although this is incidental.

Test Data

DECLARE @TABLE TABLE (Account INT, ChannelCodeID INT, ChannelCode VARCHAR(10))
INSERT INTO @TABLE VALUES 
(12345 ,2 ,'ABC'),
(12345 ,4 ,'DEF'),
(12345 ,6 ,'GHI'),
(54321 ,2 ,'ABC'),
(54321 ,6 ,'GHI'),
(99999 ,2 ,'ABC')

Query

SELECT Account
      ,[Count]
      ,ISNULL([Channel1], 0) AS [Channel1]
      ,ISNULL([Channel2], 0) AS [Channel2]
      ,ISNULL([Channel3], 0) AS [Channel3]
      ,Names
FROM 
  (
    SELECT t.Account, T.ChannelCodeID, C.[Count]
          ,'Channel' + CAST(ROW_NUMBER() OVER 
                      (PARTITION BY t.Account ORDER BY t.ChannelCodeID ASC) AS VARCHAR(10))Channels
          ,STUFF((SELECT '.' + ChannelCode
                  FROM @TABLE 
                  WHERE Account = t.Account
                  FOR XML PATH(''),TYPE).value('.','NVARCHAR(MAX)'),1,1,'') AS Names
    FROM @TABLE t INNER JOIN (SELECT Account , COUNT(*) AS [Count]
                              FROM @TABLE 
                              GROUP BY Account) c
    ON T.Account = C.Account
  )A
PIVOT (MAX(ChannelCodeID)
       FOR Channels
       IN ([Channel1],[Channel2],[Channel3])
      )  p

Result

╔═════════╦═══════╦══════════╦══════════╦══════════╦═════════════╗
║ Account ║ Count ║ Channel1 ║ Channel2 ║ Channel3 ║    Names    ║
╠═════════╬═══════╬══════════╬══════════╬══════════╬═════════════╣
║   12345 ║     3 ║        2 ║        4 ║        6 ║ ABC.DEF.GHI ║
║   54321 ║     2 ║        2 ║        6 ║        0 ║ ABC.GHI     ║
║   99999 ║     1 ║        2 ║        0 ║        0 ║ ABC         ║
╚═════════╩═══════╩══════════╩══════════╩══════════╩═════════════╝

-- Back up raw data into temp table

select * into #rawData FROM RawData

-- First, populate the lowest channel and base records

INSERT INTO AggregatedData (Account,Count,Channel1,Channel2,Channel3)
   SELECT AccountID,1,Min(ChannelCODEID),0,0
   FROM #RawData
   GROUP BY AccountID

-- Gives you something like this

 Account     Count    Chanel1   Channel2   Channel3    Names
    12345       1        2         0          0           NULL
    54321       1        2         6          0           NULL
    99999       1        2         0          0           NULL

--

DELETE FROM #rawData 
WHERE account + str(channelCodeID) in 
      (SELECT account + str(channelCodeID) FROM AggregatedData)

-- Now do an update

UPDATE AggregatedData SET channel2= xx.NextLowest,count= count+1
FROM
    (   SELECT AccountID,Min(ChannelCODEID) as NextLowest
       FROM #RawData
       GROUP BY AccountID ) xx
WHERE AggregatedData.account=xx.accountID

-- Repeat above for Channel3

You then need an update statement against the final aggregated table based on the channel id's. If not run often, I would suggest a UDF which takes 3 parameters and returns a string, some like

UPDATE AggregatedData SET [names] = dbo.BuildNameList(channel1,channel2,channel3)

Will run a bit slow, but still not bad overall

Hope this gives you some ideas

Query:

WITH CTE AS (SELECT Account, ChannelCodeID, ChannelCode, RANK() OVER (PARTITION BY Account ORDER BY ChannelCodeID) [ChRank] FROM RawData)
SELECT A.Account, COUNT(Account) [Count], ISNULL((SELECT TOP 1 ChannelCodeID FROM CTE WHERE A.Account=CTE.Account AND ChRank=1),0) [Channel1],
       ISNULL((SELECT TOP 1 ChannelCodeID FROM CTE WHERE A.Account=CTE.Account AND ChRank=2),0) [Channel2],
       ISNULL((SELECT TOP 1 ChannelCodeID FROM CTE WHERE A.Account=CTE.Account AND ChRank=3),0) [Channel3],
       STUFF((SELECT '.'+ChannelCode FROM CTE WHERE A.Account=CTE.Account FOR XML PATH('')),1,1,'') [Names]
FROM RawData A
GROUP BY A.Account

This uses a Common Table Expression to group and then display the data.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM