
T-SQL - using STUFF to concatenate grouped columns and removing duplicates

I have a table that looks like this:

EmailAddress: nvarchar(255)
MarketingEmailOptIn: nvarchar(50)
NewsletterOptIn: nvarchar(50)
ThoughtLeaderOptIn: nvarchar(50)

[image: sample rows from the source table]

My SQL statement shown below takes the data above and concatenates the "Subscription Type" using a comma as the delimiter:

SELECT  
    EmailAddress,
    STUFF((SELECT ',' + 
              CASE
                 WHEN B.MarketingEmailOptIn = 'TRUE' THEN 'MarketingEmail' 
                 WHEN B.ThoughtLeaderOptIn = 'TRUE' THEN 'ThoughtLeader'
                 WHEN B.NewsletterOptIn = 'TRUE' THEN 'Newsletter'
              END
          FROM UK_AGT_AgentForms_TEST_DE B 
          WHERE ISNULL(B.EmailAddress, '') = ISNULL(A.EmailAddress, '')
          FOR XML PATH('')), 1, 1, '') AS Subscriptions  -- delimiter is one character, so strip one
FROM
    UK_AGT_AgentForms_TEST_DE A
GROUP BY 
    EmailAddress 

Running this SQL produces the following output:

[image: current query output]

However, notice that MarketingEmail is listed twice because the source table ALSO has it listed twice (1st and 2nd rows). I need to omit any duplicate detected, so that my resulting table would look like:

[image: desired output]

I'm pretty new to the STUFF keyword. I'm just kind of lost on how to detect duplicates at run time - any advice is appreciated. Thanks

Try something like this:

DECLARE @Data table (
    EmailAddress nvarchar(255),
    MarketingEmailOptIn nvarchar(50),
    NewsletterOptIn nvarchar(50),
    ThoughtLeaderOptIn nvarchar(50)
);

INSERT INTO @Data VALUES
    ( 'mike@mikemarks.com', 'TRUE', NULL, NULL ),
    ( 'mike@mikemarks.com', 'TRUE', 'TRUE', NULL ),
    ( 'mike@mikemarks.com', 'TRUE', NULL, 'TRUE' );

SELECT
    EmailAddress
    , STUFF ( ( CASE WHEN EOptIn = 'TRUE' THEN ',MarketingEmail' ELSE '' END
        + CASE WHEN NOptIn = 'TRUE' THEN ',Newsletter' ELSE '' END
        + CASE WHEN TOptIn = 'TRUE' THEN ',ThoughtLeader' ELSE '' END 
    ), 1, 1, '' ) AS Subscriptions
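FROM (
    -- Collapse each address to a single row. MAX ignores NULLs, so a column
    -- comes back 'TRUE' if any row for that address has it (assuming the only
    -- other values are NULL or 'FALSE').
    SELECT
        EmailAddress
        , MAX ( MarketingEmailOptIn ) AS EOptIn
        , MAX ( NewsletterOptIn ) AS NOptIn
        , MAX ( ThoughtLeaderOptIn ) AS TOptIn
    FROM @Data A --UK_AGT_AgentForms_TEST_DE
    GROUP BY EmailAddress
) AS x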
ORDER BY 
    EmailAddress;

Returns

+--------------------+-----------------------------------------+
|    EmailAddress    |              Subscriptions              |
+--------------------+-----------------------------------------+
| mike@mikemarks.com | MarketingEmail,Newsletter,ThoughtLeader |
+--------------------+-----------------------------------------+
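
If you would rather keep the STUFF / FOR XML PATH pattern from the question, one possible approach is to unpivot the three flag columns into rows and put DISTINCT inside the subquery. This is only a sketch against the same sample data; the aliases v and Subscription are made up for illustration, and it compares EmailAddress with a plain = rather than the ISNULL wrapper used in the question.

```sql
DECLARE @Data table (
    EmailAddress nvarchar(255),
    MarketingEmailOptIn nvarchar(50),
    NewsletterOptIn nvarchar(50),
    ThoughtLeaderOptIn nvarchar(50)
);

INSERT INTO @Data VALUES
    ( 'mike@mikemarks.com', 'TRUE', NULL, NULL ),
    ( 'mike@mikemarks.com', 'TRUE', 'TRUE', NULL ),
    ( 'mike@mikemarks.com', 'TRUE', NULL, 'TRUE' );

SELECT
    A.EmailAddress,
    STUFF((
        -- DISTINCT collapses repeated subscription names for this address
        SELECT DISTINCT ',' + v.Subscription
        FROM @Data B
        -- Turn the three flag columns into up to three rows per source row
        CROSS APPLY (VALUES
            (CASE WHEN B.MarketingEmailOptIn = 'TRUE' THEN 'MarketingEmail' END),
            (CASE WHEN B.NewsletterOptIn = 'TRUE' THEN 'Newsletter' END),
            (CASE WHEN B.ThoughtLeaderOptIn = 'TRUE' THEN 'ThoughtLeader' END)
        ) AS v (Subscription)
        WHERE B.EmailAddress = A.EmailAddress
          AND v.Subscription IS NOT NULL
        FOR XML PATH('')
    ), 1, 1, '') AS Subscriptions   -- strip the single leading comma
FROM @Data A
GROUP BY A.EmailAddress;
```

Unlike a single CASE with several WHEN branches, this picks up every flag that is set on a row, not just the first one that matches. The order of the names in the result is not guaranteed.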

If you have SQL Server 2017 or later, you can use STRING_AGG() to simplify this:

SELECT   
    EmailAddress,
        STRING_AGG(CASE
                 WHEN MarketingEmailOptIn = 'TRUE' THEN 'MarketingEmail' 
                 WHEN ThoughtLeaderOptIn = 'TRUE' THEN 'ThoughtLeader'
                 WHEN NewsletterOptIn = 'TRUE' THEN 'Newsletter'
              END, ', ') AS Subscriptions
FROM
    UK_AGT_AgentForms_TEST_DE
GROUP BY 
    EmailAddress
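
Note that STRING_AGG() has no DISTINCT option, so repeated values have to be removed before aggregating. A minimal sketch of one way to do that, assuming the table from the question and using the made-up aliases v, d and Subscription:

```sql
SELECT
    d.EmailAddress,
    -- WITHIN GROUP makes the order of the names deterministic
    STRING_AGG(d.Subscription, ', ')
        WITHIN GROUP (ORDER BY d.Subscription) AS Subscriptions
FROM (
    -- One row per (address, subscription) pair, duplicates removed
    SELECT DISTINCT A.EmailAddress, v.Subscription
    FROM UK_AGT_AgentForms_TEST_DE A
    CROSS APPLY (VALUES
        (CASE WHEN A.MarketingEmailOptIn = 'TRUE' THEN 'MarketingEmail' END),
        (CASE WHEN A.NewsletterOptIn = 'TRUE' THEN 'Newsletter' END),
        (CASE WHEN A.ThoughtLeaderOptIn = 'TRUE' THEN 'ThoughtLeader' END)
    ) AS v (Subscription)
    WHERE v.Subscription IS NOT NULL
) AS d
GROUP BY d.EmailAddress;
```

Addresses with no opt-in at all drop out of this result, because the inner query filters out their NULL rows.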

If you still see duplicates, you can use conditional aggregation in a nested query to roll it up first:

SELECT  
    EmailAddress,
    STUFF(
          CASE WHEN MarketingEmailOptIn > 0 THEN ',MarketingEmail' ELSE '' END
        + CASE WHEN ThoughtLeaderOptIn > 0 THEN ',ThoughtLeader' ELSE '' END
        + CASE WHEN NewsletterOptIn > 0 THEN ',Newsletter' ELSE '' END
        , 1, 1, '') AS Subscriptions  -- leading commas, so STUFF can strip the first one
FROM (
    SELECT EmailAddress
        , SUM(CASE WHEN MarketingEmailOptIn = 'TRUE' THEN 1 ELSE 0 END) MarketingEmailOptIn
        , SUM(CASE WHEN ThoughtLeaderOptIn = 'TRUE' THEN 1 ELSE 0 END) ThoughtLeaderOptIn
        , SUM(CASE WHEN NewsletterOptIn = 'TRUE' THEN 1 ELSE 0 END) NewsletterOptIn
    FROM UK_AGT_AgentForms_TEST_DE
    GROUP BY EmailAddress
) T

Phew. I had to play around with this one. Maybe not the perfect solution, but I think I was able to achieve what you are trying to do. It doesn't use the STUFF function though. It just concatenates each string and then removes the last comma.

SELECT EmailAddress, CASE WHEN LEN(Subscriptions) > 0 THEN LEFT(Subscriptions, LEN(Subscriptions) - 1) ELSE '' END AS Subscriptions
FROM (
    SELECT EmailAddress, CONCAT(
            CASE WHEN SUM(CASE WHEN MarketingEmailOptIn = 'TRUE' THEN 1 ELSE 0 END) > 0 THEN 'MarketingEmail, ' ELSE '' END,
            CASE WHEN SUM(CASE WHEN NewsletterOptIn = 'TRUE' THEN 1 ELSE 0 END) > 0 THEN 'Newsletter, ' ELSE '' END,
            CASE WHEN SUM(CASE WHEN ThoughtLeaderOptIn = 'TRUE' THEN 1 ELSE 0 END) > 0 THEN 'ThoughtLeader, ' ELSE '' END
        ) AS Subscriptions
    FROM UK_AGT_AgentForms_TEST_DE 
    GROUP BY EmailAddress
) AS a
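
One detail worth knowing about the query above: each piece ends in a comma plus a space, yet subtracting only 1 from LEN() is enough. That is because LEN() does not count trailing spaces. A small sketch to illustrate (the variable @s is just for demonstration):

```sql
DECLARE @s varchar(50) = 'MarketingEmail, Newsletter, ';

SELECT
    LEN(@s)               AS LenWithoutTrailingSpace,  -- 27, the trailing space is not counted
    DATALENGTH(@s)        AS BytesStored,              -- 28, the space is still stored
    LEFT(@s, LEN(@s) - 1) AS Trimmed;                  -- 'MarketingEmail, Newsletter'
```

If the separator were changed to something that does not end in a space, the `- 1` would need to match the separator's length.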
