Selecting a distinct combination of 2 columns in SQL

When I run a select after a number of joins on my tables, I get an output of 2 columns, and I want to select a distinct combination of Col1 and Col2 for the rowset returned.

The query that I run will be something like this:

select a.Col1,b.Col2 from a inner join b on b.Col4=a.Col3

Now the output will be somewhat like this:

Col1 Col2  
1   z  
2   z  
2   x  
2   y  
3   x  
3   x  
3   y  
4   a  
4   b  
5   b  
5   b  
6   c  
6   c  
6   d  

Now I want the output to be something like the following:

1  z  
2  y  
3  x  
4  a  
5  b  
6  d 

It's OK if I pick the second column randomly, as my query output is about a million rows, and I really don't think there will be a case where I get the same Col1 and Col2 output; even if that is the case, I can edit the value.

Can you please help me with this? I think basically the col3 needs to be a row number, I guess, and then I need to select the two columns based on a random row number. I don't know how to translate this to SQL.

Consider the case 1a 1b 1c 1d 1e 2a 2b 2c 2d 2e. Now GROUP BY will give me all these results, whereas I want 1a and 2d, or 1a and 2b; any such combination.

OK, let me explain what I'm expecting:

with rs as(
select a.Col1,b.Col2,rownumber() as rowNumber from a inner join b on b.Col4=a.Col3)
select rs.Col1,rs.Col2 from rs where rs.rowNumber=Round( Rand() *100)

Now I am not sure how to get the row number or the random selection working correctly!

Thanks in advance.

If you simply don't care which Col2 value is returned:

select a.Col1,MAX(b.Col2) AS Col2
from a inner join b on b.Col4=a.Col3 
GROUP BY a.Col1
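
To see what this query returns for the sample rows in the question, here is a minimal sketch that runs the same GROUP BY through Python's sqlite3 module. SQLite only stands in for SQL Server here, and the `joined` table is a hypothetical stand-in for the result of the a/b join, so treat it as an illustration rather than the exact setup.

```python
import sqlite3

# Sample (Col1, Col2) pairs copied from the join output shown in the question.
ROWS = [(1, "z"), (2, "z"), (2, "x"), (2, "y"), (3, "x"), (3, "x"), (3, "y"),
        (4, "a"), (4, "b"), (5, "b"), (5, "b"), (6, "c"), (6, "c"), (6, "d")]


def one_row_per_col1(rows):
    """Return one (Col1, Col2) row per Col1, keeping the largest Col2."""
    conn = sqlite3.connect(":memory:")
    # "joined" is a hypothetical table standing in for the a/b join result.
    conn.execute("CREATE TABLE joined (Col1 INTEGER, Col2 TEXT)")
    conn.executemany("INSERT INTO joined VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT Col1, MAX(Col2) AS Col2 FROM joined GROUP BY Col1 ORDER BY Col1"
    ).fetchall()
    conn.close()
    return result


print(one_row_per_col1(ROWS))
# [(1, 'z'), (2, 'z'), (3, 'y'), (4, 'b'), (5, 'b'), (6, 'd')]
```

Each Col1 appears once, and the Col2 picked is deterministic (the largest in the group), not random.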

If you do want a random value, you could use the approach below.

 ;WITH T
     AS (SELECT a.Col1,
                b.Col2,
                ROW_NUMBER() OVER (PARTITION BY a.Col1 ORDER BY (SELECT NEWID())
                ) AS RN
         FROM   a
                INNER JOIN b
                  ON b.Col4 = a.Col3)
SELECT Col1,
       Col2
FROM   T
WHERE  RN = 1  
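
The same idea can be tried out locally with a small sketch using Python's sqlite3 module. It assumes an SQLite build with window function support (3.25 or later), uses SQLite's `random()` in place of SQL Server's `NEWID()`, and uses a hypothetical `joined` table in place of the a/b join.

```python
import sqlite3

# Sample (Col1, Col2) pairs copied from the join output shown in the question.
ROWS = [(1, "z"), (2, "z"), (2, "x"), (2, "y"), (3, "x"), (3, "x"), (3, "y"),
        (4, "a"), (4, "b"), (5, "b"), (5, "b"), (6, "c"), (6, "c"), (6, "d")]


def random_row_per_col1(rows):
    """Return one randomly chosen (Col1, Col2) row for each Col1."""
    conn = sqlite3.connect(":memory:")
    # "joined" is a hypothetical table standing in for the a/b join result.
    conn.execute("CREATE TABLE joined (Col1 INTEGER, Col2 TEXT)")
    conn.executemany("INSERT INTO joined VALUES (?, ?)", rows)
    result = conn.execute(
        """
        WITH T AS (
            SELECT Col1,
                   Col2,
                   -- number the rows of each Col1 group in random order
                   ROW_NUMBER() OVER (PARTITION BY Col1 ORDER BY random()) AS RN
            FROM joined
        )
        SELECT Col1, Col2 FROM T WHERE RN = 1 ORDER BY Col1
        """
    ).fetchall()
    conn.close()
    return result


picked = random_row_per_col1(ROWS)
print(picked)  # six rows, one per Col1; the Col2 values vary between runs
```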

Or alternatively, use a CLR aggregate function. This approach has the advantage that it eliminates the requirement to sort by partition, newid(). An example implementation is below.

using System;
using System.Data.SqlTypes;
using System.IO;
using System.Security.Cryptography;
using Microsoft.SqlServer.Server;

[Serializable]
[SqlUserDefinedAggregate(Format.UserDefined, MaxByteSize = 8000)]
public struct Random : IBinarySerialize
{
    private MaxSoFar _maxSoFar;

    public void Init()
    {
    }

    public void Accumulate(SqlString value)
    {
        int rnd = GetRandom();
        if (!_maxSoFar.Initialised || (rnd > _maxSoFar.Rand))
            _maxSoFar = new MaxSoFar(value, rnd);
    }

    public void Merge(Random group)
    {
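        // Keep whichever partial result carries the larger random number, as Accumulate does.
        if (group._maxSoFar.Initialised &&
            (!_maxSoFar.Initialised || group._maxSoFar.Rand > _maxSoFar.Rand))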
        {
            _maxSoFar = group._maxSoFar;
        }
    }

    private static int GetRandom()
    {
        var buffer = new byte[4];

        new RNGCryptoServiceProvider().GetBytes(buffer);
        return BitConverter.ToInt32(buffer, 0);
    }

    public SqlString Terminate()
    {
        return _maxSoFar.Value;
    }

    #region Nested type: MaxSoFar

    private struct MaxSoFar
    {
        private SqlString _value;

        public MaxSoFar(SqlString value, int rand) : this()
        {
            Value = value;
            Rand = rand;
            Initialised = true;
        }

        public SqlString Value
        {
            get { return _value; }
            set
            {
                _value = value;
                IsNull = value.IsNull;
            }
        }

        public int Rand { get; set; }

        public bool Initialised { get; set; }
        public bool IsNull { get; set; }
    }

    #endregion


    #region IBinarySerialize Members

    public void Read(BinaryReader r)
    {
        _maxSoFar.Rand = r.ReadInt32();
        _maxSoFar.Initialised = r.ReadBoolean();
        _maxSoFar.IsNull = r.ReadBoolean();

        if (_maxSoFar.Initialised && !_maxSoFar.IsNull)
            _maxSoFar.Value = r.ReadString();
    }

    public void Write(BinaryWriter w)
    {
        w.Write(_maxSoFar.Rand);
        w.Write(_maxSoFar.Initialised);
        w.Write(_maxSoFar.IsNull);

        if (_maxSoFar.Initialised && !_maxSoFar.IsNull)
            w.Write(_maxSoFar.Value.Value);
    }

    #endregion
}
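
The logic of the aggregate (tag every value with a random number and keep the value with the largest tag) can be sketched outside SQL Server as well. Below is a rough analogue using `create_aggregate` from Python's sqlite3 module. The names `RandomPick`, `random_pick` and `joined` are made up for this illustration, and SQLite has no counterpart to the `Merge` step, so this only mirrors `Accumulate` and `Terminate`.

```python
import random
import sqlite3

# Sample (Col1, Col2) pairs copied from the join output shown in the question.
ROWS = [(1, "z"), (2, "z"), (2, "x"), (2, "y"), (3, "x"), (3, "x"), (3, "y"),
        (4, "a"), (4, "b"), (5, "b"), (5, "b"), (6, "c"), (6, "c"), (6, "d")]


class RandomPick:
    """Aggregate that keeps the value tagged with the largest random number."""

    def __init__(self):
        self.best_tag = None
        self.best_value = None

    def step(self, value):
        # Counterpart of Accumulate: draw a tag and keep the value if it wins.
        tag = random.random()
        if self.best_tag is None or tag > self.best_tag:
            self.best_tag = tag
            self.best_value = value

    def finalize(self):
        # Counterpart of Terminate: return the surviving value.
        return self.best_value


def random_pick_per_col1(rows):
    """Return one (Col1, Col2) row per Col1 using the custom aggregate."""
    conn = sqlite3.connect(":memory:")
    conn.create_aggregate("random_pick", 1, RandomPick)
    # "joined" is a hypothetical table standing in for the a/b join result.
    conn.execute("CREATE TABLE joined (Col1 INTEGER, Col2 TEXT)")
    conn.executemany("INSERT INTO joined VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT Col1, random_pick(Col2) FROM joined GROUP BY Col1 ORDER BY Col1"
    ).fetchall()
    conn.close()
    return result


print(random_pick_per_col1(ROWS))  # one row per Col1; Col2 varies between runs
```

Because each group is reduced in a single pass, no per-partition sort on a random key is needed, which is the advantage mentioned above.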

You need to group by a.Col1 to get distinct rows by a.Col1 only. Then, since b.Col2 is not included in the group, you need a suitable aggregate function to reduce all values in the group to just one; MIN is good enough if you just want one of the values.

select a.Col1, MIN(b.Col2) as c2
from a 
inner join b on b.Col4=a.Col3
group by a.Col1

You must use a GROUP BY clause, and aggregate the column that is not part of the group:

select a.Col1, MIN(b.Col2) as Col2
from a 
inner join b on b.Col4=a.Col3
group by a.Col1

If I understand you correctly, you want one row for each combination of column 1 and column 2. That can easily be done with GROUP BY or DISTINCT, for instance:

SELECT col1, col2
FROM Your Join
GROUP BY col1, col2
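
For the sample rows in the question, grouping by both columns removes exact duplicate rows but can still leave several rows per Col1. The sketch below checks this with Python's sqlite3 module; SQLite stands in for SQL Server, and the `joined` table is a hypothetical stand-in for the join result.

```python
import sqlite3

# Sample (Col1, Col2) pairs copied from the join output shown in the question.
ROWS = [(1, "z"), (2, "z"), (2, "x"), (2, "y"), (3, "x"), (3, "x"), (3, "y"),
        (4, "a"), (4, "b"), (5, "b"), (5, "b"), (6, "c"), (6, "c"), (6, "d")]


def distinct_pairs(rows):
    """Return the unique (Col1, Col2) pairs via GROUP BY and via DISTINCT."""
    conn = sqlite3.connect(":memory:")
    # "joined" is a hypothetical table standing in for the a/b join result.
    conn.execute("CREATE TABLE joined (Col1 INTEGER, Col2 TEXT)")
    conn.executemany("INSERT INTO joined VALUES (?, ?)", rows)
    grouped = conn.execute(
        "SELECT Col1, Col2 FROM joined GROUP BY Col1, Col2 ORDER BY Col1, Col2"
    ).fetchall()
    distinct = conn.execute(
        "SELECT DISTINCT Col1, Col2 FROM joined ORDER BY Col1, Col2"
    ).fetchall()
    conn.close()
    return grouped, distinct


grouped, distinct = distinct_pairs(ROWS)
print(len(grouped), grouped == distinct)
# 11 True
```

Both forms return the same 11 pairs here, so Col1 values such as 2 and 3 still appear more than once.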

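Notice: the technical posts on this site follow the CC BY-SA 4.0 license. If you reproduce them, please cite this site's URL or the original address. For any questions, contact yoyou2525@163.com.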

 