
DISTINCT values from multiple columns in SQL

I have the following assignment:

Write a SELECT statement that returns one row for each customer that has the same last name and billing address as another customer. Sort the result set by last name then first name.

I have tried doing it with the DISTINCT keyword, but that does not serve my purpose.

For some reason, every time I use GROUP BY I get the following error:

SELECT FirstName, LastName, BillingAddressID
    FROM Customers
    GROUP BY LastName;

Column 'Customers.FirstName' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

I tried UNION with the following error:

SELECT LastName FROM Customers

UNION 

SELECT BillingAddressID FROM Customers

Conversion failed when converting the varchar value 'Sherwood' to data type int.

Can someone just point me in the right direction?

This is a sample of the data set that I am working on:

firstname   lastname    billingaddressid  
Allan       Sherwood        2  
Barry       Zimmer          3  
Christine   Brown           4  
David       Goldstein       6  
Erin        Sherwood        7  
Frank Lee   Wilson          8  
Gary        Brown           10  
Heather     Esway           12  

So the query should remove the duplicate entries of the last name.

Try -

SELECT FirstName,
       Customers.LastName,
       Customers.billingAddressID
FROM Customers INNER JOIN
     ( SELECT LastName,
              billingAddressID
       FROM Customers
       GROUP BY LastName,
                billingAddressID
       HAVING COUNT( LastName ) >= 2 ) lastNameQuery
  ON Customers.LastName = lastNameQuery.LastName AND
     Customers.billingAddressID = lastNameQuery.billingAddressID
ORDER BY Customers.LastName,
         FirstName;

I tested this against a database created using the following script...

CREATE DATABASE Cust;

USE Cust;

CREATE TABLE Customers
(
    fldID               INT              NOT NULL    AUTO_INCREMENT,
    firstName           VARCHAR( 50 )    NOT NULL,
    lastName            VARCHAR( 50 ),
    billingAddressID    INT              NOT NULL,
    PRIMARY KEY ( fldID )
);

I entered the Questioner's sample data using -

INSERT INTO Customers
SET firstName        = "Allan",
    lastName         = "Sherwood",
    billingAddressID = 2;

INSERT INTO Customers
SET firstName        = "Barry",
    lastName         = "Zimmer",
    billingAddressID = 3;

INSERT INTO Customers
SET firstName        = "Christine",
    lastName         = "Brown",
    billingAddressID = 4;

INSERT INTO Customers
SET firstName        = "David",
    lastName         = "Goldstein",
    billingAddressID = 6;

INSERT INTO Customers
SET firstName        = "Erin",
    lastName         = "Sherwood",
    billingAddressID = 7;

INSERT INTO Customers
SET firstName        = "Frank Lee",
    lastName         = "Wilson",
    billingAddressID = 8;

INSERT INTO Customers
SET firstName        = "Gary",
    lastName         = "Brown",
    billingAddressID = 10;

INSERT INTO Customers
SET firstName        = "Heather",
    lastName         = "Esway",
    billingAddressID = 12;

I also added the following to ensure a repeat of BOTH lastName AND billingAddressID...

INSERT INTO Customers
SET firstName        = "Don",
    lastName         = "Sherwood",
    billingAddressID = 22;

INSERT INTO Customers
SET firstName        = "Timmy",
    lastName         = "Sherwood",
    billingAddressID = 22;

INSERT INTO Customers
SET firstName        = "James",
    lastName         = "Brown",
    billingAddressID = 22;

INSERT INTO Customers
SET firstName        = "James",
    lastName         = "Esway",
    billingAddressID = 22;

The question put to our Questioner, and I assume the one the Questioner is seeking help with, was -

Write a SELECT statement that returns one row for each customer that has the same last name and billing address as another customer. Sort the result set by last name then first name.

My interpretation of this was that we should return a record for EACH Customer whose combination of BOTH Last Name AND BillingAddressID is shared with AT LEAST one other Customer, and that the returned records should be sorted by Last Name and then by First Name.

The core of my answer is the segment -

SELECT LastName,
       billingAddressID
FROM Customers

This selects just the two conditional fields from Customers.

To this I added -

GROUP BY LastName,
         billingAddressID

This reduces the core segment's results to one row for each unique combination of the two conditional fields.

I then restricted this list to those combinations that occur at least twice by adding -

HAVING COUNT( LastName ) >= 2
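
To see which combinations survive the HAVING filter, the grouped subquery can be run on its own. The following is a minimal sketch against the test table above; the pairCount alias is an illustrative name that does not appear in the original statement.

```sql
-- Sketch: list each (LastName, billingAddressID) pair that occurs at least twice.
-- pairCount is a hypothetical alias added only to make the count visible.
SELECT LastName,
       billingAddressID,
       COUNT( LastName ) AS pairCount
FROM Customers
GROUP BY LastName,
         billingAddressID
HAVING COUNT( LastName ) >= 2;
```

Because COUNT( LastName ) counts only non-NULL values, a group of customers whose lastName is NULL yields a count of 0 and is never returned.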

I then gave the resulting query the alias lastNameQuery.

I then joined lastNameQuery with Customers on both conditional fields, to restrict our list of Customers to those who share a pair of conditional values, using -

     Customers INNER JOIN
     ( SELECT LastName,
              billingAddressID
       FROM Customers
       GROUP BY LastName,
                billingAddressID
       HAVING COUNT( LastName ) >= 2 ) lastNameQuery
  ON Customers.LastName = lastNameQuery.LastName AND
     Customers.billingAddressID = lastNameQuery.billingAddressID

From this I selected the desired fields using -

SELECT FirstName,
       Customers.LastName,
       Customers.billingAddressID
FROM

Prefixing the field names with Customers. was necessary to disambiguate the field names that occur in both Customers and lastNameQuery. Without this qualification MySQL is unable to determine which fields it should use.

This list was sorted into the specified order by adding -

ORDER BY Customers.LastName,
         FirstName;

The results I got from testing my complete statement were -

+-----------+----------+------------------+
| FirstName | LastName | billingAddressID |
+-----------+----------+------------------+
| Gary      | Brown    |               10 |
| Tom       | Brown    |               10 |
| Don       | Sherwood |               22 |
| Timmy     | Sherwood |               22 |
+-----------+----------+------------------+

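These are the only Customers in my expanded sample dataset that share a combination of values in the conditional fields. (The Tom Brown row presumably comes from an additional test record with billingAddressID 10 that is not among the INSERT statements listed above.)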
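
As an alternative sketch, not the statement tested above, the same requirement can be phrased with a correlated EXISTS subquery: a customer qualifies if some other row has the same last name and billing address. This assumes the fldID key from the test table above to tell two rows apart; on the Questioner's table the key column will probably have a different name. The aliases c and other are illustrative.

```sql
-- Sketch: keep each customer for whom a different row (different fldID)
-- has the same LastName and the same billingAddressID.
SELECT c.FirstName,
       c.LastName,
       c.billingAddressID
FROM Customers AS c
WHERE EXISTS ( SELECT 1
               FROM Customers AS other
               WHERE other.LastName         = c.LastName
                 AND other.billingAddressID = c.billingAddressID
                 AND other.fldID           <> c.fldID )
ORDER BY c.LastName,
         c.FirstName;
```

Each qualifying customer appears once, however many other customers share the pair, because EXISTS only tests whether a matching row is present.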

Try this (SQL Fiddle). It keeps only the second row, ordered by first name, within each last name:

   SELECT * FROM 
    (
    SELECT *,ROW_NUMBER() OVER(PARTITION BY LASTNAME ORDER BY FIRSTNAME) AS RN
    FROM YOURTABLE
    )T
    WHERE RN=2

If you instead want one record per distinct last name:

SELECT * FROM 
    (
    SELECT *,ROW_NUMBER() OVER(PARTITION BY LASTNAME ORDER BY FIRSTNAME) AS RN
    FROM YOURTABLE
    )T
    WHERE RN=1
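
For the assignment as quoted in the question (customers sharing both last name and billing address), a windowed count may be closer to what is needed than ROW_NUMBER. This is a sketch only, assuming the Questioner's Customers table and column names, a DBMS with window function support, and the illustrative names CNT and T.

```sql
-- Sketch: count the rows in each (LastName, BillingAddressID) partition
-- and keep only the customers whose partition holds two or more rows.
SELECT FirstName, LastName, BillingAddressID
FROM
    (
    SELECT FirstName, LastName, BillingAddressID,
           COUNT(*) OVER(PARTITION BY LastName, BillingAddressID) AS CNT
    FROM Customers
    ) T
WHERE CNT >= 2
ORDER BY LastName, FirstName;
```

Unlike the ROW_NUMBER queries, this returns every customer in a shared group rather than a single numbered row per last name.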


 