Filter duplicates combining multiple columns
| name | year | latitude | longitude |
|--------------|------|----------|-----------|
| Cleveland | 1800 | 10 | 11 |
| Cleveland | 1810 | 10 | 11 |
| Medina | 1811 | 12 | 13 |
| Dayton | 1812 | 14 | 15 |
| Sandusky | 1105 | 50 | 50 |
| Mount Vernon | 1813 | 50 | 50 |
I want to select each unique combination of latitude and longitude, so I want to filter out any duplicate pairs. I also need to filter out any records whose year is less than 1500.
This is the subset I'm trying to achieve:
| name | year | latitude | longitude |
|--------------|------|----------|-----------|
| Cleveland | 1800 | 10 | 11 |
| Medina | 1811 | 12 | 13 |
| Dayton | 1812 | 14 | 15 |
| Mount Vernon | 1813 | 50 | 50 |
Each record's year is greater than 1500 and there aren't any duplicate lat,long pairs.
I've tried to find a way to use DISTINCT. Nothing I've found has worked.
I have also tried using GROUP BY:
SELECT *
FROM users
GROUP BY latitude, longitude
HAVING year > 1500;
The issue with the above query is that it eliminates both of the following records, which contain the lat,long pair of 50,50:
| name | year | latitude | longitude |
|--------------|------|----------|-----------|
| Sandusky | 1105 | 50 | 50 |
| Mount Vernon | 1813 | 50 | 50 |
The group is eliminated because Sandusky's year is less than 1500. I don't want Sandusky's record, but I do want Mount Vernon.
I noticed that if the two records were switched like so:
| name | year | latitude | longitude |
|--------------|------|----------|-----------|
| Mount Vernon | 1813 | 50 | 50 |
| Sandusky | 1105 | 50 | 50 |
...then the group's year is set as 1813 and the group is not eliminated. I thought maybe sorting by year would fix it, but it didn't:
SELECT *
FROM users
GROUP BY latitude, longitude
HAVING year > 1500
ORDER BY year DESC;
Is what I'm attempting possible?
How about this?
SELECT `id`, `name`, MAX(users.year) as `year`, latitude, longitude
FROM users
WHERE year > 1500
GROUP BY latitude, longitude;
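Results in:
| id | name | year | latitude | longitude |
|----|--------------|------|----------|-----------|
| 7 | Columbus | 1978 | 7 | 8 |
| 1 | Cleveland | 1800 | 10 | 11 |
| 3 | Medina | 1811 | 12 | 13 |
| 4 | Dayton | 1812 | 14 | 15 |
| 6 | Mount Vernon | 1813 | 50 | 50 |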
The only difference is the position of the filter: it is a WHERE clause instead of HAVING. Because WHERE comes before GROUP BY, the filtering happens BEFORE the grouping, and thus you get the desired result.
The MAX(users.year) ensures that you always get the largest year in each group. If this doesn't matter to you, you can replace SELECT `id`, `name`, MAX(users.year) as `year`, latitude, longitude with SELECT *
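One caveat on the query above: id and name are neither aggregated nor part of the GROUP BY. With MySQL's ONLY_FULL_GROUP_BY mode enabled the query is rejected, and with it disabled the server is free to pick those values from any row in the group. A minimal sketch of a deterministic variant follows. It assumes you want the earliest qualifying year per pair (which matches the desired subset in the question, where Cleveland keeps 1800), and the alias earliest is only illustrative:

```sql
-- Sketch: keep, for each (latitude, longitude) pair, the row with the
-- earliest year above 1500. Assumption: "earliest year wins".
SELECT u.name, u.year, u.latitude, u.longitude
FROM users AS u
JOIN (
    -- Filter first, then find the earliest remaining year per pair.
    SELECT latitude, longitude, MIN(year) AS year
    FROM users
    WHERE year > 1500
    GROUP BY latitude, longitude
) AS earliest
  ON  earliest.latitude  = u.latitude
  AND earliest.longitude = u.longitude
  AND earliest.year      = u.year
ORDER BY u.year;
```

If two rows share the same pair and the same year, both come back, so this only guarantees unique pairs when (latitude, longitude, year) is unique.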
Maybe I didn't understand the problem, but it could be this simple:
select * from users u where u.year > 1500;
I don't know what you want to do in case there is more than one record with the same coordinates and a year greater than 1500.
How about this, unless I misread the question. I did read it. It assumes that you do not want to eliminate a different name with the same lat,long.
create table users
( id int auto_increment primary key,
name varchar(50) not null,
year int not null,
latitude int not null,
longitude int not null
);
truncate table users;
insert users (name,year,latitude,longitude) values
('Cleveland',1810,10,11),
('Medina',1811,12,13),
('Dayton',1812,14,15),
('Mount Vernon',1813,50,50),
('Sandusky',1105,50,50);
SELECT distinct name,year,latitude,longitude
FROM users
where year > 1500
ORDER BY year;
+--------------+------+----------+-----------+
| name | year | latitude | longitude |
+--------------+------+----------+-----------+
| Cleveland | 1810 | 10 | 11 |
| Medina | 1811 | 12 | 13 |
| Dayton | 1812 | 14 | 15 |
| Mount Vernon | 1813 | 50 | 50 |
+--------------+------+----------+-----------+
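Note that DISTINCT over name, year, latitude, longitude only removes rows that are identical in all four columns, so with the question's original data both Cleveland rows (1800 and 1810) would still be returned. If exactly one row per lat,long pair is required, a window function is one option. This is a sketch assuming MySQL 8.0 or later; ordering by year and then name as the tie-breaker is my assumption, and the names rn and ranked are illustrative:

```sql
-- Sketch: number the rows inside each (latitude, longitude) pair and
-- keep only the first one. Requires window functions (MySQL 8.0+).
SELECT name, year, latitude, longitude
FROM (
    SELECT name, year, latitude, longitude,
           ROW_NUMBER() OVER (
               PARTITION BY latitude, longitude
               ORDER BY year, name   -- assumed tie-break order
           ) AS rn
    FROM users
    WHERE year > 1500                -- drop old records before ranking
) AS ranked
WHERE rn = 1
ORDER BY year;
```

Because ROW_NUMBER() assigns distinct numbers within each partition, this returns at most one row per pair even when several rows tie on year.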