How to do an update based on a count - SQL (postgres)
I have a table, let's call it 'entries', that looks like this (simplified):
id [pk]
user_id [fk]
created [date]
processed [boolean, default false]
and I want to create an UPDATE query which will set the processed flag to true on all entries except for the latest 3 for each user (latest in terms of the created column). So, for the following entries:
1,456,2009-06-01,false
2,456,2009-05-01,false
3,456,2009-04-01,false
4,456,2009-03-01,false
Only entry 4 would have its processed flag changed to true.
Anyone know how I can do this?
I don't know postgres, but this is standard SQL and may work for you.
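For anyone who wants to try the answers, the table and the four sample rows could be set up roughly like this. This is only a sketch: the column types are guesses from the simplified description above, and the foreign key constraint is left out because the referenced table isn't shown.

```sql
-- Hypothetical DDL: types are assumed from the simplified column list.
create table entries (
    id        integer primary key,
    user_id   integer not null,                -- fk target not given, so no constraint here
    created   date    not null,
    processed boolean not null default false
);

-- The four sample rows from the question, all belonging to user 456.
insert into entries (id, user_id, created, processed) values
    (1, 456, '2009-06-01', false),
    (2, 456, '2009-05-01', false),
    (3, 456, '2009-04-01', false),
    (4, 456, '2009-03-01', false);
```

After a correct update, only the row with id 4 should have processed = true.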
update entries set
    processed = true
where (
    select count(*)
    from entries as E
    where E.user_id = entries.user_id
      and E.created > entries.created
) >= 3
In other words, update the processed column to true whenever there are three or more entries for the same user_id on later dates. I'm assuming the [created] column is unique for a given user_id. If not, you'll need an additional criterion to pin down what you mean by "latest".
In SQL Server you can do this, which is a little easier to follow and will probably be executed more efficiently:
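If created is not unique per user, one possible tie-breaker is the id column. The sketch below assumes that, among rows with the same created date, the lower id counts as the more recent one. That choice is arbitrary and only an assumption; flip the comparison on id if the opposite suits your data.

```sql
-- Sketch with an assumed tie-breaker: on equal created dates,
-- the row with the lower id is treated as the more recent one.
update entries set
    processed = true
where (
    select count(*)
    from entries as E
    where E.user_id = entries.user_id
      and (
            E.created > entries.created
         or (E.created = entries.created and E.id < entries.id)
      )
) >= 3;
```

With this condition every row has a well-defined position within its user, so exactly three rows per user stay unprocessed even when dates collide.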
with T(id, user_id, created, processed, rk) as (
    select
        id, user_id, created, processed,
        row_number() over (
            partition by user_id
            order by created desc, id
        )
    from entries
)
update T set
    processed = 1  -- SQL Server has no true literal; a bit column takes 1
where rk > 3;
Updating a CTE is a non-standard feature, and not all database systems support row_number.
First, let's start with a query that will list all rows to be updated:
select e.id
from entries as e
where (
    select count(*)
    from entries as e2
    where e2.user_id = e.user_id
      and e2.created > e.created
) > 2
This lists the ids of all records for which more than 2 other records exist with the same user_id and a later created value.
That is, it lists all records except the latest 3 per user.
Now we can turn it into an update:
update entries as e
set processed = true
where (
    select count(*)
    from entries as e2
    where e2.user_id = e.user_id
      and e2.created > e.created
) > 2;
One thing though: it can be slow. In this case you might be better off with a custom aggregate, or (if you're on 8.4) window functions.
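For the window-function route, something along these lines should work on PostgreSQL 8.4 or later. Treat it as a sketch rather than a tuned solution: the alias names ranked and rk are made up for the example, and like the queries above it assumes created is unique per user (with ties, row_number picks an arbitrary order among the tied rows).

```sql
-- Sketch: rank each user's entries from newest to oldest in a subquery,
-- then flag everything ranked below the top 3.
update entries
set processed = true
where id in (
    select id
    from (
        select id,
               row_number() over (
                   partition by user_id
                   order by created desc
               ) as rk
        from entries
    ) as ranked
    where rk > 3
);
```

The ranking is computed once for the whole table instead of running a count per row, which is where the speed-up over the correlated subquery would be expected to come from.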