
How to do an update based on a count - SQL (postgres)

I have a table, let's call it 'entries', that looks like this (simplified):

id [pk]
user_id [fk]
created [date]
processed [boolean, default false]

and I want to create an UPDATE query which will set the processed flag to true on all entries except for the latest 3 for each user (latest in terms of the created column). So, for the following entries:

1,456,2009-06-01,false
2,456,2009-05-01,false
3,456,2009-04-01,false
4,456,2009-03-01,false

Only entry 4 would have its processed flag changed to true.
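
To experiment with this, the sample rows above can be loaded into a scratch table. This is only a minimal sketch: the column types are inferred from the simplified description, and the foreign key on user_id is left out because the table it refers to is not shown.

```sql
-- Scratch copy of the simplified 'entries' table (column types are assumptions).
create table entries (
  id        integer primary key,
  user_id   integer not null,   -- foreign key omitted: referenced table not shown
  created   date    not null,
  processed boolean not null default false
);

-- The four sample rows from the question, all for user 456.
insert into entries (id, user_id, created, processed) values
  (1, 456, '2009-06-01', false),
  (2, 456, '2009-05-01', false),
  (3, 456, '2009-04-01', false),
  (4, 456, '2009-03-01', false);
```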

Does anyone know how I can do this?

I don't know postgres, but this is standard SQL and may work for you.

update entries set
  processed = true
where (
  select count(*)
  from entries as E
  where E.user_id = entries.user_id
  and E.created > entries.created
) >= 3

In other words, update the processed column to true whenever there are three or more entries for the same user_id on later dates. I'm assuming the [created] column is unique for a given user_id. If not, you'll need an additional criterion to pin down what you mean by "latest".
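
If created is not unique per user, one possible extra criterion is the primary key. The sketch below assumes that, among entries sharing a created value, the one with the higher id counts as later. That ordering is an assumption, not something stated in the question.

```sql
update entries set
  processed = true
where (
  select count(*)
  from entries as E
  where E.user_id = entries.user_id
  and (
    E.created > entries.created
    -- tie-breaker (assumption): same date, higher id counts as later
    or (E.created = entries.created and E.id > entries.id)
  )
) >= 3;
```

With this condition every row has a strict position within its user, so exactly three rows per user are left untouched even when dates repeat.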

In SQL Server you can do this, which is a little easier to follow and will probably execute more efficiently:

with T(id, user_id, created, processed, rk) as (
  select
    id, user_id, created, processed,
    row_number() over (
      partition by user_id
      order by created desc, id
    )
  from entries
)
update T set
  processed = 1  -- SQL Server has no true literal; a bit column takes 1
where rk > 3;

Updating a CTE is a non-standard feature, and not all database systems support row_number.
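
PostgreSQL does not accept a CTE as the target of an update, but from 8.4 onward it does support row_number, so the same ranking can feed an ordinary update through a subquery. A sketch, assuming id is the primary key and keeping the same ordering as the query above:

```sql
update entries set
  processed = true
where id in (
  select id
  from (
    select
      id,
      row_number() over (
        partition by user_id
        order by created desc, id
      ) as rk
    from entries
  ) as ranked   -- 'ranked' is just an illustrative alias
  where rk > 3  -- everything past the three newest rows per user
);
```

The table is ranked in a single pass here, instead of being counted once per row.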

First, let's start with a query that will list all rows to be updated:

select e.id
from entries as e
where (
    select count(*)
    from entries as e2
    where e2.user_id = e.user_id
        and e2.created > e.created
) > 2

This lists the ids of all records for which more than 2 other records exist with the same user_id and a later created value.

That is, it will list all records except the latest 3 per user.

Now, we can:

update entries as e
set processed = true
where (
    select count(*)
    from entries as e2
    where e2.user_id = e.user_id
        and e2.created > e.created
) > 2;

One thing, though: it can be slow. In this case you might be better off with a custom aggregate or (if you're on 8.4) window functions.
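
Part of the cost comes from running the count once for every row of entries. An index on the columns used in the subquery may let each count be answered from a narrow index range. Whether the planner actually uses it depends on the data, so treat this as something to check with explain rather than a guaranteed fix. The index name below is made up for illustration.

```sql
-- Hypothetical index name; the columns match the correlated subquery's filter.
create index entries_user_id_created_idx
  on entries (user_id, created);

-- Show the plan without executing the update.
explain
update entries as e
set processed = true
where (
    select count(*)
    from entries as e2
    where e2.user_id = e.user_id
        and e2.created > e.created
) > 2;
```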
