简体   繁体   中英

Return all rows matching customer_id on the first row of the table

I can do two select queries:

  1. select the top row customer_id
  2. select all rows that match the above customer_id

But instead, I want to know if I can do the above in a single query?

I know it's possible using a subquery as in

SELECT * FROM customers
WHERE cust_id=(SELECT cust_id FROM customers LIMIT 1)

Is this an efficient way, or there is something better?

The requirement is to archive records that belong to a random cust_id or it might be correct for it to be the oldest cust_id (using aws lambda function in Python which runs periodically). Hence, I would like to fetch a cust_id and every other row with the same cust_id in a single transaction.

Using Postgres 10.5, with the DB table definition as below.

id BIGINT PRIMARY KEY,
cust_id VARCHAR(100) NOT NULL,
created TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
cust_data JSONB

Sample input

|id|cust_id|     cust_data          |     created        |
|--|-------|------------------------|--------------------|
|8 |cust9  |{"a": {"b": "c"}}       |25/11/2022  01:05:39|
|25|cust1  |{"x": "y"}              |05/12/2022  14:59:21|
|40|cust9  |{"d": {"b": {"c": "z"}}}|07/12/2022  11:29:14|
|87|cust2  |{"r": {"s": "t"}}       |13/12/2022  21:10:18|
|99|cust1  |{"p": "q"}              |20/12/2022  14:59:21|

Expected output

|id|cust_id|     cust_data          |     created        |
|--|-------|------------------------|--------------------|
|8 |cust9  |{"a": {"b": "c"}}       |25/11/2022  01:05:39|
|40|cust9  |{"d": {"b": {"c": "z"}}}|07/12/2022  11:29:14|

OR

Expected output

|id|cust_id|     cust_data          |     created        |
|--|-------|------------------------|--------------------|
|25|cust1  |{"x": "y"}              |05/12/2022  14:59:21|
|99|cust1  |{"p": "q"}              |20/12/2022  14:59:21|

OR

Expected output

|id|cust_id|     cust_data          |     created        |
|--|-------|------------------------|--------------------|
|87|cust2  |{"r": {"s": "t"}}       |13/12/2022  21:10:18|

During one query, we only want one of these expected outputs. Also need to mention that the created here is not the actual timestamp when this entry was created, as the entries made into this table are copied from another. So I think we can't decide which is the cust_id that has got the oldest entry.

What you are doing is correct.

Just make sure to use ORDER BY to ensure that the cust_id is not random.

SELECT * FROM customers
WHERE cust_id=(SELECT cust_id FROM customers ORDER BY DATE LIMIT 1)

Use the LIMIT and OFFSET clauses

SELECT * FROM customers
WHERE cust_id=(SELECT cust_id FROM customers LIMIT 1 OFFSET 0)

Or Use the MIN() function to get the minimum value of the cust_id column

SELECT * FROM customers
WHERE cust_id=(SELECT MIN(cust_id) FROM customers)

To get all rows for a single, arbitrary customer in a single query, WITH TIES in Postgres 13 or later should be most efficient:

SELECT *
FROM   customers
ORDER  BY cust_id
FETCH  FIRST 1 ROWS WITH TIES;

If the table is big, support this query with an index on (cust_id) .

See:

Postgres 10 has reach EOL in 2022 , so you need to upgrade ASAP anyway.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM