
A SQL puzzle (find the first occurrence of a column value)

Is there an easy way to find the first occurrence of a row that has a particular value in a column? For example, suppose I have these two tables:

Alphabet
A
B
C
D


Alphabet    Usage
A           Apple
B           Bat
D           Dog
A           Amateur
A           Arsenal
C           Cat
B           Ball
D           Drum

What would be an easy way to select everything in the first table and the first usage of it in the second table?

Expected Output:

Alphabet    Usage
A           Apple
B           Bat
C           Cat
D           Dog

You should be able to apply row_number(). However, row_number() requires an order to be specified.

This first example orders by usage. The problem is that the result will not follow the order in which the rows appear in the table; it will be in alphabetical order of usage:

select alphabet, usage
from
(
  select t1.alphabet,
    t2.usage,
    row_number() over(partition by t1.alphabet order by t2.usage) rn
  from table1 t1
  inner join table2 t2
    on t1.alphabet = t2.alphabet
) src
where rn =1


If you do not have a numeric id field to guarantee the order in which the rows were entered, you might be able to use:

select alphabet, usage
from
(
  select t1.alphabet,
    t2.usage,
    row_number() over(partition by t1.alphabet order by (select 1)) rn
  from table1 t1
  inner join table2 t2
    on t1.alphabet = t2.alphabet
) src
where rn =1


As @Aaron pointed out in the comments, the order is not guaranteed with this method and the behavior can change.

Ideally, you should have some type of column that allows you to distinguish the first occurrence of your data, e.g. a datetime, an id, etc. Since there is no inherent order on data in a table, you apply the order using order by.
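
As a minimal sketch of that advice, assume table2 had been created with an identity column that records insertion order. The column name id, the column types, and the one-row-at-a-time inserts are assumptions for illustration and are not part of the original tables. With such a column, row_number() can be ordered deterministically:

```sql
-- Hypothetical schema: the id column and the column types are assumed.
create table table1 (alphabet char(1) not null primary key);

create table table2 (
  id int identity(1,1) not null primary key,  -- grows with each insert
  alphabet char(1) not null,
  usage varchar(50) not null
);

insert into table1 (alphabet) values ('A');
insert into table1 (alphabet) values ('B');
insert into table1 (alphabet) values ('C');
insert into table1 (alphabet) values ('D');

-- One statement per row, so the id values follow the order of entry.
insert into table2 (alphabet, usage) values ('A', 'Apple');
insert into table2 (alphabet, usage) values ('B', 'Bat');
insert into table2 (alphabet, usage) values ('D', 'Dog');
insert into table2 (alphabet, usage) values ('A', 'Amateur');
insert into table2 (alphabet, usage) values ('A', 'Arsenal');
insert into table2 (alphabet, usage) values ('C', 'Cat');
insert into table2 (alphabet, usage) values ('B', 'Ball');
insert into table2 (alphabet, usage) values ('D', 'Drum');

-- Same query as before, but ordered by the id column
-- instead of usage or (select 1).
select alphabet, usage
from
(
  select t1.alphabet,
    t2.usage,
    row_number() over(partition by t1.alphabet order by t2.id) rn
  from table1 t1
  inner join table2 t2
    on t1.alphabet = t2.alphabet
) src
where rn = 1
order by alphabet;
```

With the sample data this picks the row with the lowest id per letter, i.e. Apple, Bat, Cat and Dog. If a letter in table1 might have no usage at all, a left join in place of the inner join would keep it in the result with a NULL usage.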

You cannot, unless you have an ordering on the second table. SQL Tables are inherently unordered, so you would need a column that specifies an insert time or an auto-incrementing id.

If you happen to be running SQL Server with no parallelism and the data is stored in a single file or if the data in the second table fits on one page, then the following will probably work (but no guarantees):

select au.*
from (select au.Alphabet, min(seqnum) as minseqnum
      from (select au.*, row_number() over (order by (select NULL)) as seqnum
            from AlphabetUsage au
           ) au
      group by au.Alphabet
     ) ausum join
     (select au.*, row_number() over (order by (select NULL)) as seqnum
      from AlphabetUsage au
     ) au
     on ausum.minseqnum = au.seqnum

In my experience on SQL Server, row_number() over (order by (select NULL)) assigns a row number without ordering the data. However, this is not documented and not guaranteed.

I highly, highly recommend that you add additional columns to the table, including an identity column to identify each row.
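
A minimal sketch of that recommendation, assuming the table is named AlphabetUsage as in the query above and that the new column is called id (a hypothetical name):

```sql
-- Assumed column name: id. Adds an auto-incrementing key to the existing table.
alter table AlphabetUsage
  add id int identity(1,1) not null;
```

Note that the values assigned to rows already in the table are not guaranteed to reflect the order in which they were originally inserted; only rows added afterwards get reliably increasing values. For existing data the original order cannot be recovered this way.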


 