Query optimization in Oracle SQL

Question

Let's say I have an Oracle database schema like so:
tournaments( id, name )
players( id, name )
gameinfo( id, pid (references players.id), tid (references tournaments.id), date )
So a row in the gameinfo table means that a certain player played a certain game in a tournament on a given date. tournaments has about 20 records, players about 160 000 and gameinfo about 2 million. I have to write a query which lists tournaments (with tid in the range 1-4) and, for each, the number of players that played their first game ever in that tournament.
I came up with the following query:

select tid, count(pid) from gameinfo g where g.date = (select min(g1.date) from gameinfo g1 where g1.pid = g.pid) and g.tid in (1,2,3,4) group by tid;

This is clearly suboptimal (it ran for about 58 minutes). I had another idea, that I could make a view of:

select pid, tid, min(date) from gameinfo where tid in(1,2,3,4) group by pid, tid;

And run my queries on this view, as it only had about 600 000 records, but this still seems less than optimal.

Can you give any advice on how this could be optimized?

Answer 1

My first recommendation is to try analytic functions. The row_number() function enumerates each player's games in date order, so a player's first game ever has a seqnum of 1:

select gi.*
from (select gi.*,
             row_number() over (partition by gi.pid order by gi.date) as seqnum
      from gameinfo gi
     ) gi
where tid in (1,2,3,4) and seqnum = 1
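
To get the per-tournament counts the question asks for, the same subquery can be aggregated. This is a minimal sketch, not a tested solution. It assumes the date column is really named game_date, which is a placeholder: DATE is a reserved word in Oracle, so a column cannot be called date without quoting. The aliases first_games and first_time_players are also made up for illustration.

```sql
-- Count, per tournament, the players whose first game ever was played there.
-- game_date is a placeholder for the real name of the date column.
select tid, count(*) as first_time_players
from (select gi.pid,
             gi.tid,
             row_number() over (partition by gi.pid order by gi.game_date) as seqnum
      from gameinfo gi
     ) first_games
where seqnum = 1            -- keep only each player's earliest game
  and tid in (1, 2, 3, 4)   -- filter after numbering, so "first" means first in any tournament
group by tid
order by tid;
```

The tid filter has to stay outside the subquery. If it were applied before row_number(), seqnum = 1 would mark a player's first game within tournaments 1-4 rather than their first game ever. Also note that row_number() picks one row arbitrarily when a player has several games on the same earliest date, so each player is counted once; rank() would keep all tied rows, which is closer to what the original min(date) query does.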

My second suggestion is to store the date of each player's first game in the players table, since it seems like important information for using the database.
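
A rough sketch of what that could look like, under several assumptions: first_game_date and first_tid are hypothetical column names (storing the tournament id alongside the date is my addition, so the count needs no join), tid is assumed to be numeric, and game_date again stands in for the real name of the date column.

```sql
-- Hypothetical columns holding each player's first game.
alter table players add (first_game_date date, first_tid number);

-- One-off backfill from gameinfo, using the same row_number() idea.
merge into players p
using (select pid, tid, game_date
       from (select gi.pid,
                    gi.tid,
                    gi.game_date,
                    row_number() over (partition by gi.pid order by gi.game_date) as seqnum
             from gameinfo gi)
       where seqnum = 1
      ) f
on (p.id = f.pid)
when matched then
  update set p.first_game_date = f.game_date,
             p.first_tid       = f.tid;

-- The report then only reads the players table (about 160 000 rows).
select first_tid as tid, count(*) as first_time_players
from players
where first_tid in (1, 2, 3, 4)
group by first_tid
order by first_tid;
```

The trade-off is that these columns are derived data: they have to be kept in sync whenever rows are added to or changed in gameinfo, for example by the code that inserts games.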

solution1
5 ACCEPTED 2013-04-27 23:41:38
