
Advice for advanced performance tuning - Beyond basic indexing

Summary

I'm planning on storing a list of license plates in a SQL Azure database with the following schema:

Schema

CREATE TABLE [dbo].[events](
    [id] [bigint] IDENTITY(1,1) NOT NULL,
    [dateTimeCreated] [datetime] NOT NULL,
    [registration] [varchar](14) NOT NULL
) ON [PRIMARY]

GO

SET ANSI_PADDING OFF
GO

ALTER TABLE [dbo].[events] ADD  CONSTRAINT [DF_events_dateTimeCreated]  DEFAULT (getdate()) FOR [dateTimeCreated]
GO

I can only think of running one query:

- Search for a registration within a given date/time range
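Concretely, the query would look something like this (parameter names are placeholders, not part of the original question):

SELECT [id], [dateTimeCreated], [registration]
FROM [dbo].[events]
WHERE [registration] = @registration
  AND [dateTimeCreated] >= @rangeStart   -- start of the date/time range
  AND [dateTimeCreated] < @rangeEnd;     -- end of the date/time range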

So far I can only think of creating a non-clustered index against dateTimeCreated and registration.
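That is, something like the following (the index name is my own):

-- Composite non-clustered index covering the date-range filter and the registration lookup
CREATE NONCLUSTERED INDEX [IX_events_dateTimeCreated_registration]
    ON [dbo].[events] ([dateTimeCreated], [registration]);
GO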

Question

There may end up being tens of millions of rows.

- What options (Azure-specific or not) are there for improving performance when the row count finally does increase greatly?
- Are there any guides on how query performance will degrade for a given number of rows?

You should definitely create a clustered index on dateTimeCreated. The registration column should also be indexed, but whether (and how) it should be indexed depends on the data: will your registrations have some sequence to them, or will they be random?

Key idea behind clustered indexes:

The only time the data rows in a table are stored in sorted order is when the table contains a clustered index.

This means that when you search on a clustered column whose values have orderable semantics (your dateTimeCreated column), the rows you want sit physically next to each other, so SQL Server does not have to fetch as many table pages to gather the necessary data.

Also (from the MSDN documentation):

Microsoft Azure SQL Database does not support tables without clustered indexes. A table must have a clustered index. If a table is created without a clustered constraint, a clustered index must be created before an insert operation is allowed on the table.
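A minimal sketch of this indexing approach (index names are my own, not from the answer):

-- Cluster on the date column so date-range scans read contiguous pages
CREATE CLUSTERED INDEX [CIX_events_dateTimeCreated]
    ON [dbo].[events] ([dateTimeCreated]);
GO

-- Separate non-clustered index to support equality lookups on registration
CREATE NONCLUSTERED INDEX [IX_events_registration]
    ON [dbo].[events] ([registration]);
GO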

I would make id the primary key (and the clustered index).
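In T-SQL that would be something like this (the constraint name is assumed):

-- Make id the primary key; a PRIMARY KEY constraint is clustered by default
ALTER TABLE [dbo].[events]
    ADD CONSTRAINT [PK_events] PRIMARY KEY CLUSTERED ([id]);
GO

Note that a table can have only one clustered index, so clustering on id is an alternative to clustering on dateTimeCreated as the other answer suggests.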

And why bigint?
int goes up to about 2 billion (about 4 billion if you also use the negative range).
A smaller key means not just less disk space but also more records cached in the same amount of memory.

count(*) will be O(n): twice as many records will take twice as long to count.
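If you only need a rough total row count, one common workaround (not part of the original answer) is to read it from metadata instead of scanning the table:

-- Approximate row count from metadata; avoids the O(n) scan of COUNT(*)
SELECT SUM([row_count]) AS [approx_rows]
FROM sys.dm_db_partition_stats
WHERE [object_id] = OBJECT_ID(N'dbo.events')
  AND [index_id] IN (0, 1); -- 0 = heap, 1 = clustered index
GO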

As for the other columns, create an index on them if you are going to search or sort on them.
