I have a table with stock prices. The schema looks like this:
In table 'data_1d' there is a unique index, ticker_timestamp, on ticker_id and timestamp, and a primary index on timestamp_id.
There are ~6.3M rows in 'data_1d'.
This query takes 4+ secs:
select * from data_1d where timestamp_id=1387 and open_close>20
Explain:
And it's 20-30 secs if I search for a range of timestamps.
If I search by only one criterion, timestamp or open_close, it takes 0.1-0.6 secs.
For example:
select * from data_1d where timestamp_id=1387
OR
select * from data_1d where and open_close>20
What can I do to improve the performance here?
Thanks.
EDIT: I didn't use statements to create the tables, but they should be understandable from the schema. These are the keys used in them:
tickers: primary key: id
timestamps_1d: primary key: id; unique index: timestamp
data_1d: ticker_id references tickers.id; timestamp_id references timestamps_1d.id; unique index (on 2 cols, ticker_id and timestamp_id): ticker_timestamps
Do not "normalize" continuous values, such as timestamps, floats, dates, etc.
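As a rough illustration of what storing the value directly could look like, here is a minimal sketch. The table name data_1d_denorm, the column name dt, the column types, and the date literals are all assumptions for illustration; only ticker_id and open_close come from the question.

```sql
-- Hypothetical sketch: the date is stored in the row itself
-- instead of a timestamp_id pointing into timestamps_1d.
CREATE TABLE data_1d_denorm (
    ticker_id  INT UNSIGNED  NOT NULL,  -- references tickers.id
    dt         DATE          NOT NULL,  -- assumed daily data; replaces timestamp_id
    open_close DECIMAL(10,2) NOT NULL,  -- type is an assumption
    PRIMARY KEY (ticker_id, dt),        -- one row per ticker per day
    INDEX (dt)                          -- supports filtering by date alone
);

-- A date range is then an ordinary range predicate on the table itself,
-- with no lookup or join against timestamps_1d.
SELECT *
FROM data_1d_denorm
WHERE dt >= '2020-01-01'
  AND dt <  '2020-02-01'
  AND open_close > 20;
```

Whether this helps the slow queries depends on the actual queries and indexes, which is why the sample queries matter.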
"And it's 20-30 secs if I search for a range of timestamps." -- show us the query. For that matter, to help you in making the right schema and indexes, let's see samples of all the important queries.
"unique index: timestamp" -- Surely you have two different stocks with the same timestamp??
There is at least one ticker that won't fit in DECIMAL(7,2): NYSE:BRKA. INT won't suffice for an occasional Volume.
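For reference, DECIMAL(7,2) holds at most 99999.99 and a signed INT holds at most 2147483647. A possible way to widen the columns is sketched below; the column names close_price and volume are hypothetical, since the actual schema is not shown in text, and the chosen widths are only one reasonable option.

```sql
-- Hypothetical column names; adjust to the real schema.
-- MODIFY takes the full column definition, so restate NULL/NOT NULL
-- and any defaults the existing columns already have.
ALTER TABLE data_1d
    MODIFY close_price DECIMAL(10,2)   NOT NULL,  -- up to 99999999.99
    MODIFY volume      BIGINT UNSIGNED NOT NULL;  -- well beyond the INT range
```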
select * from data_1d where and open_close>20 has a syntax error (a stray "and" right after "where").