My columns are:
job_name, job_date, job_details1, job_details2 ...
There are NO primary key columns.
In my table, I expect to have 15-20 distinct jobs. Each job has exactly 2 months of data, so 60 distinct job_date values per job_name, and each date has 100,000 records.
The query will always be a SELECT for ONE particular job_name and a range of job_date (followed by several GROUP BYs, but that's irrelevant for now). I don't want the query to scan irrelevant job_date or job_name values when it asks for one particular job_name and some range of job_date.
So what sort of optimizations can I do to make my SELECT query faster? I'm using MySQL 5.6.17, which has a limit of 8192 partitions per table.
Something like partitioning by job_name with subpartitions by job_date within that? This is the first time I'm dealing with such large data, so I'm not sure about these optimizations. Any help or tips will be appreciated.
Thanks
"Query will always be a SELECT for ONE particular job_name and a range of job_date (followed by several group bys, but that's irrelevant for now)." -- Based on that, you need
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
PRIMARY KEY(job_name, job_date, id),
INDEX(id)
ENGINE=InnoDB
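Spelled out as a complete statement, the fragment above might look like the following sketch. The table name job_data, the column types, and the literal values in the query are assumptions, since the question does not give them; only the key definitions and the engine come from the answer.

```sql
-- Sketch only: table name and column types are assumed, not from the question.
CREATE TABLE job_data (
    id           INT UNSIGNED NOT NULL AUTO_INCREMENT,
    job_name     VARCHAR(50)  NOT NULL,   -- assumed type
    job_date     DATE         NOT NULL,   -- assumed type
    job_details1 VARCHAR(100),            -- placeholder type
    job_details2 VARCHAR(100),            -- placeholder type
    PRIMARY KEY (job_name, job_date, id), -- clusters rows by job, then by date
    INDEX (id)                            -- AUTO_INCREMENT needs a key starting with id
) ENGINE=InnoDB;

-- The query shape from the question: one job_name, a range of job_date.
-- The job name, dates, and grouping column are illustrative.
EXPLAIN
SELECT job_details1, COUNT(*)
FROM job_data
WHERE job_name = 'job_a'
  AND job_date >= '2014-05-01'
  AND job_date <  '2014-06-01'
GROUP BY job_details1;
```

If the clustering works as intended, EXPLAIN should show the PRIMARY key being used, so only the rows for that job and date range are read.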
Notes:
The AUTO_INCREMENT id is added to the PK because a PK must be unique. (And the PK is needed for the clustering.)
INDEX(id) (or some key starting with id) is needed for AUTO_INCREMENT.
"... followed by group bys ..." That sounds like you are summarizing data for reports? If my suggestions above are not fast enough, let's talk about Summary Tables. You might get another factor of 10 speedup.
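As a rough illustration of the summary-table idea, the sketch below keeps one pre-aggregated row per job, date, and grouping value, so a report reads the small table instead of the detail rows. Everything named here is hypothetical: it assumes a detail table called job_data with the columns from the question, a summary table called job_summary, and job_details1 as the (non-NULL) grouping column, because the real GROUP BY columns are not given.

```sql
-- Hypothetical summary table: one row per (job_name, job_date, job_details1).
CREATE TABLE job_summary (
    job_name     VARCHAR(50)  NOT NULL,
    job_date     DATE         NOT NULL,
    job_details1 VARCHAR(100) NOT NULL,
    row_count    INT UNSIGNED NOT NULL,
    PRIMARY KEY (job_name, job_date, job_details1)
) ENGINE=InnoDB;

-- Load (or reload) one job's aggregates for one day from the detail table.
-- Assumes job_data.job_details1 contains no NULLs.
INSERT INTO job_summary (job_name, job_date, job_details1, row_count)
SELECT job_name, job_date, job_details1, COUNT(*)
FROM job_data
WHERE job_name = 'job_a'
  AND job_date = '2014-05-01'
GROUP BY job_name, job_date, job_details1
ON DUPLICATE KEY UPDATE row_count = VALUES(row_count);

-- Report query: sums the daily counts instead of counting detail rows.
SELECT job_details1, SUM(row_count) AS total_rows
FROM job_summary
WHERE job_name = 'job_a'
  AND job_date >= '2014-05-01'
  AND job_date <  '2014-06-01'
GROUP BY job_details1;
```

Counts and sums roll up cleanly this way; an average would need both a sum and a count stored per row so it can be recomputed over the range.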