简体   繁体   中英

datewise records sorting /fetching from large mySQL database

I have a separate table for every day's data which is basically webstats type : keywords, visits, duration, IP, sale, etc (maybe 100 bytes total per record) Each table will have around a couple of million records.

What I need to do is have a web admin so that the user/admin can view reports for different date periods AND sorted by certain calculated values. For example, the user may want the results for the 15th of last month to the 12th of this month , sorted by SALE/VISIT , descending order.

The admin/user only needs to view (say) the top 200 records at a time and will probably not view more than a few hundred total in any one session

Because of the arbitrary date period involved, I need to sum up the relevant columns for each record and only then can the selection be done.

My question is whether it will be possible to have the reports in real time or would they be too slow (the tables are not rarely - if ever - updated after the day's data has been inserted)

Is such a scenario better fitted to indexes or tablescans?

And also, whether a massive table for all dates would be better than having separate tables for each date (there are almost no joins)

thanks in advance!

With a separate table for each day's data, summarizing across a month is going to involve doing the same analysis on each of 30-odd tables. Over a year, you will have to do the analysis on 365 or so tables. That's going to be a nightmare.

It would almost certainly be better to have a soundly indexed single table than the huge number of tables. Some DBMS support fragmented tables - if MySQL does, fragment the single big table by the date. I would be inclined to fragment by month, especially if the normal queries are for one month or less and do not cross month boundaries. (Even if it involves two months, with decent fragment elimination, the query engine won't have to read most of the data; just the two fragments for the two months. It might be able to do those scans in parallel, even - again, depending on the DBMS.)

Sometimes, it is quicker to do sequential scans of a table than to do indexed lookups - don't simply assume that because the query plan involves a table scan that it will automatically be bad performing.

You may want to try a different approach. I think Splunk will work for you. It was designed for this, they even do ads on this site. They have a free version you can try.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM