
Best approach to select most viewed posts from last n hours

I'm using PHP and MySQL (InnoDB engine).

As the MySQL reference manual says, a query that filters on one column and orders by another can't use a single index for both operations.

I have a table named News.

This table has at least 1 million records with two important columns: time_added and number_of_views.

I need to select the most viewed records from the last n hours. What is the best index for this? Is it even possible to run this kind of query quickly on a table with millions of records?

I've already solved this for "last day": I can select the most viewed records from the last day by adding a new column (date_added). But if I want to select these records from the last week, I'm in trouble again.

First, write the query:

select n.*
from news n
where time_added >= date_sub(now(), interval <n> hour)
order by number_of_views desc
limit ??;

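The best index is (time_added, number_of_views). Strictly speaking, number_of_views won't help this particular query: time_added is matched with a range condition, so the index cannot also supply the ordering, and MySQL still has to sort the matching rows. I would include it anyway for other possible queries.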
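A minimal sketch of creating that index and checking how the query is executed. The index name idx_time_views, the 24-hour window, and the limit of 10 are placeholders chosen for illustration, not values from the question.

```sql
-- Hypothetical index name; the columns come from the question.
ALTER TABLE news
    ADD INDEX idx_time_views (time_added, number_of_views);

-- Inspect the plan. With a selective time window one would typically
-- expect range access on the index plus "Using filesort" in the Extra
-- column, because the ORDER BY column follows a range condition.
EXPLAIN
SELECT n.*
FROM news n
WHERE n.time_added >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
ORDER BY n.number_of_views DESC
LIMIT 10;
```

If the window covers a large share of the table, the optimizer may prefer a full table scan instead, so it is worth running EXPLAIN with realistic values of n.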

First, add the following lines to my.cnf (in the [mysqld] section):

query_cache_size = 32M (or more)
query_cache_limit = 32M (or more)

query_cache_size sets the size of the cache.

The other option to pay attention to is query_cache_limit: it sets the maximum size of a query result that can be stored in the cache. These settings apply to MySQL 5.7 and earlier; the query cache was removed in MySQL 8.0. You can check the status of the cache with the following statement:

show global status like 'Qcache%';

http://dev.mysql.com/doc/refman/5.7/en/mysql-indexes.html

If the table has a multiple-column index, any leftmost prefix of the index can be used by the optimizer to look up rows. For example, if you have a three-column index on (col1, col2, col3), you have indexed search capabilities on (col1), (col1, col2), and (col1, col2, col3). For more information, see http://dev.mysql.com/doc/refman/5.7/en/multiple-column-indexes.html
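Applied to this question, the leftmost-prefix rule might look like the sketch below. It assumes an index on news (time_added, number_of_views); the literal date and the threshold of 1000 are made up for illustration.

```sql
-- Assumes an index on news (time_added, number_of_views).

-- Can use the index for lookup: the condition is on the leftmost column.
SELECT COUNT(*) FROM news
WHERE time_added >= DATE_SUB(NOW(), INTERVAL 24 HOUR);

-- Can use both index columns: equality on the first, range on the second.
SELECT COUNT(*) FROM news
WHERE time_added = '2016-01-01 12:00:00'
  AND number_of_views > 1000;

-- Cannot use the index to look up rows: number_of_views alone is not a
-- leftmost prefix. At best the optimizer scans the whole index.
SELECT COUNT(*) FROM news
WHERE number_of_views > 1000;
```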

You need a summary table. Since 'hour' is your granularity, something like this might work:

CREATE TABLE HourlyViews (
    the_hour DATETIME NOT NULL,
    ct SMALLINT UNSIGNED NOT NULL,
    PRIMARY KEY(the_hour)
) ENGINE=InnoDB;

It might need another column (and add it to the PK) if there is some breakdown of the items you are counting. And you might want some other things SUM'd or COUNT'd in this table.

Build and maintain this table incrementally. That is, every hour, add another row to the table. (Or you could keep it updated with INSERT ... ON DUPLICATE KEY UPDATE ...)
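The incremental upkeep could be sketched as follows, assuming one statement is run per recorded view against the HourlyViews table defined above. Truncating NOW() to the hour with DATE_FORMAT is one possible choice, not something the answer prescribes.

```sql
-- Add one view to the current hour's bucket, creating the row if needed.
-- DATE_FORMAT truncates the current time to the start of the hour.
INSERT INTO HourlyViews (the_hour, ct)
VALUES (DATE_FORMAT(NOW(), '%Y-%m-%d %H:00:00'), 1)
ON DUPLICATE KEY UPDATE ct = ct + 1;
```

Note that ct is declared SMALLINT UNSIGNED, which tops out at 65535; a busier site would probably want INT UNSIGNED.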

More on Summary Tables

Then change the query to use that table; it will be a lot faster.
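As a rough sketch of what the rewritten query might look like, assume the breakdown column mentioned earlier is added so that views are counted per news item per hour. The table name HourlyNewsViews, the column news_id, the 24-hour window, and the limit of 10 are all hypothetical.

```sql
-- Hypothetical per-item variant of the summary table.
CREATE TABLE HourlyNewsViews (
    the_hour DATETIME NOT NULL,
    news_id INT UNSIGNED NOT NULL,
    ct INT UNSIGNED NOT NULL,
    PRIMARY KEY (the_hour, news_id)
) ENGINE=InnoDB;

-- Most viewed items over the last 24 hourly buckets.
-- The range on the_hour uses the leftmost column of the primary key.
SELECT news_id, SUM(ct) AS views
FROM HourlyNewsViews
WHERE the_hour >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
GROUP BY news_id
ORDER BY views DESC
LIMIT 10;
```

Because rows are bucketed by hour, the window is only accurate to hour granularity: the bucket that started just before the cutoff is excluded.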
