How should I query this in MySQL?

I have a web app in which I show a series of posts based on this table schema (there are thousands of rows like this, plus other columns that I have removed because they are not relevant to this question):

+---------+----------+----------+
|   ID    |   COL1   |   COL2   |
+---------+----------+----------+
|   1     |    NULL  |   ----   |
|   2     |    ---   |   NULL   |
|   3     |    NULL  |   ----   |
|   4     |    ---   |   NULL   |
|   5     |    NULL  |   NULL   |
|   6     |    ---   |   NULL   |
|   7     |    NULL  |   ----   |
|   8     |    ---   |   NULL   |
+---------+----------+----------+

I use this query:

SELECT * FROM `TABLE` WHERE `COL1` IS NOT NULL AND `COL2` IS NULL ORDER BY `COL1`;

The result set I get looks like this:

+---------+----------+----------+
|   ID    |   COL1   |   COL2   |
+---------+----------+----------+
|   12    |    ---   |   NULL   |
|   1     |    ---   |   NULL   |
|   6     |    ---   |   NULL   |
|   8     |    ---   |   NULL   |
|  11     |    ---   |   NULL   |
|  13     |    ---   |   NULL   |
|   5     |    ---   |   NULL   |
|   9     |    ---   |   NULL   |
|   17    |    ---   |   NULL   |
|   21    |    ---   |   NULL   |
|   23    |    ---   |   NULL   |
|   4     |    ---   |   NULL   |
|   32    |    ---   |   NULL   |
|   58    |    ---   |   NULL   |
|   61    |    ---   |   NULL   |
|   43    |    ---   |   NULL   |
+---------+----------+----------+

Notice that the ID column is jumbled because of the ORDER BY clause.

I have proper indexes to optimize these queries. Now, let me explain the real problem. I have a lazy-load kind of functionality in my web app, so I display around 10 posts per page by appending LIMIT 10 to the query for the first page.

That works fine so far. The real problem comes when I have to load the second page. What do I query now? I do not want posts to be repeated. New posts arrive almost every 15 seconds and go to the top of the result set (by top I literally mean the first row). I do not want to display these latest posts on the second or third pages, but they change the size of the result set, so I cannot use LIMIT 10,10 for the second page and so on, because posts would be repeated.

Now, all I know is the ID of the last post that I displayed, say 21 here. So I want to display the posts with IDs 23, 4, 32, 58, 61, 43 (refer to the result set table above). Do I load all the rows without a LIMIT clause and display the 10 IDs occurring after ID 21? For that I would have to iterate over thousands of useless rows. But I cannot use a LIMIT clause for the 2nd, 3rd, ... pages, that is for sure. Also, the IDs are jumbled, so I definitely cannot use WHERE ID > ... . So, where do we go now?

I'm not sure if I've understood your question correctly, but here's how I think I would do it:

  • Add a timestamp column to your table; let's call it date_added.
  • When displaying the first page, use your query as-is (with LIMIT 10) and hang on to the timestamp of the most recent record; let's call it last_date_added.
  • For the 2nd, 3rd and subsequent pages, modify your query to filter out all records with date_added > last_date_added, and use LIMIT 10, 10, then LIMIT 20, 10, LIMIT 30, 10 and so on.

This will have the effect of freezing your resultset in time, and resetting it every time the first page is accessed.

Notes:

  • Depending on the ordering of your result set, you might need a separate query to obtain last_date_added. Alternatively, you could just cut off at the current time, i.e. the time when the first page was accessed.
  • If your IDs are sequential, you could use the same trick with the ID.
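
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: Python's built-in sqlite3 module with an in-memory database stands in for MySQL, the table name posts, the lowercase column names, the helper fetch_page and all the sample data are made up, and date_added is a plain integer counter rather than a real timestamp.

```python
import sqlite3

PAGE_SIZE = 10

def fetch_page(conn, page, cutoff):
    """Return post IDs for a 0-based page, ignoring rows added after `cutoff`."""
    rows = conn.execute(
        "SELECT id FROM posts"
        " WHERE col1 IS NOT NULL AND col2 IS NULL AND date_added <= ?"
        " ORDER BY col1 LIMIT ? OFFSET ?",
        (cutoff, PAGE_SIZE, page * PAGE_SIZE),
    ).fetchall()
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT,"
    " date_added INTEGER)"
)
# 25 hypothetical posts; col1 is an opaque sort key, date_added a counter.
conn.executemany(
    "INSERT INTO posts (col1, col2, date_added) VALUES (?, NULL, ?)",
    [(f"k{i:03d}", i) for i in range(100, 125)],
)

# First page: remember the newest date_added that exists right now.
last_date_added = conn.execute("SELECT MAX(date_added) FROM posts").fetchone()[0]
page0 = fetch_page(conn, 0, last_date_added)

# A new post arrives and sorts to the top of the live result set.
conn.execute("INSERT INTO posts (col1, col2, date_added) VALUES ('k000', NULL, 200)")

# The second page still continues where the first one stopped.
page1 = fetch_page(conn, 1, last_date_added)
print(page0, page1)
```

Because the new row has date_added greater than the remembered cutoff, it is filtered out before the offset is applied, so the offsets keep pointing at the same rows.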

Hmm, I thought about it for a while and came up with two solutions:

  1. Store the IDs of the posts already displayed and query WHERE ID NOT IN (id1, id2, ...). But that would cost you extra memory. And if the user loads 100 pages and the IDs are in the 100000s, a single GET request would not be able to carry the list, at least not in all browsers. A POST request can be used instead.

  2. Alter the way you display posts from COL1. I don't know if this would suit you, but it can save you bandwidth and make your code cleaner. I would suggest this: SELECT * FROM TABLE WHERE COL1 IS NOT NULL AND COL2 IS NULL AND ID > .. ORDER BY ID DESC LIMIT 10,10. This can considerably change the order in which your posts are displayed. But since you said in your comments that you check whether a post meets a criterion and then change COL1 from NULL to the current timestamp, I guess that the newer a post is, the higher up you want to display it. It's just an idea.
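
Option 1 above (remembering the IDs already displayed) might look something like this. It is a sketch, not a definitive implementation: Python's sqlite3 module with an in-memory database stands in for MySQL, and the table name posts, the lowercase column names, the helper fetch_unseen and the sample rows are all hypothetical.

```python
import sqlite3

def fetch_unseen(conn, seen_ids, page_size=10):
    """Return the next page of post IDs, skipping every ID in `seen_ids`."""
    sql = "SELECT id FROM posts WHERE col1 IS NOT NULL AND col2 IS NULL"
    if seen_ids:
        # One bound parameter per remembered ID.
        placeholders = ",".join("?" * len(seen_ids))
        sql += f" AND id NOT IN ({placeholders})"
    sql += " ORDER BY col1 LIMIT ?"
    rows = conn.execute(sql, (*seen_ids, page_size)).fetchall()
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)")
# 25 hypothetical posts; col1 is an opaque sort key.
conn.executemany(
    "INSERT INTO posts (col1, col2) VALUES (?, NULL)",
    [(f"k{i:03d}",) for i in range(25)],
)

page_a = fetch_unseen(conn, [])

# A new post (it gets ID 26) arrives and sorts to the top.
conn.execute("INSERT INTO posts (col1, col2) VALUES ('a', NULL)")

page_b = fetch_unseen(conn, page_a)
print(page_a, page_b)
```

No ID is repeated between the two pages, but the unseen new post (ID 26 in this sample) appears at the start of the second page, because nothing in the NOT IN filter excludes it.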

I assume new posts will be added with a higher ID than the current max ID, right? So couldn't you just run your query and grab the current max ID? Then, when you query for page 2, run the same query but with "ID <= max_id". This should give you the same result set as your page 1 query, because any new rows will have ID > max_id. Hope that helps.
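
A rough sketch of this idea, assuming (as the answer does) that every new post gets an ID above the current maximum. Python's sqlite3 module with an in-memory database stands in for MySQL; the table name posts, the lowercase column names, the helper fetch_page and the sample data are hypothetical. The filter uses id <= max_id so that the row holding the maximum ID itself stays in the frozen set.

```python
import sqlite3

def fetch_page(conn, page, max_id, page_size=10):
    """Return post IDs for a 0-based page, ignoring rows with id > max_id."""
    rows = conn.execute(
        "SELECT id FROM posts"
        " WHERE col1 IS NOT NULL AND col2 IS NULL AND id <= ?"
        " ORDER BY col1 LIMIT ? OFFSET ?",
        (max_id, page_size, page * page_size),
    ).fetchall()
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)")
# 25 hypothetical posts whose col1 order does not follow the ID order.
conn.executemany(
    "INSERT INTO posts (col1, col2) VALUES (?, NULL)",
    [(f"k{(i * 7) % 25:03d}",) for i in range(25)],
)

# Taken when the first page is requested.
max_id = conn.execute("SELECT MAX(id) FROM posts").fetchone()[0]
frozen = [r[0] for r in conn.execute("SELECT id FROM posts ORDER BY col1")]
first_page = fetch_page(conn, 0, max_id)

# A new post (ID 26) arrives and sorts to the top of the live result set.
conn.execute("INSERT INTO posts (col1, col2) VALUES ('a', NULL)")

second_page = fetch_page(conn, 1, max_id)
print(first_page, second_page)
```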

How about:

ORDER BY `COL1`,`ID`;

This would always put rows that share the same COL1 value in ID order, so the overall ordering is deterministic. This will let you use:

LIMIT 10,10

for your second page.
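
Once the ordering is deterministic, the (COL1, ID) pair of the last displayed row can also serve as a cursor. Note that this goes one step beyond the answer: it is a keyset (seek) variant rather than LIMIT 10,10, shown here only as a possible sketch. Python's sqlite3 module with an in-memory database stands in for MySQL, and the table name posts, the lowercase column names, the helpers first_page and page_after and the sample data are all hypothetical.

```python
import sqlite3

BASE = "SELECT id, col1 FROM posts WHERE col1 IS NOT NULL AND col2 IS NULL"

def first_page(conn, page_size=10):
    """First page in the deterministic (col1, id) order."""
    return conn.execute(BASE + " ORDER BY col1, id LIMIT ?", (page_size,)).fetchall()

def page_after(conn, last_col1, last_id, page_size=10):
    """Rows that sort strictly after the (col1, id) pair of the last shown row."""
    return conn.execute(
        BASE + " AND (col1 > ? OR (col1 = ? AND id > ?))"
        " ORDER BY col1, id LIMIT ?",
        (last_col1, last_col1, last_id, page_size),
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)")
# Hypothetical data: consecutive rows share a col1 value, so ties exist.
conn.executemany(
    "INSERT INTO posts (col1, col2) VALUES (?, NULL)",
    [(f"k{i // 2:03d}",) for i in range(25)],
)

page1 = first_page(conn)
last_id, last_col1 = page1[-1]

# A new post arrives and sorts before everything already shown.
conn.execute("INSERT INTO posts (col1, col2) VALUES ('j999', NULL)")

page2 = page_after(conn, last_col1, last_id)
print(page1, page2)
```

Rows inserted ahead of the cursor do not shift the following pages, because the second query seeks by value instead of counting an offset.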
