How to Make This SQL Query More Efficient?

I'm not sure how to make the following SQL query more efficient. Right now, the query takes 8 to 12 seconds on a pretty fast server, which is nowhere near fast enough for a website where users are waiting for a page to load. It searches tables with many rows; for instance, the "Post" table has 717,873 rows. Basically, the query lists all Posts related to what the user is following (newest to oldest).

Is there a way to make it faster by only getting the last 20 results total based on PostTimeOrder?

Any help or insight on what can be done to improve this situation would be much appreciated. Thank you.

Here's the full SQL query (lots of nesting):

SELECT DISTINCT p.Id, UNIX_TIMESTAMP(p.PostCreationTime) AS PostCreationTime, p.Content AS Content, p.Bu AS Bu, p.Se AS Se, UNIX_TIMESTAMP(p.PostCreationTime) AS PostTimeOrder
                FROM Post p 
                WHERE (p.Id IN (SELECT pc.PostId
                                FROM PostCreator pc
                                WHERE (pc.UserId IN (SELECT uf.FollowedId
                                                    FROM UserFollowing uf
                                                    WHERE uf.FollowingId = '100')
                                                    OR pc.UserId = '100')
                                ))
                OR (p.Id IN (SELECT pum.PostId
                            FROM PostUserMentions pum
                            WHERE (pum.UserId IN (SELECT uf.FollowedId
                                                    FROM UserFollowing uf
                                                    WHERE uf.FollowingId = '100')
                                                    OR pum.UserId = '100')
                            ))  
                OR (p.Id IN (SELECT ssp.PostId
                                FROM SStreamPost ssp
                                WHERE (ssp.SStreamId IN (SELECT ssf.SStreamId
                                                    FROM SStreamFollowing ssf
                                                    WHERE ssf.UserId = '100'))
                                ))
                OR (p.Id IN (SELECT psm.PostId
                                FROM PostSMentions psm
                                WHERE (psm.StockId IN (SELECT sf.StockId
                                                    FROM StockFollowing sf
                                                    WHERE sf.UserId = '100' ))
                                ))



            UNION ALL
            SELECT DISTINCT p.Id AS Id, UNIX_TIMESTAMP(p.PostCreationTime) AS PostCreationTime, p.Content AS Content, p.Bu AS Bu, p.Se AS Se, UNIX_TIMESTAMP(upe.PostEchoTime) AS PostTimeOrder                 
                FROM Post p
                INNER JOIN UserPostE upe 
                    on p.Id = upe.PostId 
                INNER JOIN UserFollowing uf 
                    on (upe.UserId = uf.FollowedId AND (uf.FollowingId = '100' OR upe.UserId = '100'))  
            ORDER BY PostTimeOrder DESC;    

Changing your p.Id IN (...) predicates to EXISTS predicates with correlated subqueries may help. Also, since both halves of your UNION ALL query pull from the Post table and may return nearly identical records, you might be able to combine the two into one query by left outer joining to UserPostE and adding upe.PostId IS NOT NULL as an OR condition in the WHERE clause. UserFollowing will still inner join to UserPostE. If you want the same Post record twice, once with upe.PostEchoTime and once with p.PostCreationTime as the PostTimeOrder, you'll need to keep the UNION ALL.

SELECT 
       DISTINCT -- <<=- May not be needed
       p.Id
     , UNIX_TIMESTAMP(p.PostCreationTime) AS PostCreationTime
     , p.Content AS Content
     , p.Bu AS Bu
     , p.Se AS Se
     , UNIX_TIMESTAMP(coalesce( upe.PostEchoTime
                              , p.PostCreationTime)) AS PostTimeOrder
  FROM Post p 
  LEFT JOIN UserPostE upe 
       INNER JOIN UserFollowing uf 
          on (upe.UserId = uf.FollowedId AND
             (uf.FollowingId = '100' OR
              upe.UserId = '100'))
    on p.Id = upe.PostId 
 WHERE upe.PostID is not null
    or exists (SELECT 1
                 FROM PostCreator pc
                WHERE pc.PostId = p.ID
                  -- parentheses keep the OR inside the correlation on PostId
                  and ( pc.UserId = '100'
                     or exists (SELECT 1
                                  FROM UserFollowing uf
                                 WHERE uf.FollowedId = pc.UserID
                                   and uf.FollowingId = '100'))
              )
    OR exists (SELECT 1
                 FROM PostUserMentions pum
                WHERE pum.PostId = p.ID
                  -- parentheses keep the OR inside the correlation on PostId
                  and ( pum.UserId = '100'
                     or exists (SELECT 1
                                  FROM UserFollowing uf
                                 WHERE uf.FollowedId = pum.UserId
                                   and uf.FollowingId = '100'))
              )
    OR exists (SELECT 1
                 FROM SStreamPost ssp
                WHERE ssp.PostId = p.ID
                  and exists (SELECT 1
                                FROM SStreamFollowing ssf
                               WHERE ssf.SStreamId = ssp.SStreamId
                                 and ssf.UserId = '100')
              )
    OR exists (SELECT 1
                 FROM PostSMentions psm
                WHERE psm.PostId = p.ID
                  and exists (SELECT 1
                                FROM StockFollowing sf
                               WHERE sf.StockId = psm.StockId
                                 and sf.UserId = '100' )
              )
 ORDER BY PostTimeOrder DESC

The FROM clause could alternatively be rewritten to use an EXISTS predicate with a correlated subquery as well:

  FROM Post p 
  LEFT JOIN UserPostE upe 
    on p.Id = upe.PostId 
   and ( upe.UserId = '100'
      or exists (select 1
                   from UserFollowing uf
                  where uf.FollowedId = upe.UserId
                    and uf.FollowingId = '100'))

Example of rewriting IN as a JOIN:

SELECT  ssp.PostId
    FROM  SStreamPost ssp
    WHERE  (ssp.SStreamId IN (
                SELECT  ssf.SStreamId
                    FROM  SStreamFollowing ssf
                    WHERE  ssf.UserId = '100' ))

-->

SELECT  ssp.PostId
    FROM  SStreamPost ssp
    JOIN  SStreamFollowing ssf  ON ssp.SStreamId = ssf.SStreamId
    WHERE  ssf.UserId = '100'

The big WHERE with all the INs becomes something like

JOIN ( ( SELECT pc.PostId AS id ... )
 UNION ( SELECT pum.PostId ... )
 UNION ( SELECT ssp.PostId ... )
 UNION ( SELECT psm.PostId ... ) )
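
Filled in with the tables from the question, that derived-table idea might look roughly like the sketch below. It is untested against the real schema and rests on a few assumptions: the alias ids is made up here, Post.Id is taken to be unique, the echo branch from the second half of the original UNION ALL is left out, and the LIMIT 20 is there only because the question asks for the newest 20 rows.

```sql
-- Sketch only: collect the matching post ids first, then fetch the posts.
-- UNION (without ALL) removes duplicate ids, so no DISTINCT is needed
-- on the outer query as long as Post.Id is unique (assumed).
SELECT p.Id,
       UNIX_TIMESTAMP(p.PostCreationTime) AS PostCreationTime,
       p.Content, p.Bu, p.Se,
       UNIX_TIMESTAMP(p.PostCreationTime) AS PostTimeOrder
  FROM Post p
  JOIN (
        -- posts created by user 100 or by anyone user 100 follows
        SELECT pc.PostId AS Id
          FROM PostCreator pc
         WHERE pc.UserId = '100'
            OR pc.UserId IN (SELECT uf.FollowedId
                               FROM UserFollowing uf
                              WHERE uf.FollowingId = '100')
        UNION
        -- posts mentioning user 100 or anyone user 100 follows
        SELECT pum.PostId
          FROM PostUserMentions pum
         WHERE pum.UserId = '100'
            OR pum.UserId IN (SELECT uf.FollowedId
                                FROM UserFollowing uf
                               WHERE uf.FollowingId = '100')
        UNION
        -- posts in streams user 100 follows
        SELECT ssp.PostId
          FROM SStreamPost ssp
          JOIN SStreamFollowing ssf ON ssf.SStreamId = ssp.SStreamId
         WHERE ssf.UserId = '100'
        UNION
        -- posts mentioning stocks user 100 follows
        SELECT psm.PostId
          FROM PostSMentions psm
          JOIN StockFollowing sf ON sf.StockId = psm.StockId
         WHERE sf.UserId = '100'
       ) ids ON ids.Id = p.Id   -- "ids" is a hypothetical alias
 ORDER BY p.PostCreationTime DESC
 LIMIT 20;
```

Whether this is actually faster depends on the indexes on each table, so it is worth comparing EXPLAIN output before and after.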

Apply as many of those suggestions as you can, then come back for more advice if you still need it. And bring SHOW CREATE TABLE with you.
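
For reference, gathering that information could look like the following. SHOW CREATE TABLE and EXPLAIN are standard MySQL statements; which tables and which query to run them on is only a suggestion based on the names in the question.

```sql
-- Table definitions, including any existing indexes
SHOW CREATE TABLE Post;
SHOW CREATE TABLE PostCreator;
SHOW CREATE TABLE UserFollowing;

-- Execution plan for one of the rewritten pieces
EXPLAIN
SELECT ssp.PostId
  FROM SStreamPost ssp
  JOIN SStreamFollowing ssf ON ssp.SStreamId = ssf.SStreamId
 WHERE ssf.UserId = '100';
```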
