How to search a table with 80 million records faster?

I have a table with about 80 million records, and I want to find all the activities of the lists and workspaces that a user has access to. So first I get the ids of those lists and workspaces, and then I run the following query:

select *, COALESCE("origin_created_at", "created_at") AS "created_at",
  COALESCE("updated_at", "origin_updated_at") AS "updated_at" 
from "activities" 
where ("listId" in (310,214088,219,220,271,222,28434,36046,43233,38236,
  1014787,1017501,1065915,162,399844,399845,395721,824491,400,405,408,
  395873,36,188,178,120,461,1104,27341,27356,83329,29271,158639,482197,
  587679,841589,722320,551,170392,421035,197071,632736,632742,632755,
  632758,673517,155,1231,2691,2695,9092,13783,24273,45765,57909,57938,
  58323,291171,324525,496,5369,54099,54576,98818,569319,1434677,279,
  158821,127,158197,50301,761351,261,438101,159009,643013,158273,58557,
  643867,356252,631758,299145,131,179,156,661,241,260,281,245,438106,
  886,101,72915,90857,144564,166270,230,178981,195046,208561,382159,
  226599,297964,298318,89043,193559,326394,313589,450540,541359,620442,
  323458,628644,643014,261008,650332,689117,847849,672369,932660,382843,
  267000,826590,642775,400339,642875,1282788,1341992,1411789,1515479,
  74018) 
 or "workspaceId" in (137, 81, 111, 424284, 425935, 430658, 84, 163840, 
  3, 4, 281105, 57, 64642, 96660, 38739, 273574, 295312, 79, 213, 
  240478, 424760, 65, 36989)) 
and (("isBulk" = false or "activities"."type" = 0) 
       and "activities"."deprecated_at" is null) 
order by COALESCE("origin_created_at", "created_at") DESC, "id" desc
limit 40;

This is the execution plan:

 Limit  (cost=2446886.55..2446886.65 rows=40 width=1002) (actual time=44452.393..44452.418 rows=40 loops=1)
   ->  Sort  (cost=2446886.55..2449439.67 rows=1021250 width=1002) (actual time=44452.391..44452.401 rows=40 loops=1)
         Sort Key: (COALESCE(origin_created_at, created_at)) DESC, id DESC
         Sort Method: top-N heapsort  Memory: 37kB
         ->  Bitmap Heap Scan on activities  (cost=37546.04..2414605.20 rows=1021250 width=1002) (actual time=1043.663..43916.385 rows=568891 loops=1)
               Recheck Cond: (("listId" = ANY ('{310,214088,219,220,271,222,28434,36046,43233,38236,1014787,1017501,1065915,162,399844,399845,395721,824491,400,405,408,395873,36,188,178,120,461,1104,27341,27356,83329,29271,158639,482197,587679,841589,722320,551,170392,421035,197071,632736,632742,632755,632758,673517,155,1231,2691,2695,9092,13783,24273,45765,57909,57938,58323,291171,324525,496,5369,54099,54576,98818,569319,1434677,279,158821,127,158197,50301,761351,261,438101,159009,643013,158273,58557,643867,356252,631758,299145,131,179,156,661,241,260,281,245,438106,886,101,72915,90857,144564,166270,230,178981,195046,208561,382159,226599,297964,298318,89043,193559,326394,313589,450540,541359,620442,323458,628644,643014,261008,650332,689117,847849,672369,932660,382843,267000,826590,642775,400339,642875,1282788,1341992,1411789,1515479,74018}'::integer[])) OR ("workspaceId" = ANY ('{137,81,111,424284,425935,430658,84,163840,3,4,281105,57,64642,96660,38739,273574,295312,79,213,240478,424760,65,36989}'::integer[])))
               Rows Removed by Index Recheck: 9072392
               Filter: ((deprecated_at IS NULL) AND ((NOT "isBulk") OR (type = 0)))
               Rows Removed by Filter: 113630
               Heap Blocks: exact=41259 lossy=271838
               ->  BitmapOr  (cost=37546.04..37546.04 rows=1350377 width=0) (actual time=1032.769..1032.769 rows=0 loops=1)
                     ->  Bitmap Index Scan on activities_list_id_index  (cost=0.00..17333.10 rows=617933 width=0) (actual time=118.412..118.412 rows=507019 loops=1)
                           Index Cond: ("listId" = ANY ('{310,214088,219,220,271,222,28434,36046,43233,38236,1014787,1017501,1065915,162,399844,399845,395721,824491,400,405,408,395873,36,188,178,120,461,1104,27341,27356,83329,29271,158639,482197,587679,841589,722320,551,170392,421035,197071,632736,632742,632755,632758,673517,155,1231,2691,2695,9092,13783,24273,45765,57909,57938,58323,291171,324525,496,5369,54099,54576,98818,569319,1434677,279,158821,127,158197,50301,761351,261,438101,159009,643013,158273,58557,643867,356252,631758,299145,131,179,156,661,241,260,281,245,438106,886,101,72915,90857,144564,166270,230,178981,195046,208561,382159,226599,297964,298318,89043,193559,326394,313589,450540,541359,620442,323458,628644,643014,261008,650332,689117,847849,672369,932660,382843,267000,826590,642775,400339,642875,1282788,1341992,1411789,1515479,74018}'::integer[]))
                     ->  Bitmap Index Scan on activities_workspace_id_index  (cost=0.00..19702.32 rows=732444 width=0) (actual time=914.355..914.355 rows=682628 loops=1)
                           Index Cond: ("workspaceId" = ANY ('{137,81,111,424284,425935,430658,84,163840,3,4,281105,57,64642,96660,38739,273574,295312,79,213,240478,424760,65,36989}'::integer[]))
 Planning time: 2.882 ms
 Execution time: 44452.871 ms
(17 rows)

As the plan shows, PostgreSQL uses a "Bitmap Heap Scan" on activities, which makes the query slow even though both columns are indexed. In total, the table has four indexes, one on each of the following columns: type, listId, workspaceId, organizationId.

How can I make the query faster? Or is there a better way to rewrite the query?

> As stated in the plan PostgreSQL uses "Bitmap Heap Scan" to scan the activities which makes the query slower although both columns are indexed.

The query is using both of those indexes. The bitmap that guides the heap scan is built from them and combined by the BitmapOr node.

One possible culprit is here:

Rows Removed by Index Recheck: 9072392
Heap Blocks: exact=41259 lossy=271838

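Lossy blocks mean the bitmap did not fit in work_mem, so PostgreSQL tracked whole pages instead of individual rows and had to recheck every row on those 271,838 pages, discarding about 9 million of them. Increase work_mem until the lossy blocks go away. If the real cost is the time to read blocks from disk, though, that probably won't help.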
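
A minimal sketch of that experiment, run in a single session so that nothing changes server-wide. The 256MB value is only an illustrative starting point, not a tuned recommendation, and the IN lists are abbreviated here for readability; substitute the full lists from the question. Changing track_io_timing may require elevated privileges on your server.

```sql
-- Raise work_mem for this session only. 256MB is an assumed starting value;
-- keep increasing it until "lossy=" disappears from the Heap Blocks line.
SET work_mem = '256MB';

-- Optional: have EXPLAIN report time spent reading blocks.
-- This setting may require superuser privileges.
SET track_io_timing = on;

EXPLAIN (ANALYZE, BUFFERS)
SELECT *, COALESCE("origin_created_at", "created_at") AS "created_at",
       COALESCE("updated_at", "origin_updated_at") AS "updated_at"
FROM "activities"
WHERE ("listId" IN (310, 214088, 219)        -- abbreviated; use the full list
       OR "workspaceId" IN (137, 81, 111))   -- abbreviated; use the full list
  AND (("isBulk" = false OR "activities"."type" = 0)
       AND "activities"."deprecated_at" IS NULL)
ORDER BY COALESCE("origin_created_at", "created_at") DESC, "id" DESC
LIMIT 40;

-- Return to the server default afterwards.
RESET work_mem;
```

In the new plan, compare the "Heap Blocks" and "Rows Removed by Index Recheck" lines with the original. If the lossy blocks are gone but the "Buffers" line still shows a large number of blocks read and the I/O timings account for most of the runtime, then disk reads are the bottleneck and more work_mem alone will not fix it.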
