I use (PostgreSQL) 11.8. And I try to provide full text search opportunity by some column. For that I created GIN index with with multiple fields and coalesce. And after my data base growed to 344747 rows in table products I faced with slow executin ny query.
create index ndsprc_swedish_custom_index on products
using GIN(to_tsvector('pg_catalog.swedish',coalesce(name,'')||' '||coalesce(description,'')||' '||coalesce(sku,'')||' '||coalesce(price,0)||' '||coalesce(category,'')||' '||coalesce(brand,'')))
After I to ANALYZE query, I did not found my GIN index. It should be in explain part, how to check this index works or not? And in Navicat index looks like broken or not valid, but maybe I mistake
I excpected something like this: Index Cond: and some information about my how my index used
Mybe I was not correct created this index or something like that. This is my query example. And After I changed count of column in index
to_tsvector('pg_catalog.swedish',products_alias.name||products_alias.price)
execute time down, to 2.546s
but looks like my query not use index
my query:
EXPLAIN ANALYZE
SELECT
products_alias.id,
products_alias.sku,
products_alias.name AS "name",
products_alias.description,
products_alias.category,
products_alias.price,
products_alias.shipping,
products_alias.currency,
products_alias.instock,
products_alias.product_url AS "productUrl",
products_alias.image_url AS "imageUrl",
products_alias.tracking_url AS "trackingUrl",
products_alias.brand,
products_alias.shop,
products_alias.original_price AS "originalPrice",
products_alias.ean,
products_alias.manufacturer_article_number AS "manufacturerArticleNumber",
products_alias.extras,
products_alias.created_at AS "createdAt",
products_alias.brand_relation_id AS "brandRelationId",
products_alias.shop_relation_id AS "shopRelationId",
array_agg(DISTINCT cpt.category_id) AS categoryIds,
COUNT(DISTINCT uip.id) as "numberOfEntries",
ts_rank_cd(to_tsvector('pg_catalog.swedish',coalesce(name,'')||' '||coalesce(description,'')||' '||coalesce(sku,'')||' '||coalesce(price,0)||' '||coalesce(category,'')||' '||coalesce(brand,'')||' '||coalesce(shop,'')), query_search) AS rank
FROM products products_alias
JOIN to_tsquery('pg_catalog.swedish', 'Evy&bodystocking&ns:*|23.70:*|ebbe:*|BABYKLÄDER:*') query_search
ON to_tsvector('pg_catalog.swedish',coalesce(name,'')||' '||coalesce(description,'')||' '||coalesce(sku,'')||' '||coalesce(price,0)||' '||coalesce(category,'')||' '||coalesce(brand,'')||' '||coalesce(shop,'')) @@ query_search
LEFT JOIN product_category cp on cp.product_id = products_alias.id
LEFT JOIN product_category cpt on cpt.product_id = products_alias.id
LEFT JOIN user_ip_product uip on uip.products_id = products_alias.id
WHERE products_alias.id NOT IN (720253)
GROUP BY products_alias.id, query_search.query_search ORDER BY rank DESC
LIMIT 50
and what I had, the most important part - Planning Time: 7.567 ms, Execution Time: 12162.804 ms. Or without LIMIT 50
Planning Time: 1.359 ms, Execution Time: 12210.245 ms.
Limit (cost=95625.56..95625.69 rows=50 width=963) (actual time=12159.833..12159.841 rows=50 loops=1)
-> Sort (cost=95625.56..95642.29 rows=6690 width=963) (actual time=12159.831..12159.834 rows=50 loops=1)
Sort Key: (ts_rank_cd(to_tsvector('swedish'::regconfig, (((((((((((((COALESCE(products_alias.name, ''::character varying))::text || ' '::text) || COALESCE(products_alias.description, ''::text)) || ' '::text) || (COALESCE(products_alias.sku, ''::character varying))::text) || ' '::text) || (COALESCE(products_alias.price, '0'::numeric))::text) || ' '::text) || (COALESCE(products_alias.category, ''::character varying))::text) || ' '::text) || (COALESCE(products_alias.brand, ''::character varying))::text) || ' '::text) || (COALESCE(products_alias.shop, ''::character varying))::text)), query_search.query_search)) DESC
Sort Method: top-N heapsort Memory: 136kB
-> GroupAggregate (cost=93312.70..95403.33 rows=6690 width=963) (actual time=10897.686..12149.352 rows=4336 loops=1)
Group Key: products_alias.id, query_search.query_search
-> Sort (cost=93312.70..93329.43 rows=6690 width=927) (actual time=10897.262..10908.173 rows=11762 loops=1)
Sort Key: products_alias.id, query_search.query_search
Sort Method: external merge Disk: 11480kB
-> Gather (cost=88226.64..90164.63 rows=6690 width=927) (actual time=10830.873..10847.395 rows=11762 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Nested Loop Left Join (cost=87226.64..88495.63 rows=2788 width=927) (actual time=10824.718..10840.108 rows=3921 loops=3)
-> Nested Loop Left Join (cost=87226.22..87656.05 rows=1417 width=923) (actual time=10824.687..10833.446 rows=2076 loops=3)
-> Merge Left Join (cost=87225.79..87229.45 rows=720 width=923) (actual time=10824.649..10826.193 rows=1446 loops=3)
Merge Cond: (products_alias.id = uip.products_id)
-> Sort (cost=87117.93..87119.73 rows=720 width=919) (actual time=10824.610..10824.846 rows=1445 loops=3)
Sort Key: products_alias.id
Sort Method: quicksort Memory: 2110kB
Worker 0: Sort Method: quicksort Memory: 2114kB
Worker 1: Sort Method: quicksort Memory: 2184kB
-> Nested Loop (cost=0.00..87083.76 rows=720 width=919) (actual time=8.350..10817.477 rows=1445 loops=3)
Join Filter: (to_tsvector('swedish'::regconfig, (((((((((((((COALESCE(products_alias.name, ''::character varying))::text || ' '::text) || COALESCE(products_alias.description, ''::text)) || ' '::text) || (COALESCE(products_alias.sku, ''::character varying))::text) || ' '::text) || (COALESCE(products_alias.price, '0'::numeric))::text) || ' '::text) || (COALESCE(products_alias.category, ''::character varying))::text) || ' '::text) || (COALESCE(products_alias.brand, ''::character varying))::text) || ' '::text) || (COALESCE(products_alias.shop, ''::character varying))::text)) @@ query_search.query_search)
Rows Removed by Join Filter: 113743
-> Parallel Seq Scan on products products_alias (cost=0.00..42767.48 rows=144118 width=887) (actual time=0.104..250.790 rows=115188 loops=3)
Filter: (id <> 720253)
Rows Removed by Filter: 0
-> Function Scan on query_search (cost=0.00..0.01 rows=1 width=32) (actual time=0.000..0.000 rows=1 loops=345564)
-> Sort (cost=100.64..104.26 rows=1450 width=8) (actual time=0.033..0.037 rows=16 loops=3)
Sort Key: uip.products_id
Sort Method: quicksort Memory: 25kB
Worker 0: Sort Method: quicksort Memory: 25kB
Worker 1: Sort Method: quicksort Memory: 25kB
-> Seq Scan on user_ip_product uip (cost=0.00..24.50 rows=1450 width=8) (actual time=0.022..0.025 rows=16 loops=3)
-> Index Only Scan using idx_cdfc73564584665a on product_category cp (cost=0.42..0.56 rows=3 width=4) (actual time=0.004..0.004 rows=1 loops=4338)
Index Cond: (product_id = products_alias.id)
Heap Fetches: 6192
-> Index Scan using idx_cdfc73564584665a on product_category cpt (cost=0.42..0.56 rows=3 width=8) (actual time=0.002..0.002 rows=2 loops=6228)
Index Cond: (product_id = products_alias.id)
Planning Time: 7.567 ms
Execution Time: 12162.804 ms
My main quesrtion how to optimization query and how to correct use GIN index because looks like my index not working? :(
Instead of writing this as a weird join, write it as a straightforward WHERE
condition:
FROM products products_alias
LEFT JOIN ...
WHERE to_tsvector('pg_catalog.swedish', ...)
@@ to_tsquery('pg_catalog.swedish', 'Evy&bodystocking&ns:*|23.70:*|ebbe:*|BABYKLÄDER:*')
Side remark: it is easier to write
concat(col1, col2, ...)
than
coalesce(col1, '') || coalesce(col2, '') || ...
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.