PostgreSQL Bitmap Heap Scan is slow

Question

My table looks like this:

create table invoices
(
    id            serial not null,
    data          jsonb,
    modified      date,
    search_string text   not null
);

I need to search the table by search_string using ILIKE. A single request may contain many different search terms.

My query looks like this:

SELECT *
FROM invoices
WHERE (
    search_string ILIKE '%1%'
    OR search_string ILIKE '%2%'
    OR search_string ILIKE '%3%'
)

EXPLAIN output for the search without an index:

Seq Scan on invoices  (cost=0.00..147139.51 rows=1004406 width=1006) (actual time=0.038..2341.489 rows=1004228 loops=1)
   Filter: ((search_string ~~* '%1%'::text) OR (search_string ~~* '%2%'::text) OR (search_string ~~* '%3%'::text))
   Rows Removed by Filter: 1943
 Planning Time: 4.682 ms
 Execution Time: 2427.400 ms

I tried to speed this up by creating a GIN index:

CREATE EXTENSION pg_trgm;
CREATE INDEX invoices_search_string_trigram_index ON invoices USING gin (search_string gin_trgm_ops);

EXPLAIN output for the search with the index:

 Bitmap Heap Scan on invoices_invoice  (cost=414767.41..561902.40 rows=1004149 width=1006) (actual time=14878.331..17862.840 rows=1004228 loops=1)
  Recheck Cond: ((search_string ~~* '%1%'::text) OR (search_string ~~* '%2%'::text) OR (search_string ~~* '%3%'::text))
  Rows Removed by Index Recheck: 1943
  Heap Blocks: exact=63341 lossy=66186
  ->  BitmapOr  (cost=414767.41..414767.41 rows=1006171 width=0) (actual time=14842.199..14842.199 rows=0 loops=1)
        ->  Bitmap Index Scan on trgm_idx_search_string  (cost=0.00..137979.36 rows=874048 width=0) (actual time=4520.466..4520.466 rows=546232 loops=1)
              Index Cond: (search_string ~~* '%1%'::text)
        ->  Bitmap Index Scan on trgm_idx_search_string  (cost=0.00..138208.03 rows=904538 width=0) (actual time=4357.453..4357.453 rows=546232 loops=1)
              Index Cond: (search_string ~~* '%2%'::text)
        ->  Bitmap Index Scan on trgm_idx_search_string  (cost=0.00..137826.91 rows=853721 width=0) (actual time=5964.276..5964.276 rows=546232 loops=1)
              Index Cond: (search_string ~~* '%3%'::text)
Planning Time: 1.198 ms
Execution Time: 17971.102 ms

Why is the indexed search slower than the sequential scan? Is there a way to make this kind of search faster?

Answer 1

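Your problem is probably the 66186 lossy heap blocks. When the bitmap does not fit in work_mem, PostgreSQL falls back to tracking whole pages instead of individual rows, and every row on those pages has to be rechecked. Increase work_mem until you only have exact blocks.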
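One way to try this without touching the server configuration is to raise work_mem for the current session only and re-run the plan. This is a sketch: the 256MB value is an assumed starting point, not a recommendation, and the right value depends on the available memory.

```sql
-- Raise work_mem for this session only (256MB is an assumed starting value).
SET work_mem = '256MB';

-- Re-run the query and check the "Heap Blocks" line:
-- the goal is to see only exact blocks and no lossy ones.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM invoices
WHERE (
    search_string ILIKE '%1%'
    OR search_string ILIKE '%2%'
    OR search_string ILIKE '%3%'
);

-- Return to the server default afterwards.
RESET work_mem;
```

If lossy blocks still show up, the value can be raised further and the plan checked again.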

That said, with a million result rows this query will never be very fast unless you reduce the number of rows it returns.

Answer 2

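How about using SIMILAR TO '%[123]%' instead of three ILIKE conditions joined with OR? That might be up to 3 times faster, since each row is tested once instead of three times. (Unlike ILIKE, SIMILAR TO is case-sensitive, which makes no difference for digits.)

Still, ILIKE and SIMILAR TO both need to check every row.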
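A sketch of the single-pattern variant follows. SIMILAR TO has to match the whole string, so the character class needs % wildcards on both sides. The POSIX regular expression form is an alternative that was not part of the original suggestion; it matches anywhere in the string by default.

```sql
-- One pattern instead of three OR-ed ILIKE conditions.
-- SIMILAR TO matches the entire string, hence the surrounding % wildcards.
SELECT *
FROM invoices
WHERE search_string SIMILAR TO '%[123]%';

-- Alternative (an assumption, not from the answer above): POSIX regex match.
-- Use ~* instead of ~ if the real search terms are case-sensitive letters.
SELECT *
FROM invoices
WHERE search_string ~ '[123]';
```

Whether either form is actually faster is worth confirming with EXPLAIN (ANALYZE) on the real data.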

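When you added the index, you led the optimizer to believe it would help. But most rows probably contain a 1, 2, or 3 (the plan shows about a million matching rows and only 1943 filtered out), so the index just adds overhead.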

As the name implies, trigrams work best when matching 3 consecutive characters. But %1% only checks 1 character, so most of the power of trigrams is wasted.
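To see the difference, compare the plan above with a search term of three or more characters. This is an illustration only: the term '2021-04' is hypothetical and may not occur in the real data. show_trgm is a pg_trgm function that lists the trigrams extracted from a string.

```sql
-- Inspect which trigrams pg_trgm extracts from a longer term
-- ('2021-04' is a hypothetical search term).
SELECT show_trgm('2021-04');

-- A selective pattern with several consecutive characters gives the
-- trigram index something to narrow the search with.
EXPLAIN (ANALYZE)
SELECT *
FROM invoices
WHERE search_string ILIKE '%2021-04%';
```

If search terms in practice are usually this long and match few rows, the GIN index may still be worth keeping for those queries.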
