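I have an unoptimized query with ILIKE filters on 6 columns of the first table and on one more column of the second table.
The table has more than 13 million rows. The query takes 80 seconds to run, which is far too long. Because of ILIKE, indexes are not used. How can I optimize my query? Maybe I need to create full-text indexes on all of these fields?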
DDL and test data: https://dbfiddle.uk/JCZVw0v6
PostgreSQL version: 10.21
Query:
select *, "user_apps"."id" as "uaId", "sms"."id" as "smsId"
from "sms"
left join "user_apps" on "user_apps"."id" = "sms"."user_app_id"
where "user_apps"."unique_id" ilike '%search_text%'
or "sms.sender" ilike '%search_text%'
or "sms.message" ilike '%search_text%'
or "sms.msisdn_receiver" ilike '%search_text%'
or "sms"."country_code" ilike '%search_text%'
or CAST(sms.id as VARCHAR(255)) ilike '%search_text%'
or "sms.sim" ilike '%search_text%'
or "sms"."status" ilike '%search_text%'
and "sms.type" = 'sms'
order by "smsId" desc
limit 51 offset 0
Cardinality: 13 million rows
EXPLAIN (ANALYZE, BUFFERS):
EXPLAIN (ANALYZE,BUFFERS) select *, "user_apps"."id" as "uaId", "sms"."id" as "smsId"
from "sms"
left join "user_apps" on "user_apps"."id" = "sms"."user_app_id"
where "user_apps"."unique_id" ilike '%13568034%'
or "sms.sender" ilike '%13568034%'
or "sms.message" ilike '%13568034%'
or "sms.msisdn_receiver" ilike '%13568034%'
or "sms"."country_code" ilike '%13568034%'
or CAST(sms.id as VARCHAR(255)) ilike '%13568034%'
or "sms.sim" ilike '%13568034%'
or "sms"."status" ilike '%13568034%'
and "sms.type" = 'sms'
order by "smsId" desc
limit 51 offset 0
Limit (cost=1000.77..17388.49 rows=51 width=1349) (actual time=35122.310..35359.126 rows=1 loops=1)
Buffers: shared hit=24704683 read=647586
-> Gather Merge (cost=1000.77..1754487.38 rows=5457 width=1349) (actual time=35122.305..35359.116 rows=1 loops=1)
Workers Planned: 4
Workers Launched: 4
Buffers: shared hit=24704683 read=647586
-> Nested Loop Left Join (cost=0.71..1752837.34 rows=1364 width=1341) (actual time=28087.631..28087.696 rows=0 loops=5)
Filter: (((user_apps.unique_id)::text ~~* '%13568034%'::text) OR ((sms.sender)::text ~~* '%13568034%'::text) OR ((sms.message)::text ~~* '%13568034%'::text) OR ((sms.msisdn_receiver)::text ~~* '%13568034%'::text) OR ((sms.country_code)::text ~~* '%13568034%'::text) OR (((sms.id)::character varying(255))::text ~~* '%13568034%'::text) OR ((sms.sim)::text ~~* '%13568034%'::text) OR (((sms.status)::text ~~* '%13568034%'::text) AND ((sms.type)::text = 'sms'::text)))
Rows Removed by Filter: 2695161
Buffers: shared hit=24704683 read=647586
-> Parallel Index Scan Backward using sms_pkey on sms (cost=0.43..618823.77 rows=3410216 width=229) (actual time=0.092..7920.969 rows=2695161 loops=5)
Buffers: shared hit=7796122 read=647570
-> Index Scan using user_apps_pkey on user_apps (cost=0.28..0.29 rows=1 width=1112) (actual time=0.001..0.001 rows=0 loops=13475805)
Index Cond: (id = sms.user_app_id)
Buffers: shared hit=16908561 read=16
Planning time: 9.966 ms
Execution time: 35359.380 ms
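Preliminary answer, pending the requested information.
The first problem is your outdated Postgres version. Postgres 10 reached end of life in late 2022, and you are not even on the latest point release, 10.23. Upgrade to a current version. That will help in any case. (But it is not the solution.)
Fix your query first.
Your query seems to be at odds with operator precedence: AND binds more tightly than OR, so the type = 'sms' condition only applies to the last filter on status. The Filter line in your query plan shows exactly that grouping. Fix it by adding parentheses.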
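To illustrate the precedence issue, here is a minimal sketch on a single made-up row (the VALUES row and its column values are assumptions for demonstration, not data from the fiddle). The row has type 'mms' but a matching sender:

```sql
-- Hypothetical one-row sample: type is 'mms', sender contains the search text.
SELECT t.type,
       -- As written: parsed as  sender-match OR (status-match AND type = 'sms')
       t.sender ILIKE '%135%' OR t.status ILIKE '%135%' AND t.type = 'sms'   AS as_written,
       -- With parentheses: the type filter applies to every search condition
       (t.sender ILIKE '%135%' OR t.status ILIKE '%135%') AND t.type = 'sms' AS with_parentheses
FROM  (VALUES ('mms', '13568034', 'sent')) AS t(type, sender, status);
```

Without parentheses the 'mms' row passes the filter (as_written is true); with parentheses it is excluded (with_parentheses is false).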
This makes no sense:
or CAST(sms.id as VARCHAR(255)) ilike '%search_text%'
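There are no uppercase digits, so case-insensitive matching buys nothing here, and varchar(255) is a misunderstanding.
One of these makes more sense: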
or sms.id::text LIKE '%search_text%'
or sms.id::text ~ 'search_text'
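A quick sketch of why the simpler forms are equivalent for a numeric id. The sample value is hypothetical (borrowed from the search term in the question), and id is assumed to be an integer type such as bigint:

```sql
-- Hypothetical sample id; assumes sms.id is an integer type.
SELECT id::text ILIKE '%3568%'         AS ilike_match,    -- case-insensitivity is irrelevant for digits
       id::text LIKE  '%3568%'         AS like_match,     -- same result with plain LIKE
       id::text ~ '3568'               AS regex_match,    -- unanchored regex, no wildcards needed
       id::varchar(255) LIKE '%3568%'  AS varchar_match   -- the length limit changes nothing
FROM  (VALUES (13568034::bigint)) AS t(id);
```

All four expressions return true for this value. Note that with the regex form, a search text containing regex metacharacters would have to be escaped first.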
Preliminary rewrite:
SELECT *, u.id AS "uaId", s.id AS "smsId"
FROM sms s
LEFT JOIN user_apps u ON u.id = s.user_app_id
WHERE s.type = 'sms' -- assuming s.type
AND (u.unique_id ILIKE '%13568034%'
OR sender ILIKE '%13568034%' -- u.sender or s.sender ???
OR message ILIKE '%13568034%'
OR msisdn_receiver ILIKE '%13568034%'
OR s.country_code ILIKE '%13568034%'
OR s.id::text LIKE '%13568034%' -- !
OR sim ILIKE '%13568034%'
OR s.status ILIKE '%13568034%') -- parentheses required !!!
ORDER BY s.id DESC
LIMIT 51
OFFSET 0;
This is not a solution yet. I am not going to try optimizing based on incomplete information...