Problem
The following query takes 42 seconds when most of the data is not cached:
EXPLAIN (ANALYZE, BUFFERS) select count(*) from packages where company_id = 178381;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=395914.63..395914.63 rows=1 width=8) (actual time=42411.940..42411.942 rows=1 loops=1)
Buffers: shared hit=21775 read=94888
I/O Timings: read=39723.315
-> Bitmap Heap Scan on packages (cost=1053.07..395761.41 rows=306442 width=0) (actual time=83.104..42336.765 rows=322432 loops=1)
Recheck Cond: (company_id = 178381)
Heap Blocks: exact=116385
Buffers: shared hit=21775 read=94888
I/O Timings: read=39723.315
-> Bitmap Index Scan on packages_company_id_index (cost=0.00..1037.75 rows=306442 width=0) (actual time=45.846..45.847 rows=325795 loops=1)
Index Cond: (company_id = 178381)
Buffers: shared hit=1 read=277
I/O Timings: read=7.090
Planning:
Buffers: shared hit=2
Planning Time: 0.237 ms
Execution Time: 42413.042 ms
Running the query again immediately afterwards is, naturally, much faster:
distru_prod=> EXPLAIN (ANALYZE, BUFFERS) select count(*) from packages where company_id = 178381;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=395914.63..395914.63 rows=1 width=8) (actual time=416.943..416.957 rows=1 loops=1)
Buffers: shared hit=116589
-> Bitmap Heap Scan on packages (cost=1053.07..395761.41 rows=306442 width=0) (actual time=78.925..395.495 rows=322432 loops=1)
Recheck Cond: (company_id = 178381)
Heap Blocks: exact=116308
Buffers: shared hit=116589
-> Bitmap Index Scan on packages_company_id_index (cost=0.00..1037.75 rows=306442 width=0) (actual time=46.359..46.360 rows=325351 loops=1)
Index Cond: (company_id = 178381)
Buffers: shared hit=281
Planning:
Buffers: shared hit=448
Planning Time: 1.375 ms
Execution Time: 418.321 ms
More information
Here is some basic information about the table:
select count(distinct company_id) from packages;
count
-------
691
select count(*) from packages;
count
----------
10764441
select count(*) from packages where company_id = 178381;
count
--------
322432
select pg_size_pretty(pg_total_relation_size('packages'));
pg_size_pretty
----------------
12 GB
select pg_size_pretty(pg_total_relation_size('packages_company_id_index'));
pg_size_pretty
----------------
79 MB
The index being used:
CREATE INDEX packages_company_id_index ON public.packages USING btree (company_id);
Even when pg_hint_plan is used to force an index-only scan on packages_company_id_index, the buffer count is still as high as with the bitmap heap scan:
/*+ IndexOnlyScan(packages packages_company_id_index) */ EXPLAIN (ANALYZE, BUFFERS) select count(*) from packages where company_id = 178381;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=517401.31..517401.32 rows=1 width=8) (actual time=172.448..172.450 rows=1 loops=1)
Buffers: shared hit=116586
-> Index Only Scan using packages_company_id_index on packages (cost=0.09..517248.09 rows=306442 width=0) (actual time=0.034..150.510 rows=322432 loops=1)
Index Cond: (company_id = 178381)
Heap Fetches: 325351
Buffers: shared hit=116586
Planning:
Buffers: shared hit=2
Planning Time: 0.238 ms
Execution Time: 172.546 ms
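Questions
... buffers?
... the packages table?

1. The index-only scan reads so many buffers because it is not really an index-only scan. VACUUM the table to refresh the visibility map, and you will see the difference.
2. It could be because the visibility map is not up to date, or because you have set random_page_cost too high.
3. According to the data you posted, the table must be at least 900 MB in size (116589 pages of 8 kB each). Do you expect your disk to read faster than 19 MB/s?
4. That cannot be answered, because it depends on an unknown "if".
5. See 1. and 2. above.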
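
The first three points can be checked with a few statements. What follows is only a sketch, not a definitive procedure. It assumes the table lives in the public schema (as the CREATE INDEX statement in the question suggests) and that you have the privileges to VACUUM it. Note that pg_class.relallvisible is the planner's estimate of all-visible pages, not an exact count. The "Heap Fetches: 325351" line in the forced index-only plan already indicates that the scan visited the heap for practically every row. The arithmetic at the end uses only figures from the plans in the question.

```sql
-- Point 1: how much of the table does the planner believe is all-visible?
-- A low percentage means an "index-only" scan still has to visit the heap.
SELECT relpages,
       relallvisible,
       round(100.0 * relallvisible / nullif(relpages, 0), 1) AS pct_all_visible
FROM pg_class
WHERE oid = 'public.packages'::regclass;

-- Refresh the visibility map. VACUUM cannot run inside a transaction block.
VACUUM (VERBOSE) public.packages;

-- Point 2: check the cost setting that influences whether index scans are chosen.
SHOW random_page_cost;

-- Point 3: size and throughput implied by the first plan.
-- 116589 buffers of 8 kB were touched in total; 94888 of them were read
-- from disk in 39723.315 ms of I/O time.
SELECT round(116589 * 8 / 1024.0, 1)            AS touched_mb,
       round(94888 * 8 / 1024.0 / 39.723315, 1) AS read_mb_per_s;
```

After the VACUUM, re-running the hinted query with EXPLAIN (ANALYZE, BUFFERS) and comparing the "Heap Fetches" and "Buffers" lines against the earlier plan should show whether the visibility map was the cause.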