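Are views with fewer columns faster? Do they perform better?
In this example CREATE VIEW, the SELECT has only the columns the project needs right now (two of the six columns in table "d").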
CREATE VIEW data_with_form_id AS (
SELECT
e.form_id, d.fld, d.val
I don't want to be short-sighted; I could easily add the meta columns:
CREATE VIEW data_with_form_id AS (
SELECT
e.form_id, d.*
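But what effect does this have on the view's performance?
I searched around for "postgresql view performance" together with terms such as select and column, but "postgresql" and "performance" alone return so many results that finding anything about view performance in relation to the number of columns in the SELECT is a needle in (many) haystacks.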
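Of course, fetching columns you don't need costs extra, although that cost is usually negligible. But which columns are fetched depends on the query that uses the view, not on the view definition.
Then again, adding new columns to a view later is not difficult.
To expand on the cost of fetching columns: fetching the 40th column is more expensive than fetching the second one. Fetching oversized columns that are stored out of line in a TOAST table is particularly expensive.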
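Adding a column to a view later, as mentioned above, might look like the following minimal sketch. The tables `demo_e` and `demo_d` and the view `demo_view` are hypothetical names made up for illustration; they are not from the question. It relies on `CREATE OR REPLACE VIEW`, which keeps the existing columns as they are and allows new ones only at the end of the list.

```sql
-- Hypothetical demo tables (illustrative names, not from the question).
CREATE TABLE demo_e (form_id integer PRIMARY KEY);
CREATE TABLE demo_d (
    id      integer PRIMARY KEY,
    form_id integer NOT NULL REFERENCES demo_e(form_id),
    fld     text,
    val     text,
    meta    text
);

-- Start with only the columns needed today.
CREATE VIEW demo_view AS
SELECT e.form_id, d.fld, d.val
FROM demo_e e
JOIN demo_d d ON d.form_id = e.form_id;

-- Later: add a column. Existing columns must keep their name, order
-- and type; the new column goes at the end of the select list.
CREATE OR REPLACE VIEW demo_view AS
SELECT e.form_id, d.fld, d.val, d.meta
FROM demo_e e
JOIN demo_d d ON d.form_id = e.form_id;
```

Queries that name their columns explicitly are unaffected by the extra column; only `SELECT *` callers see a wider result.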
Table definitions + sample data:
\i tmp.sql
CREATE table eee
( form_id SERIAL NOT NULL PRIMARY KEY
, payload char (500)
);
INSERT INTO eee(payload)
SELECT 'payload_'|| gs::text
FROM generate_series(1,100) gs;
CREATE table ddd
( id SERIAL NOT NULL PRIMARY KEY
, form_id INTEGER NOT NULL REFERENCES eee(form_id) -- plain integer: a foreign key needs no sequence of its own
, fld char (100)
, val char (200)
, trash char (400)
, filth char (800)
);
CREATE INDEX ON ddd(form_id);
INSERT INTO ddd(form_id, fld,val,trash,filth)
SELECT eee.form_id
, 'fld_'|| gs::text
, 'val_'|| gs::text
, 'trash_'|| gs::text
, 'filth_'|| gs::text
FROM eee
JOIN generate_series(1,10) gs ON random() < 0.3
;
VACUUM ANALYZE eee;
VACUUM ANALYZE ddd;
Two views:
CREATE VIEW v1 AS
SELECT e.form_id
, d.fld, d.val
FROM eee e
JOIN ddd d ON d.form_id = e.form_id
;
CREATE VIEW v0 AS
SELECT e.payload
, d.*
FROM eee e
JOIN ddd d ON d.form_id = e.form_id
;
Let's try it:
\echo v1 complete
EXPLAIN SELECT * FROM v1 ;
\echo v1 three fields
EXPLAIN SELECT form_id,fld, val FROM v1 ;
\echo v0 complete
EXPLAIN SELECT * FROM v0 ;
\echo v0 three fields
EXPLAIN SELECT form_id,fld, val FROM v0 ;
\echo v0 four fields
EXPLAIN SELECT form_id,fld, val,trash FROM v0 ;
Output:
DROP SCHEMA
CREATE SCHEMA
SET
CREATE TABLE
INSERT 0 100
CREATE TABLE
CREATE INDEX
INSERT 0 309
VACUUM
VACUUM
CREATE VIEW
CREATE VIEW
v1 complete
QUERY PLAN
-----------------------------------------------------------------------------------------
Hash Join (cost=5.09..71.03 rows=309 width=309)
Hash Cond: (d.form_id = e.form_id)
-> Seq Scan on ddd d (cost=0.00..65.09 rows=309 width=309)
-> Hash (cost=3.84..3.84 rows=100 width=4)
-> Index Only Scan using eee_pkey on eee e (cost=0.14..3.84 rows=100 width=4)
(5 rows)
v1 three fields
QUERY PLAN
-----------------------------------------------------------------------------------------
Hash Join (cost=5.09..71.03 rows=309 width=309)
Hash Cond: (d.form_id = e.form_id)
-> Seq Scan on ddd d (cost=0.00..65.09 rows=309 width=309)
-> Hash (cost=3.84..3.84 rows=100 width=4)
-> Index Only Scan using eee_pkey on eee e (cost=0.14..3.84 rows=100 width=4)
(5 rows)
v0 complete
QUERY PLAN
---------------------------------------------------------------------
Hash Join (cost=9.25..75.18 rows=309 width=2025)
Hash Cond: (d.form_id = e.form_id)
-> Seq Scan on ddd d (cost=0.00..65.09 rows=309 width=1521)
-> Hash (cost=8.00..8.00 rows=100 width=508)
-> Seq Scan on eee e (cost=0.00..8.00 rows=100 width=508)
(5 rows)
v0 three fields
QUERY PLAN
-----------------------------------------------------------------------------------------
Hash Join (cost=5.09..71.03 rows=309 width=309)
Hash Cond: (d.form_id = e.form_id)
-> Seq Scan on ddd d (cost=0.00..65.09 rows=309 width=309)
-> Hash (cost=3.84..3.84 rows=100 width=4)
-> Index Only Scan using eee_pkey on eee e (cost=0.14..3.84 rows=100 width=4)
(5 rows)
v0 four fields
QUERY PLAN
-----------------------------------------------------------------------------------------
Hash Join (cost=5.09..71.03 rows=309 width=713)
Hash Cond: (d.form_id = e.form_id)
-> Seq Scan on ddd d (cost=0.00..65.09 rows=309 width=713)
-> Hash (cost=3.84..3.84 rows=100 width=4)
-> Index Only Scan using eee_pkey on eee e (cost=0.14..3.84 rows=100 width=4)
(5 rows)
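Now look at the "width" values in the plans: they differ, and they depend not only on the table and view definitions but also on the final query.
This is because, in postgres, a view is a kind of macro: it is merged into the query before any optimization takes place. The optimizer then removes the unreferenced columns from the plan, which reduces the row size of the result.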
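One way to look at this more directly, assuming the `v0` view from above exists, is to ask EXPLAIN for the output columns of each plan node. This is only a sketch; the exact plan shape and column lists depend on the PostgreSQL version and statistics, so no output is shown here.

```sql
-- VERBOSE adds an "Output:" line to every plan node;
-- COSTS OFF hides the cost figures to keep the plans easy to compare.

-- Everything the view defines, including payload, trash and filth.
EXPLAIN (VERBOSE, COSTS OFF)
SELECT * FROM v0;

-- Only three columns: compare the "Output:" line of the top node
-- with the one from the query above.
EXPLAIN (VERBOSE, COSTS OFF)
SELECT form_id, fld, val FROM v0;
```

The top node's output list is the part to compare. Scan nodes lower in the plan may still list more columns than the query asks for.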
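The amount of data read from the base tables is of course the same: the physical tables have not changed their row size. [Note for non-insiders: I deliberately used CHAR(xxx) columns to inflate the row size. varchar() columns are not blank-padded, so these short values would not inflate the row size, and values that do get large are toasted (:= moved to secondary storage).]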
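To check the note about column sizes on your own data, something like the following could be run against the `ddd` table from above. It is a rough sketch: `pg_column_size` reports the bytes used to store a value, and the second query only shows whether the table's TOAST relation holds any data. The column aliases are made up for readability.

```sql
-- Compare the stored size of a blank-padded CHAR(100) value with the
-- same value cast to varchar (the cast drops the trailing spaces).
SELECT pg_column_size(fld)               AS char_bytes,
       pg_column_size(fld::varchar(100)) AS varchar_bytes
FROM ddd
LIMIT 1;

-- Find the TOAST relation behind ddd and how much data it holds.
SELECT reltoastrelid::regclass                    AS toast_table,
       pg_relation_size(reltoastrelid::regclass) AS toast_bytes
FROM pg_class
WHERE oid = 'ddd'::regclass;
```

TOAST only comes into play once a row grows past roughly 2 kB, so with the sample data the TOAST relation should normally stay empty.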