Combining similar EXISTS conditions in PostgreSQL

Question (votes: 0, answers: 1)

I am wondering whether multiple similar EXISTS conditions can be combined in a meaningful and efficient way.

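Assume the following example: different activities can be assigned to a service (n:m). Independently of that, activities can be grouped into activity groups, and each activity group can be assigned to a group type.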

If I want to find all services that reference certain group types and link the conditions with OR, this is relatively simple by combining EXISTS and IN:

select *
from service
where exists (
          select 1
          from activitiy
               join activitiy_activitiy_group
                    on activitiy.id = activitiy_activitiy_group.id_activitiy
                    join activitiy_group
                        on activitiy_activitiy_group.id_activitiy_group = activitiy_group.id
          where activitiy_group.id_type in (1, 3)
            and activitiy.id_service = service.id
      );

If, on the other hand, I want to link the conditions with AND, things are not so simple. I can add multiple EXISTS conditions:

select *
from service
where exists (
          select 1
          from activitiy
               join activitiy_activitiy_group
                    on activitiy.id = activitiy_activitiy_group.id_activitiy
                    join activitiy_group
                        on activitiy_activitiy_group.id_activitiy_group = activitiy_group.id
          where activitiy_group.id_type = 1
            and activitiy.id_service = service.id
      )
  and exists (
          select 1
          from activitiy
               join activitiy_activitiy_group
                    on activitiy.id = activitiy_activitiy_group.id_activitiy
                    join activitiy_group
                        on activitiy_activitiy_group.id_activitiy_group = activitiy_group.id
          where activitiy_group.id_type = 3
            and activitiy.id_service = service.id
      );

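But I wonder whether this approach is efficient with many filter elements. I experimented a bit, and one option is to use a single subselect that aggregates all distinct activity group type IDs related to a service into an array and compares that array with the filter values: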

select *
from service
where true =
      (select ARRAY_AGG(activitiy_group.id_type) @> ('{1,3}'::Integer[])
       from activitiy
               join activitiy_activitiy_group
                    on activitiy.id = activitiy_activitiy_group.id_activitiy
                    join activitiy_group
                        on activitiy_activitiy_group.id_activitiy_group = activitiy_group.id
       where activitiy.id_service = service.id);

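But here, too, the question is whether this is really efficient. Can anyone assess this, or is there a more sensible alternative? I think the underlying problem is a standard one, but unfortunately I could not find any other approaches on the internet.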

sql postgresql exists postgresql-performance relational-division
1 Answer (0 votes)

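I'll assume a setup where the involved tables are big and writes are relatively rare. Then it makes sense to create an auxiliary MATERIALIZED VIEW (once) to speed up queries substantially: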

CREATE MATERIALIZED VIEW service_activity_types AS
SELECT id_service, array_agg(id_type) AS activity_types
FROM  (
   SELECT DISTINCT a.id_service, ag.id_type
   FROM   activitiy                 a
   JOIN   activitiy_activitiy_group aag ON aag.id_activitiy = a.id
   JOIN   activitiy_group           ag  ON ag.id = aag.id_activitiy_group
   ORDER  BY 1, 2
   ) sub
GROUP  BY 1;

This produces a table with one row per service and an array of its distinct activity group types.

(Applying DISTINCT and ORDER BY once in the subquery is typically faster.)

Create a unique index on service_activity_types to allow refreshing the MV CONCURRENTLY:

CREATE UNIQUE INDEX service_activity_type_uni ON service_activity_types (id_service);

Refresh after relevant changes to the underlying tables:

REFRESH MATERIALIZED VIEW CONCURRENTLY service_activity_types;

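Create an index on the array column to make queries fast. There are various options. For your case, I expect a GIN index using the operator class gin__int_ops from the additional module intarray to be fastest. Install the module once per database first.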

CREATE INDEX service_activity_type_gin_idx ON service_activity_types USING gin (activity_types gin__int_ops);
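
The gin__int_ops operator class only exists once intarray is installed, so the CREATE INDEX above would fail on a database without it. A minimal sketch of the installation step, assuming the extension files ship with your PostgreSQL installation (usually via the contrib package) and your role is allowed to create extensions:

```sql
-- Run once per database, before creating the GIN index with gin__int_ops.
CREATE EXTENSION IF NOT EXISTS intarray;
```

Note that the intarray operators raise an error for arrays containing NULL elements, so this sketch assumes activitiy_group.id_type is never NULL.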

Possibly even a multicolumn index.

Also, to gather column statistics right away:

ANALYZE service_activity_types;

Then your query can be:

SELECT id_service
FROM   service_activity_types
WHERE  activity_types @> '{1,3}';

And it should be very fast.
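
The question asked for full service rows rather than just IDs. Here is a sketch of how the materialized view could be joined back to service, plus an EXPLAIN call to check whether the GIN index is actually used. The aliases sat and s are illustrative, and the plan you see will depend on your data and statistics:

```sql
-- Return complete service rows for services covering both group types 1 and 3.
SELECT s.*
FROM   service_activity_types sat
JOIN   service                s ON s.id = sat.id_service
WHERE  sat.activity_types @> '{1,3}'::int[];

-- Inspect the actual plan and timing; look for a bitmap scan on the GIN index.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id_service
FROM   service_activity_types
WHERE  activity_types @> '{1,3}'::int[];
```

On a small table the planner may still prefer a sequential scan, which is not by itself a sign of a problem.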
