Making a LEFT JOIN query faster

Question (votes: 0, answers: 1)

I have this query:

    SET @data_atual := '2018-11-30';
    SET @data_ant := '2018-10-31';
    
    select a.data_competencia as data, a.comprador, a.gest_comprador, a.atv_comprado as ativo, a.gest_comprado, sum(a.vl_atv_comprado) as valor,
    if(sum(c.vl_atv_comprado) is null, 0, sum(c.vl_atv_comprado)) as vl_anterior, b.rent_mtd as rent, 
    if(sum(c.vl_atv_comprado) is null, sum(a.vl_atv_comprado), sum(c.vl_atv_comprado)) * (1 + b.rent_mtd) as vl_atualizado,
    sum(a.vl_atv_comprado) - if(sum(c.vl_atv_comprado) is null, sum(a.vl_atv_comprado), sum(c.vl_atv_comprado)) * (1 + b.rent_mtd) as capt_liq from tabposicaoindustria as a
    inner join tabcotasindustria as b on a.atv_comprado = b.fundo and a.data_competencia = b.data
    left join tabposicaoindustria as c on a.comprador = c.comprador and a.atv_comprado = c.atv_comprado and c.data_competencia = @data_ant
    where a.data_competencia = @data_atual
    group by a.data_competencia, a.comprador, a.gest_comprador, a.atv_comprado, a.gest_comprado
    order by a.data_competencia, a.comprador, a.gest_comprador, a.atv_comprado, a.gest_comprado

It works fine, but it is far too slow. Given that tabposicaoindustria has over 2 million rows, is there a faster way to run it? This one takes more than 6 hours!

Thank you.

Expected: the same result, but faster.

sql mysql database left-join
1 answer
0 votes

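First of all, your query is not readable at all. I reformatted it so I could understand it better, but I am still missing key information about what you are trying to achieve (your naming is hard to follow).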

    SET @data_atual := '2018-11-30';
    SET @data_ant := '2018-10-31';
    
    SELECT a.data_competencia as data,
           a.comprador, 
           a.gest_comprador, 
           a.atv_comprado as ativo, 
           a.gest_comprado, 
           sum(a.vl_atv_comprado) as valor,
           if(sum(c.vl_atv_comprado) is null, 0, sum(c.vl_atv_comprado)) as vl_anterior,
           b.rent_mtd as rent, 
           if(sum(c.vl_atv_comprado) is null,
                  sum(a.vl_atv_comprado),
                  sum(c.vl_atv_comprado))
               * (1 + b.rent_mtd) as vl_atualizado,
           sum(a.vl_atv_comprado) -
           if(sum(c.vl_atv_comprado) is null,
                  sum(a.vl_atv_comprado),
                  sum(c.vl_atv_comprado))
               * (1 + b.rent_mtd) as capt_liq
    FROM (tabposicaoindustria as a INNER JOIN tabcotasindustria as b
              on a.atv_comprado = b.fundo and a.data_competencia = b.data) 
          LEFT JOIN tabposicaoindustria as c 
              on a.comprador = c.comprador and a.atv_comprado = c.atv_comprado and c.data_competencia = @data_ant
    WHERE a.data_competencia = @data_atual
    GROUP BY a.data_competencia, a.comprador, a.gest_comprador, a.atv_comprado, a.gest_comprado
    ORDER BY a.data_competencia, a.comprador, a.gest_comprador, a.atv_comprado, a.gest_comprado

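Now to your question. In cases like this, you are almost always doing unnecessary extra work; to tell whether that is what is happening here, you would need to explain the business requirement better. The second possibility is that your query is complex enough that the execution plan is not well optimized (in my experience this tends to happen in queries that combine JOIN and GROUP BY). Sometimes the order of operations is simply off.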
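One way to check whether the plan is the problem is to ask MySQL for it with `EXPLAIN`. The sketch below keeps the joins, filter and grouping from your query but trims the select list for brevity, so treat it as an illustration rather than the exact statement to run:

```sql
SET @data_atual := '2018-11-30';
SET @data_ant   := '2018-10-31';

-- EXPLAIN reports the execution plan instead of the result rows.
EXPLAIN
SELECT a.comprador,
       a.atv_comprado,
       SUM(a.vl_atv_comprado) AS valor,
       SUM(c.vl_atv_comprado) AS vl_anterior
FROM tabposicaoindustria AS a
INNER JOIN tabcotasindustria AS b
        ON a.atv_comprado = b.fundo AND a.data_competencia = b.data
LEFT JOIN tabposicaoindustria AS c
        ON a.comprador = c.comprador
       AND a.atv_comprado = c.atv_comprado
       AND c.data_competencia = @data_ant
WHERE a.data_competencia = @data_atual
GROUP BY a.data_competencia, a.comprador, a.gest_comprador,
         a.atv_comprado, a.gest_comprado;
```

In the output, `type = ALL` with `key = NULL` on a table means a full scan of that table, and `Using temporary; Using filesort` in the `Extra` column means the GROUP BY is being resolved with a temporary table and a sort. A full scan on the `c` row would suggest that every current-month row is being compared against the whole 2-million-row table.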

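Without more information, my guess is that because you group by columns that you also join on, the execution order ends up being suboptimal. Of course, @Barmar's suggestion to add indexes on the JOIN and GROUP BY columns is correct, but it may not be enough. Another option (besides indexes) is to rewrite the query using subqueries, which may lead to a better execution plan.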
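A rough sketch of both ideas follows. The index names and the aliases `cur` and `prev` are made up, and the rewrite rests on two assumptions that you would need to confirm: that tabcotasindustria has at most one row per (fundo, data), and that last month's total should be counted once per (comprador, atv_comprado). If several rows match on either side of the self-join in your original query, the sums there get multiplied, so the numbers from this version could differ.

```sql
SET @data_atual := '2018-11-30';
SET @data_ant   := '2018-10-31';

-- One-time setup. Index names are hypothetical; the columns follow
-- the WHERE and JOIN conditions of the query.
CREATE INDEX idx_posicao_data_comprador_ativo
    ON tabposicaoindustria (data_competencia, comprador, atv_comprado);
CREATE INDEX idx_cotas_fundo_data
    ON tabcotasindustria (fundo, data);

SELECT cur.data_competencia AS data,
       cur.comprador,
       cur.gest_comprador,
       cur.atv_comprado AS ativo,
       cur.gest_comprado,
       cur.valor,
       COALESCE(prev.vl_anterior, 0) AS vl_anterior,
       b.rent_mtd AS rent,
       COALESCE(prev.vl_anterior, cur.valor) * (1 + b.rent_mtd) AS vl_atualizado,
       cur.valor
         - COALESCE(prev.vl_anterior, cur.valor) * (1 + b.rent_mtd) AS capt_liq
FROM (
        -- Current month, aggregated once before any join.
        SELECT data_competencia, comprador, gest_comprador,
               atv_comprado, gest_comprado,
               SUM(vl_atv_comprado) AS valor
        FROM tabposicaoindustria
        WHERE data_competencia = @data_atual
        GROUP BY data_competencia, comprador, gest_comprador,
                 atv_comprado, gest_comprado
     ) AS cur
INNER JOIN tabcotasindustria AS b
        ON b.fundo = cur.atv_comprado AND b.data = cur.data_competencia
LEFT JOIN (
        -- Previous month, reduced to one row per (comprador, atv_comprado).
        SELECT comprador, atv_comprado,
               SUM(vl_atv_comprado) AS vl_anterior
        FROM tabposicaoindustria
        WHERE data_competencia = @data_ant
        GROUP BY comprador, atv_comprado
     ) AS prev
        ON prev.comprador = cur.comprador
       AND prev.atv_comprado = cur.atv_comprado
ORDER BY cur.data_competencia, cur.comprador, cur.gest_comprador,
         cur.atv_comprado, cur.gest_comprado;
```

The point of this design is that each month is filtered and aggregated on its own, so the joins only ever see the already-reduced rows instead of every pairing of raw rows.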

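Also, since you say this is a long-running process that has to run on a large table, another option (if all else fails) is to manually store intermediate "stages" of the query in temporary tables (or even use a materialized view), in effect "caching" the joined tables.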
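As a minimal sketch of the staging idea: MySQL has no built-in materialized views, so this uses a temporary table. The names `tmp_posicao_ant` and `idx_tmp_comprador_ativo` are hypothetical, the final query is shortened to a few columns, and it assumes last month's total is wanted once per (comprador, atv_comprado).

```sql
SET @data_atual := '2018-11-30';
SET @data_ant   := '2018-10-31';

-- Stage 1: cache last month's totals once, with an index for the join.
DROP TEMPORARY TABLE IF EXISTS tmp_posicao_ant;
CREATE TEMPORARY TABLE tmp_posicao_ant (
    INDEX idx_tmp_comprador_ativo (comprador, atv_comprado)
)
SELECT comprador, atv_comprado,
       SUM(vl_atv_comprado) AS vl_anterior
FROM tabposicaoindustria
WHERE data_competencia = @data_ant
GROUP BY comprador, atv_comprado;

-- Stage 2: join the current month against the small cached table.
-- MAX() only picks the single cached value in each group.
SELECT a.comprador,
       a.atv_comprado AS ativo,
       SUM(a.vl_atv_comprado) AS valor,
       COALESCE(MAX(t.vl_anterior), 0) AS vl_anterior
FROM tabposicaoindustria AS a
LEFT JOIN tmp_posicao_ant AS t
       ON t.comprador = a.comprador
      AND t.atv_comprado = a.atv_comprado
WHERE a.data_competencia = @data_atual
GROUP BY a.comprador, a.atv_comprado;
```

The temporary table lives only for the current session, so for a recurring report it may be worth keeping the staged data in a regular table that is refreshed once per month instead.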
