I have this query:
SET @data_atual := '2018-11-30';
SET @data_ant := '2018-10-31';
select a.data_competencia as data, a.comprador, a.gest_comprador, a.atv_comprado as ativo, a.gest_comprado, sum(a.vl_atv_comprado) as valor,
if(sum(c.vl_atv_comprado) is null, 0, sum(c.vl_atv_comprado)) as vl_anterior, b.rent_mtd as rent,
if(sum(c.vl_atv_comprado) is null, sum(a.vl_atv_comprado), sum(c.vl_atv_comprado)) * (1 + b.rent_mtd) as vl_atualizado,
sum(a.vl_atv_comprado) - if(sum(c.vl_atv_comprado) is null, sum(a.vl_atv_comprado), sum(c.vl_atv_comprado)) * (1 + b.rent_mtd) as capt_liq from tabposicaoindustria as a
inner join tabcotasindustria as b on a.atv_comprado = b.fundo and a.data_competencia = b.data
left join tabposicaoindustria as c on a.comprador = c.comprador and a.atv_comprado = c.atv_comprado and c.data_competencia = @data_ant
where a.data_competencia = @data_atual
group by a.data_competencia, a.comprador, a.gest_comprador, a.atv_comprado, a.gest_comprado
order by a.data_competencia, a.comprador, a.gest_comprador, a.atv_comprado, a.gest_comprado
It works fine, but it is too slow. Given that tabposicaoindustria has over 2 million rows, is there a faster way to run it? This one takes more than 6 hours!
Thank you.
I need the same result, only faster.
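First of all, your query is not readable at all. I reformatted it so I could understand it better, but I am still missing key information about what you are trying to achieve (your naming is hard to follow).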
SET @data_atual := '2018-11-30';
SET @data_ant := '2018-10-31';
SELECT a.data_competencia as data,
       a.comprador,
       a.gest_comprador,
       a.atv_comprado as ativo,
       a.gest_comprado,
       sum(a.vl_atv_comprado) as valor,
       if(sum(c.vl_atv_comprado) is null, 0, sum(c.vl_atv_comprado)) as vl_anterior,
       b.rent_mtd as rent,
       if(sum(c.vl_atv_comprado) is null,
          sum(a.vl_atv_comprado),
          sum(c.vl_atv_comprado))
          * (1 + b.rent_mtd) as vl_atualizado,
       sum(a.vl_atv_comprado) -
          if(sum(c.vl_atv_comprado) is null,
             sum(a.vl_atv_comprado),
             sum(c.vl_atv_comprado))
          * (1 + b.rent_mtd) as capt_liq
FROM (tabposicaoindustria as a INNER JOIN tabcotasindustria as b
      on a.atv_comprado = b.fundo and a.data_competencia = b.data)
     LEFT JOIN tabposicaoindustria as c
      on a.comprador = c.comprador and a.atv_comprado = c.atv_comprado and c.data_competencia = @data_ant
WHERE a.data_competencia = @data_atual
GROUP BY a.data_competencia, a.comprador, a.gest_comprador, a.atv_comprado, a.gest_comprado
ORDER BY a.data_competencia, a.comprador, a.gest_comprador, a.atv_comprado, a.gest_comprado
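Now to your question. In cases like this, the cause is almost always that the query does unnecessary extra work; to tell whether that applies here, you would need to explain the business requirement better. The second possibility is that the query is complex enough that the execution plan is not well optimized (in my experience this tends to happen in queries that combine JOIN and GROUP BY): sometimes the order of operations is off.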
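To check which of the two is happening, it may help to look at the plan the optimizer picks. The sketch below assumes MySQL (which the `SET @var :=` syntax suggests) and uses a trimmed select list of my own; the joins, filter, and grouping are the ones from the query above.

```sql
-- Sketch, assuming MySQL. Only the select list is shortened;
-- tables, join conditions, filter and GROUP BY come from the original query.
SET @data_atual := '2018-11-30';
SET @data_ant := '2018-10-31';

EXPLAIN
SELECT a.comprador,
       a.atv_comprado,
       SUM(a.vl_atv_comprado),
       SUM(c.vl_atv_comprado)
FROM tabposicaoindustria AS a
INNER JOIN tabcotasindustria AS b
        ON a.atv_comprado = b.fundo
       AND a.data_competencia = b.data
LEFT JOIN tabposicaoindustria AS c
        ON a.comprador = c.comprador
       AND a.atv_comprado = c.atv_comprado
       AND c.data_competencia = @data_ant
WHERE a.data_competencia = @data_atual
GROUP BY a.data_competencia, a.comprador, a.gest_comprador,
         a.atv_comprado, a.gest_comprado;
```

In the output, a row with `type` = `ALL` and `key` = `NULL` means that table is read with a full scan and no index. If that shows up for `c`, the whole 2-million-row table is probably being rescanned for each row of `a`, which alone could explain a runtime of hours.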
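Without more information, my guess is that because you group by the same columns you join on, the execution order comes out wrong. @Barmar's suggestion to add indexes on the JOIN and GROUP BY columns is certainly correct, but it may not be enough. Another option (on top of the indexes) is to rewrite the query using subqueries, which may lead to a better execution plan.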
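As a rough sketch of both ideas, assuming MySQL: the index names and the aliases `cur` and `prev` are made up for illustration, and I am assuming `tabcotasindustria` has one row per (`fundo`, `data`). Each month is aggregated in its own subquery before joining, so the join works on already-grouped rows.

```sql
-- Hypothetical index names; the columns are the ones used in the WHERE and JOIN clauses.
CREATE INDEX idx_posicao_data_comprador_ativo
    ON tabposicaoindustria (data_competencia, comprador, atv_comprado);
CREATE INDEX idx_cotas_fundo_data
    ON tabcotasindustria (fundo, data);

SET @data_atual := '2018-11-30';
SET @data_ant := '2018-10-31';

SELECT cur.data_competencia AS data,
       cur.comprador,
       cur.gest_comprador,
       cur.atv_comprado AS ativo,
       cur.gest_comprado,
       cur.valor,
       COALESCE(prev.vl_anterior, 0) AS vl_anterior,
       b.rent_mtd AS rent,
       COALESCE(prev.vl_anterior, cur.valor) * (1 + b.rent_mtd) AS vl_atualizado,
       cur.valor
         - COALESCE(prev.vl_anterior, cur.valor) * (1 + b.rent_mtd) AS capt_liq
FROM (
        -- Current month, aggregated before any join.
        SELECT data_competencia, comprador, gest_comprador,
               atv_comprado, gest_comprado,
               SUM(vl_atv_comprado) AS valor
        FROM tabposicaoindustria
        WHERE data_competencia = @data_atual
        GROUP BY data_competencia, comprador, gest_comprador,
                 atv_comprado, gest_comprado
     ) AS cur
INNER JOIN tabcotasindustria AS b
        ON b.fundo = cur.atv_comprado
       AND b.data = cur.data_competencia
LEFT JOIN (
        -- Previous month, one row per (comprador, atv_comprado).
        SELECT comprador, atv_comprado,
               SUM(vl_atv_comprado) AS vl_anterior
        FROM tabposicaoindustria
        WHERE data_competencia = @data_ant
        GROUP BY comprador, atv_comprado
     ) AS prev
        ON prev.comprador = cur.comprador
       AND prev.atv_comprado = cur.atv_comprado
ORDER BY cur.data_competencia, cur.comprador, cur.gest_comprador,
         cur.atv_comprado, cur.gest_comprado;
```

One caveat: this returns the same numbers as the original only when a group has a single row on each date. If a group has several rows, the original multiplies the sums through the join fan-out, while this version does not, so check which behavior you actually want.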
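Also, since you say this is a long-running process that has to run on a large table, another option (if all else fails) is to manually store the intermediate "stages" of the query in temporary tables (or even use a materialized view), in effect "caching" the joined tables.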
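A minimal sketch of the staging idea, assuming MySQL and one row per (`fundo`, `data`) in `tabcotasindustria`; `tmp_posicao_ant` and `idx_tmp_ant` are hypothetical names. The previous month is aggregated once into a temporary table, and the final query joins against that small table instead of the full one.

```sql
SET @data_atual := '2018-11-30';
SET @data_ant := '2018-10-31';

DROP TEMPORARY TABLE IF EXISTS tmp_posicao_ant;

-- Stage 1: cache previous-month totals, one row per (comprador, atv_comprado).
CREATE TEMPORARY TABLE tmp_posicao_ant AS
SELECT comprador, atv_comprado, SUM(vl_atv_comprado) AS vl_anterior
FROM tabposicaoindustria
WHERE data_competencia = @data_ant
GROUP BY comprador, atv_comprado;

-- Index the staged table on the join columns (assumes indexable column types).
ALTER TABLE tmp_posicao_ant ADD INDEX idx_tmp_ant (comprador, atv_comprado);

-- Stage 2: current month joined to the cached stage.
-- MAX() just picks the single staged value / rate per group.
SELECT a.data_competencia AS data,
       a.comprador,
       a.gest_comprador,
       a.atv_comprado AS ativo,
       a.gest_comprado,
       SUM(a.vl_atv_comprado) AS valor,
       COALESCE(MAX(p.vl_anterior), 0) AS vl_anterior,
       MAX(b.rent_mtd) AS rent,
       COALESCE(MAX(p.vl_anterior), SUM(a.vl_atv_comprado))
         * (1 + MAX(b.rent_mtd)) AS vl_atualizado,
       SUM(a.vl_atv_comprado)
         - COALESCE(MAX(p.vl_anterior), SUM(a.vl_atv_comprado))
           * (1 + MAX(b.rent_mtd)) AS capt_liq
FROM tabposicaoindustria AS a
INNER JOIN tabcotasindustria AS b
        ON a.atv_comprado = b.fundo
       AND a.data_competencia = b.data
LEFT JOIN tmp_posicao_ant AS p
        ON p.comprador = a.comprador
       AND p.atv_comprado = a.atv_comprado
WHERE a.data_competencia = @data_atual
GROUP BY a.data_competencia, a.comprador, a.gest_comprador,
         a.atv_comprado, a.gest_comprado
ORDER BY a.data_competencia, a.comprador, a.gest_comprador,
         a.atv_comprado, a.gest_comprado;
```

MySQL has no native materialized views, so the closest equivalent there would be a regular table holding the staged totals that you refresh yourself, for example once per month-end.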