I have three tables. Each table contains more than 3M rows. I run the following code:
SELECT * FROM
(
SELECT col_1, col_2, col_3, [date], 1 as type FROM table_1
UNION
SELECT col_1, col_2, col_3, [date], 2 as type FROM table_2
UNION
SELECT col_1, col_2, col_3, [date], 3 as type FROM table_3
) AS tb
WHERE tb.[date] BETWEEN (start_date) AND (end_date)
ORDER BY [date] DESC OFFSET n ROWS FETCH NEXT m ROWS ONLY
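But the query gets slower as the date interval grows. For example, with the interval 2019-01-01 to 2019-04-01 the query takes about 13-14 seconds.
That is far too slow. I want to get the result within 1 second. What can I do?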
First, use UNION ALL instead of UNION:
SELECT *
FROM (SELECT col_1, col_2, col_3, [date], 1 as type FROM table_1
UNION ALL
SELECT col_1, col_2, col_3, [date], 2 as type FROM table_2
UNION ALL
SELECT col_1, col_2, col_3, [date], 3 as type FROM table_3
) AS tb
WHERE tb.[date] BETWEEN (start_date) AND (end_date)
ORDER BY [date] DESC
OFFSET n ROWS FETCH NEXT m ROWS ONLY;
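UNION incurs the overhead of removing duplicates. UNION ALL does not incur this overhead. Because each branch carries a different type value, rows from different tables can never be duplicates of one another anyway.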
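In addition, an index on [date] in each table should help. SQL Server has a good optimizer and will usually push these conditions down into the individual queries of the UNION / UNION ALL subquery.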
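I suggest creating a covering index on each table, something like:
CREATE INDEX ix1 ON table_1 ([date]) INCLUDE (col_1, col_2, col_3)
This should help the WHERE clause. In addition, SQL Server does not have to touch the table itself, because all the required information is present in the index.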
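For completeness, the other two tables would need the same kind of index. A minimal sketch, assuming the column names from the question; the index names ix2 and ix3 are hypothetical and only follow the ix1 pattern:

```sql
-- Covering indexes for the remaining two tables.
-- [date] is the key column used for filtering and ordering;
-- the INCLUDE columns let the query be answered from the index alone.
CREATE INDEX ix2 ON table_2 ([date]) INCLUDE (col_1, col_2, col_3);
CREATE INDEX ix3 ON table_3 ([date]) INCLUDE (col_1, col_2, col_3);
```

Comparing the execution plan before and after creating the indexes is the simplest way to check whether they are actually used.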
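Here is another attempt at this. Assuming OFFSET n ROWS FETCH NEXT m ROWS ONLY selects only a small fraction of the rows between the start and end dates, write the query like this: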
WITH cte1 AS (
    -- find the date of the first row past the n + m window
    SELECT [date]
    FROM (
        SELECT [date] FROM table_1 UNION ALL
        SELECT [date] FROM table_2 UNION ALL
        SELECT [date] FROM table_3
    ) AS x
    WHERE [date] BETWEEN '2019-01-01' AND '2019-04-01'
    ORDER BY [date] DESC OFFSET (n + m) ROWS FETCH NEXT 1 ROW ONLY
), cte2 AS (
    SELECT [date], col_1, col_2, col_3, 1 AS type FROM table_1 UNION ALL
    SELECT [date], col_1, col_2, col_3, 2 AS type FROM table_2 UNION ALL
    SELECT [date], col_1, col_2, col_3, 3 AS type FROM table_3
)
SELECT *
FROM cte2
-- >= keeps rows that tie with the cutoff date; COALESCE falls back to the
-- start date when fewer than n + m + 1 rows match and cte1 is empty
WHERE [date] <= '2019-04-01'
  AND [date] >= COALESCE((SELECT [date] FROM cte1), '2019-01-01')
ORDER BY [date] DESC OFFSET n ROWS FETCH NEXT m ROWS ONLY
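I'm not sure whether the query planner is smart enough to use the WHERE clause outside the union to limit what goes into the union, so try moving the date condition into each query inside the union. That way you are not unioning the three entire tables before the condition is applied: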
SELECT * FROM
(
SELECT col_1, col_2, col_3, [date], 1 as type FROM table_1 where table_1.[date] between (start_date) and (end_date)
UNION
SELECT col_1, col_2, col_3, [date], 2 as type FROM table_2 where table_2.[date] between (start_date) and (end_date)
UNION
SELECT col_1, col_2, col_3, [date], 3 as type FROM table_3 where table_3.[date] between (start_date) and (end_date)
) AS tb
ORDER BY [date] DESC OFFSET n ROWS FETCH NEXT m ROWS ONLY
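Going one step further than filtering inside each branch, each branch can also be limited to its own newest n + m rows, since no row outside a table's top n + m can appear in the overall top n + m. The following is a sketch of that idea, not something taken from the answers above; the variables @start_date, @end_date, @n and @m and their values are placeholders chosen for illustration:

```sql
-- Hypothetical parameter values; replace with your own.
DECLARE @start_date date = '2019-01-01',
        @end_date   date = '2019-04-01',
        @n int = 0,    -- rows to skip
        @m int = 50;   -- page size

SELECT col_1, col_2, col_3, [date], type
FROM (
    -- Each branch returns at most @n + @m rows, newest first.
    SELECT * FROM (
        SELECT TOP (@n + @m) col_1, col_2, col_3, [date], 1 AS type
        FROM table_1
        WHERE [date] BETWEEN @start_date AND @end_date
        ORDER BY [date] DESC
    ) AS t1
    UNION ALL
    SELECT * FROM (
        SELECT TOP (@n + @m) col_1, col_2, col_3, [date], 2 AS type
        FROM table_2
        WHERE [date] BETWEEN @start_date AND @end_date
        ORDER BY [date] DESC
    ) AS t2
    UNION ALL
    SELECT * FROM (
        SELECT TOP (@n + @m) col_1, col_2, col_3, [date], 3 AS type
        FROM table_3
        WHERE [date] BETWEEN @start_date AND @end_date
        ORDER BY [date] DESC
    ) AS t3
) AS tb
-- The final sort only has to handle at most 3 * (@n + @m) rows.
ORDER BY [date] DESC
OFFSET @n ROWS FETCH NEXT @m ROWS ONLY;
```

With an index on [date] in each table, every branch can likely read just the newest rows in the range instead of the whole range. As in the original query, rows with equal [date] values have no guaranteed order, so adding a tie-breaking column to each ORDER BY would be needed for stable paging. The benefit shrinks as @n grows, because each branch must still produce @n + @m rows.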