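I have a query in SQL Server that returns millions of rows, and I need to put its results into a new table. Using
INTO dbo.new_table
does the job, but once the row count reaches the millions it takes hours.
Is there a way to batch or loop the data load done by
SELECT INTO dbo.new_table
?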
--15 million records returned
--40 columns
SELECT
col,
col,
col,
col_n
INTO dbo.new_table
FROM tbl LEFT JOIN tbl_a LEFT JOIN tbl_b LEFT JOIN tbl_n -- ON clauses omitted
ORDER BY 1
--would it be possible to FETCH/LIMIT batch while using INTO?
STATISTICS IO (for a run of 100 rows; the full 15 million rows take nearly 2 hours)
Total logical reads = 2,641
(100 rows affected)
Table 'one'. Scan count 1, logical reads 300, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'two'. Scan count 1, logical reads 300, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'three'. Scan count 1, logical reads 300, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'four'. Scan count 1, logical reads 300, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'five'. Scan count 0, logical reads 415, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'six'. Scan count 1, logical reads 300, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'seven'. Scan count 0, logical reads 415, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'eight'. Scan count 1, logical reads 300, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'nine'. Scan count 1, logical reads 11, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SET STATISTICS TIME:
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 2 ms.
SQL Server parse and compile time:
CPU time = 12 ms, elapsed time = 12 ms.
(100 rows affected)
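First check the execution plan for anything that can be improved, for example by creating an index.
If you still want to insert in batches with a loop, you can try the example below: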
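-- dbo.new_table must already exist: INSERT INTO, unlike SELECT ... INTO, does not create it
DECLARE @id_control INT
DECLARE @batchSize INT
DECLARE @results INT
SET @results = 1
SET @batchSize = 1000000
SET @id_control = 0
WHILE (@results > 0)
BEGIN
-- put your custom code here
INSERT INTO dbo.new_table
SELECT
col,
col,
col,
col_n
FROM tbl LEFT JOIN tbl_a LEFT JOIN tbl_b LEFT JOIN tbl_n -- ON clauses omitted, as in the question
-- both bounds are needed; with only the upper bound every batch re-inserts the earlier rows
WHERE idcol > @id_control
AND idcol <= @id_control + @batchSize
ORDER BY idcol
-- very important to obtain the latest rowcount to avoid infinite loops
-- note: the loop stops at the first empty range, so a gap in idcol wider than @batchSize ends it early
SET @results = @@ROWCOUNT
-- next batch
SET @id_control = @id_control + @batchSize
END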
Replace
idcol
with the name of a unique column that you can use to order the data.
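
If the values in idcol have gaps, a loop that stops on the first empty batch can end before all rows are copied. One possible variation, sketched below, drives the loop from the minimum and maximum key instead of from @@ROWCOUNT. The temp tables #src and #dst, the column val, and the variables @lastId and @maxId are made up for this illustration and are not part of the question; the inline DECLARE initialisation and multi-row VALUES assume SQL Server 2008 or later.

```sql
-- Hypothetical demo tables; names and columns are illustrative only.
CREATE TABLE #src (idcol INT PRIMARY KEY, val VARCHAR(10));
CREATE TABLE #dst (idcol INT PRIMARY KEY, val VARCHAR(10));

-- Sample rows with a gap in idcol that is wider than the batch size.
INSERT INTO #src (idcol, val)
VALUES (1, 'a'), (2, 'b'), (3, 'c'), (50, 'd'), (51, 'e');

DECLARE @batchSize INT = 10;     -- width of the idcol range per batch
DECLARE @lastId INT, @maxId INT; -- upper bound of the previous batch, largest key

-- Start just below the smallest key and stop once the largest key is covered.
SELECT @lastId = MIN(idcol) - 1, @maxId = MAX(idcol) FROM #src;

WHILE (@lastId < @maxId)
BEGIN
    INSERT INTO #dst (idcol, val)
    SELECT idcol, val
    FROM #src
    WHERE idcol > @lastId
      AND idcol <= @lastId + @batchSize;

    -- Advance by key range, not by row count, so empty ranges are simply skipped.
    SET @lastId = @lastId + @batchSize;
END

-- All five sample rows should now be in #dst.
SELECT COUNT(*) AS copied_rows FROM #dst;

DROP TABLE #src;
DROP TABLE #dst;
```

In the real query the SELECT would be the multi-join statement from the question, filtered on whichever unique column stands in for idcol. Whether batching is actually faster than a single SELECT ... INTO depends on the execution plan, indexing, and logging, so it is worth timing both on a subset first.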