Exit a DuckDB batch script on OOM

Question

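I have a batch DuckDB script that sets a custom memory_limit. However, when a statement hits an OOM error, DuckDB does not exit; it carries on and tries the remaining statements in the script (which either all fail or are meaningless because the earlier statement failed). This leaves very confusing messages in my log file, and if it fails on, say, an UPDATE command but then continues with the rest of the file, it could produce incorrect results.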

Is there a way to force DuckDB to exit immediately when a statement fails (such as the OOM above)?

Example script:

SET memory_limit = '100GB';
SET threads TO 64;

CREATE TABLE summary AS
  SELECT
    point_id,
    COUNT(pred_occ) AS ensemble_support,
    AVG(pred_range) AS pat_mean,
    QUANTILE_CONT(pred_occ, [0.1, 0.5, 0.9]) AS occurrence_quantiles,
    QUANTILE_CONT(pred_count, [0.1, 0.5, 0.9]) AS count_quantiles,
    QUANTILE_CONT(pred_occ * pred_count, [0.1, 0.5, 0.9]) AS abundance_quantiles
  FROM predictions
  GROUP BY point_id;

CREATE TABLE erd AS
  SELECT
    CAST(SPLIT_PART(point_id, '-', 2) AS BIGINT) AS checklist_id,
    SPLIT_PART(point_id, '-', 1) AS type,
    ensemble_support, pat_mean,
    occurrence_quantiles[2] AS occurrence_median,
    occurrence_quantiles[1] AS occurrence_lower,
    occurrence_quantiles[3] AS occurrence_upper,
    count_quantiles[2] AS count_median,
    count_quantiles[1] AS count_lower,
    count_quantiles[3] AS count_upper,
    abundance_quantiles[2] AS abundance_median,
    abundance_quantiles[1] AS abundance_lower,
    abundance_quantiles[3] AS abundance_upper
  FROM summary
  WHERE point_id LIKE 'test-%';

COPY (
  SELECT
    s.*,
    e.latitude, e.longitude, e.year, e.day_of_year, e.observer_id
  FROM erd as s
  INNER JOIN '{input_erd_pq}' as e
    ON s.checklist_id = e.checklist_id
) TO '{predictions_erd_pq}' (FORMAT 'parquet');

COPY (
  SELECT
    CAST(SPLIT_PART(point_id, '-', 2) AS BIGINT) AS srd_id,
    CAST(NULLIF(SPLIT_PART(point_id, '-', 3), '') AS INTEGER) AS day_of_year,
    ensemble_support, pat_mean,
    occurrence_quantiles[2] AS occurrence_median,
    occurrence_quantiles[1] AS occurrence_lower,
    occurrence_quantiles[3] AS occurrence_upper,
    count_quantiles[2] AS count_median,
    count_quantiles[1] AS count_lower,
    count_quantiles[3] AS count_upper,
    abundance_quantiles[2] AS abundance_median,
    abundance_quantiles[1] AS abundance_lower,
    abundance_quantiles[3] AS abundance_upper
  FROM summary
  WHERE point_id LIKE 'srd-%') TO '{predictions_srd_pq}' (FORMAT 'parquet');

Error log:

Out of Memory Error: Failed to allocate block of 2048 bytes (bad allocation)
Catalog Error: Table with name summary does not exist!
Did you mean "temp.information_schema.schemata"?
LINE 15:   FROM summary
                ^
Catalog Error: Table with name erd does not exist!
Did you mean "temp.information_schema.tables"?
LINE 5:   FROM erd as s
               ^
Catalog Error: Table with name summary does not exist!
Did you mean "temp.information_schema.schemata"?
LINE 15:   FROM summary
                ^

It looks to me as though it tried to run all 4 commands, even though each preceding one had failed!

Answer 1

Use the -bail flag of the duckdb CLI.

$ duckdb -help 2>&1 | grep bail
   -bail                stop after hitting an error

Tested on DuckDB 1.0.0, using a.sql:

set memory_limit = '1mb';

select 1;

select * from range(10000) a, range(10000) b order by a.range + b.range;

select 2;

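(-csv is used only to keep the output terse)

Without -bail, the script keeps going after the OOM error and still runs select 2; with -bail, it stops at the first error: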

$ duckdb -csv < a.sql
1
1
Out of Memory Error: could not allocate block of size 256.0 KiB (784.0 KiB/976.5 KiB used)
2
2
$ duckdb -csv -bail < a.sql
1
1
Out of Memory Error: could not allocate block of size 256.0 KiB (784.0 KiB/976.5 KiB used)
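
For a batch job it may also help to let the calling shell script react to the failure. The following is a minimal sketch, assuming the CLI returns a non-zero exit status when -bail stops the run (worth verifying on your DuckDB version). The function name run_batch and the DUCKDB_BIN override are hypothetical names chosen for illustration; they are not part of DuckDB.

```shell
#!/bin/sh
# Sketch: run a SQL batch with -bail and pass any failure on to the caller.
# run_batch and DUCKDB_BIN are illustrative names, not part of DuckDB.

run_batch() {
    sql_file=$1
    # -bail makes the CLI stop after the first failing statement.
    # Assumption: the CLI then exits with a non-zero status.
    if "${DUCKDB_BIN:-duckdb}" -bail < "$sql_file"; then
        echo "batch succeeded: $sql_file"
    else
        status=$?
        echo "batch failed (exit status $status): $sql_file" >&2
        return "$status"
    fi
}
```

A caller could then write run_batch a.sql || exit 1, so that later pipeline steps do not run against missing or half-built tables.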
