I have a huge single-column table with engine=Log:
SELECT * FROM addresses_tmp LIMIT 5
┌─address──────────────────────────────────┐
1. │ 18a0a8bdcbd1fec1785224cfc486ccf02dc3ef5d │
2. │ 3ca0a8d9744b229f81fae2f59892b546c20a744e │
3. │ 4456058ebd1ae161348b5aae51d86aef423513a6 │
4. │ a3230a93a31f924a2713af72733d522873434025 │
5. │ 4960323c0fbd63ae068ea313c67bb2a3bc133baf │
└──────────────────────────────────────────┘
I tried to insert it into a ReplacingMergeTree table:
create table addresses engine=ReplacingMergeTree() primary key address as select row_number() over() as id, * from (select * from addresses_tmp)
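But it fails with a memory error:
Code: 241. DB::Exception: Received from localhost:9000. DB::Exception: Memory limit (total) exceeded: would use 27.89 GiB (attempt to allocate chunk of 5248943 bytes), maximum: 27.86 GiB. OvercommitTracker decision: Query was selected to stop by OvercommitTracker.:
How else can I convert the table to MergeTree or ReplacingMergeTree and deduplicate it?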
CREATE TABLE addresses (address String)
ENGINE = ReplacingMergeTree ORDER BY address;
INSERT INTO addresses SELECT * FROM addresses_tmp;
OPTIMIZE TABLE addresses FINAL DEDUPLICATE;
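If the plain INSERT ... SELECT still runs into the memory limit, the copy can be split into smaller passes. The following is only a sketch under assumptions: the bucket count of 4 and the setting values are illustrative and not taken from the question, and the likely culprit in the original query (the row_number() over () step, which is simply left out here) is a guess that the error message does not confirm.

```sql
-- Sketch: use these passes INSTEAD OF the single INSERT above, not in addition to it.
-- Each pass copies one hash bucket of addresses_tmp. The bucket count (4) is an
-- assumption; run the statement once per bucket value 0, 1, 2, 3.
INSERT INTO addresses
SELECT address
FROM addresses_tmp
WHERE cityHash64(address) % 4 = 0
SETTINGS max_threads = 2, max_insert_threads = 1;  -- illustrative values to reduce parallelism

-- After OPTIMIZE ... FINAL has finished, compare the stored row count with the
-- deduplicated view of the table. Equal numbers mean no duplicate addresses remain.
SELECT count() AS rows_stored FROM addresses;
SELECT count() AS rows_after_final FROM addresses FINAL;
```

Each pass rescans the whole Log table, so this trades extra read time for a smaller per-query footprint. ReplacingMergeTree only collapses duplicates during merges, which is why the check is meaningful only after the OPTIMIZE step.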