I've been using the LISTAGG function in Snowflake to concatenate strings, and it triggered the following warning:
100078 (22000): String '(LISTAGG result)' is too long and would be truncated
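As I understand it, this warning is triggered when the aggregated string exceeds a certain length. I'd like to know the best practice for preventing or handling it, since truncation is fine here and does not affect the quality of the column. Should I truncate the result beforehand? If so, how?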
SELECT
    userid,
    NULLIF(LISTAGG(DISTINCT city, ', '), '') AS cities,
    NULLIF(LISTAGG(DISTINCT region, ', '), '') AS regions,
    ...
FROM {{ ref('myschema.table_T') }}
GROUP BY userid
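Because the aggregated string hits the size limit, the LISTAGG function cannot be used here. Instead, you can create a JavaScript user-defined table function that acts as an aggregate and collects the distinct values into an array: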
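create or replace function full_array_agg(g string, s string)
returns table (G string, S array)
language javascript
as $$
{
    // Called once per partition: reset the accumulator state.
    initialize: function(argumentInfo, context) {
        this.counter = 0;
        this.arr = [];
    },
    // Called for every input row: keep each distinct value of S once.
    processRow: function f(row, rowWriter, context) {
        if (this.arr.indexOf(row.S) === -1) {
            this.arr.push(row.S);
        }
        this.group = row.G;
        this.counter++;  // rows seen so far; not used in the output
    },
    // Called once after the last row: emit a single row holding the array.
    finalize: function(rowWriter, context) {
        rowWriter.writeRow({G: this.group, S: this.arr});
    }
}
$$;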
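To see what the function's callbacks do without running anything in Snowflake, here is a minimal local sketch in plain JavaScript. It assumes the usual UDTF lifecycle for one partition (initialize once, processRow for each row, finalize once). The `runPartition` helper, the mock `rowWriter`, and the sample rows are hypothetical and exist only for illustration; they are not part of any Snowflake API.

```javascript
// Same callback logic as the function body above, as a plain object.
const handler = {
  initialize: function (argumentInfo, context) {
    this.arr = [];
  },
  processRow: function (row, rowWriter, context) {
    // Keep each distinct value of S only once.
    if (this.arr.indexOf(row.S) === -1) {
      this.arr.push(row.S);
    }
    this.group = row.G;
  },
  finalize: function (rowWriter, context) {
    rowWriter.writeRow({ G: this.group, S: this.arr });
  },
};

// Hypothetical harness (an assumption, not a Snowflake API):
// drives the callbacks for a single partition and collects the output rows.
function runPartition(h, rows) {
  const out = [];
  const rowWriter = { writeRow: (r) => out.push(r) };
  h.initialize({}, {});
  for (const row of rows) {
    h.processRow(row, rowWriter, {});
  }
  h.finalize(rowWriter, {});
  return out;
}

// Made-up sample rows for one userid.
const result = runPartition(handler, [
  { G: "42", S: "Paris" },
  { G: "42", S: "Lyon" },
  { G: "42", S: "Paris" },
]);
console.log(JSON.stringify(result));
```

Each partition yields a single row whose S column is an array of distinct values, so no long concatenated string is built at this stage.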
You can then call full_array_agg like this:
select cities.g as userid, cities.s as cities
from mytable
    , table(full_array_agg(
        userid::string,
        city) over(partition by userid)) cities;
Inspired by this answer: