How do I extract values from delimited data in DB2?


I have this table:

BILL_TYPE                        + BILL_DATE                           + BILL_STATUS
---------------------------------+-------------------------------------+--------------------------------------
EXPECTED]PAYMENT]PAYMENT^PAYMENT + 20230901]20230908]20230915^20230915 + SETTLED]CAPITALISE]CAPITALISE^SETTLED

Each value delimited by ] or ^ should become one field of its own row, as shown in the table below:

BILL_TYPE        + BILL_DATE + BILL_STATUS
-----------------+-----------+--------------
EXPECTED           20230901    SETTLED
PAYMENT            20230908    CAPITALISE
PAYMENT            20230915    CAPITALISE
PAYMENT            20230915    SETTLED

I tried the following XMLTABLE function, using ] and ^ as delimiters, to split the values in each table cell.

XMLTABLE('$doc/items/item'
              PASSING XMLPARSE(DOCUMENT CAST('<items><item><value>'||REPLACE(column_name,']','</value></item><item><value>')||'</value></item></items>' as CLOB)) as "doc"
              COLUMNS
              ITEM VARCHAR(255) PATH 'value'
          ) AS new_column_name

The result looks like this:

BILL TYPE    + BILL DATE + BILL_STATUS
-------------+-----------+--------------
20230901       EXPECTED    SETTLED
20230901       EXPECTED    SETTLED
20230901       EXPECTED    SETTLED
20230901       EXPECTED    SETTLED
20230901       PAYMENT     SETTLED
20230901       PAYMENT     SETTLED
20230901       PAYMENT     SETTLED
20230901       PAYMENT     SETTLED
20230908       EXPECTED    SETTLED
20230908       EXPECTED    SETTLED
20230908       EXPECTED    SETTLED
20230908       EXPECTED    SETTLED
20230908       PAYMENT     SETTLED
20230908       PAYMENT     SETTLED
20230908       PAYMENT     SETTLED
20230908       PAYMENT     SETTLED
20230915       EXPECTED    SETTLED
20230915       EXPECTED    SETTLED
20230915       EXPECTED    SETTLED
20230915       EXPECTED    SETTLED
20230915       PAYMENT     SETTLED
20230915       PAYMENT     SETTLED
20230915       PAYMENT     SETTLED
20230915       PAYMENT     SETTLED
20230901       EXPECTED    CAPITALISE
20230901       EXPECTED    CAPITALISE
20230901       EXPECTED    CAPITALISE
20230901       EXPECTED    CAPITALISE
20230901       PAYMENT     CAPITALISE
20230901       PAYMENT     CAPITALISE
20230901       PAYMENT     CAPITALISE
20230901       PAYMENT     CAPITALISE
20230908       EXPECTED    CAPITALISE
20230908       EXPECTED    CAPITALISE
20230908       EXPECTED    CAPITALISE
20230908       EXPECTED    CAPITALISE
20230908       PAYMENT     CAPITALISE
20230908       PAYMENT     CAPITALISE
20230908       PAYMENT     CAPITALISE
20230908       PAYMENT     CAPITALISE
20230915       EXPECTED    CAPITALISE
20230915       EXPECTED    CAPITALISE
20230915       EXPECTED    CAPITALISE
20230915       EXPECTED    CAPITALISE
20230915       PAYMENT     CAPITALISE
20230915       PAYMENT     CAPITALISE
20230915       PAYMENT     CAPITALISE
20230915       PAYMENT     CAPITALISE

The result looks as if I had used a CROSS JOIN, which is not what I want; my expected result has only 4 rows:

BILL_TYPE        + BILL_DATE + BILL_STATUS
-----------------+-----------+--------------
EXPECTED           20230901    SETTLED
PAYMENT            20230908    CAPITALISE
PAYMENT            20230915    CAPITALISE
PAYMENT            20230915    SETTLED

What function should I use? I think I need some kind of flag, but I don't know which function to use.

Here is my dbfiddle: https://dbfiddle.uk/tM5LebOM

sql db2
1 answer

Try this:

    with data (s) as (values
        ('EXPECTED]PAYMENT]PAYMENT^PAYMENT + 20230901]20230908]' concat
         '20230915^20230915 + SETTLED]CAPITALISE]CAPITALISE^SETTLED')
    ),
    split (block, posi, val) as (
        select b,
               p,
               -- inner call picks the b-th block, outer call the p-th value inside it
               regexp_substr(
                   regexp_substr(s, '[^ +]+', 1, b),
                   '\w+', 1, p)
        from   data
        cross  join (values (1),(2),(3),(4)) x (b)
        cross  join (values (1),(2),(3),(4),(5)) y (p)
        -- 3 blocks (one per column), separated by ' + '
        where  b <= regexp_count(s, '[^ +]+')
        -- 4 values within one block, separated by ] or ^
        and    p <= regexp_count(regexp_substr(s, '[^ +]+', 1, b), '\w+')
    )
    -- one output row per position; the values of a row are ordered by block
    select listagg(val, ' ') within group (order by posi, block)
    from   split
    group  by posi
    order  by posi
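
If you would rather stay with XMLTABLE, the "flag" you are looking for is probably the position of each value within its cell. The following is only a sketch, not tested against your fiddle: XMLTABLE can emit that position through a FOR ORDINALITY column, and joining the three splits on it should avoid the cross-join effect. The names BILLS, SEQ and ITEM are placeholders I made up, and the sketch assumes the data contains no characters that are special in XML (such as & or <).

```sql
-- Sketch only: BILLS stands in for the real table; SEQ and ITEM are illustrative names.
with bills (bill_type, bill_date, bill_status) as (values
    ('EXPECTED]PAYMENT]PAYMENT^PAYMENT',
     '20230901]20230908]20230915^20230915',
     'SETTLED]CAPITALISE]CAPITALISE^SETTLED')
)
select t.item as bill_type,
       d.item as bill_date,
       s.item as bill_status
from   bills b,
       -- one XMLTABLE per column; '^' is first mapped to ']' so both delimiters split
       xmltable('$doc/items/item'
           passing xmlparse(document cast(
               '<items><item><value>' ||
               replace(replace(b.bill_type, '^', ']'), ']', '</value></item><item><value>') ||
               '</value></item></items>' as clob)) as "doc"
           columns seq  for ordinality,          -- position of the value within the cell
                   item varchar(255) path 'value') as t,
       xmltable('$doc/items/item'
           passing xmlparse(document cast(
               '<items><item><value>' ||
               replace(replace(b.bill_date, '^', ']'), ']', '</value></item><item><value>') ||
               '</value></item></items>' as clob)) as "doc"
           columns seq  for ordinality,
                   item varchar(255) path 'value') as d,
       xmltable('$doc/items/item'
           passing xmlparse(document cast(
               '<items><item><value>' ||
               replace(replace(b.bill_status, '^', ']'), ']', '</value></item><item><value>') ||
               '</value></item></items>' as clob)) as "doc"
           columns seq  for ordinality,
                   item varchar(255) path 'value') as s
-- keep only combinations where all three values share the same position
where  d.seq = t.seq
and    s.seq = t.seq
order  by t.seq
```

Unlike the LISTAGG query, this keeps BILL_TYPE, BILL_DATE and BILL_STATUS as three separate columns. With more than one source row you would also want a row key in the ORDER BY so that the rows of each bill stay together.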