Oracle SQL regexp_substr: non-capturing / optional groups

The expression:

Reassigning definition: (\d+) from: \[(\d+)\] to: \[(\d+)\].+?\.(?: Target definition = (\d+))?.*

correctly produces the following matches:

Group 1.    24-30   494801
Group 2.    38-45   8280955
Group 3.    52-59   8336297
Group 4.    103-109 494767

for the input string:

Reassigning definition: 494801 from: [8280955] to: [8336297], advancing due dates. Target definition = 494767.

and the first three matches for the input string:

Reassigning definition: 494801 from: [8280955] to: [8336297], advancing due dates.

in the JavaScript, Python, PHP, and Golang flavors (see https://regex101.com/r/Br66wm/3), but not with Oracle SQL regexp_substr:

with
  input_string as
  (
    select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates. Target definition = 494767.' as test_string from dual
    union all
    select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates.' as test_string from dual
   ),
   pattern_string as
   (
     select 'Reassigning definition: (\d+) from: \[(\d+)\] to: \[(\d+)\].+?\.(?: Target definition = (\d+))?.*$' as pattern_string from dual
   )
select
  regexp_substr(i.test_string, p.pattern_string, 1, 1, null, 1) as group_1,
  regexp_substr(i.test_string, p.pattern_string, 1, 1, null, 2) as group_2,
  regexp_substr(i.test_string, p.pattern_string, 1, 1, null, 3) as group_3,
  regexp_substr(i.test_string, p.pattern_string, 1, 1, null, 4) as group_4
from
  input_string i, pattern_string p;

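The fourth group is always null. What is wrong with my use of the non-capturing group? Essentially, the following sentence is optional in my input test strings: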

 Target definition = 494767.
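
For comparison, here is a minimal Python sketch of the behaviour reported for the other flavors. It only illustrates how a Perl-style engine treats the optional `(?: ... )?` group and says nothing about Oracle's engine, which returned null for group 4 in the query above. The names `PATTERN` and `extract_groups` are made up for this illustration.

```python
import re

# Same pattern as in the question; (?: ... )? is an optional non-capturing group.
PATTERN = re.compile(
    r"Reassigning definition: (\d+) from: \[(\d+)\] to: \[(\d+)\].+?\."
    r"(?: Target definition = (\d+))?.*$"
)


def extract_groups(text):
    """Return the four captured groups as a tuple, or None if nothing matches."""
    match = PATTERN.match(text)
    return match.groups() if match else None


with_target = (
    "Reassigning definition: 494801 from: [8280955] to: [8336297], "
    "advancing dates. Target definition = 494767."
)
without_target = (
    "Reassigning definition: 494801 from: [8280955] to: [8336297], "
    "advancing dates."
)

print(extract_groups(with_target))     # fourth element is '494767'
print(extract_groups(without_target))  # fourth element is None
```

In this engine the optional group is simply skipped when the trailing sentence is missing, so the fourth capture comes back empty instead of the whole match failing.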

2 answers

1 vote

This is a bit too long for a comment, so I am posting it here. If it turns out not to be useful, I will delete it.

If you are always just looking for the digits in these strings (regardless of what surrounds them), the query can be simplified to:

with
  input_string as
  (
    select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates. Target definition = 494767.' as test_string from dual
    union all
    select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates.' as test_string from dual
   )
select regexp_substr(test_string, '\d+', 1, 1) grp1,
       regexp_substr(test_string, '\d+', 1, 2) grp2,
       regexp_substr(test_string, '\d+', 1, 3) grp3,
       regexp_substr(test_string, '\d+', 1, 4) grp4
from input_string;

GRP1       GRP2       GRP3       GRP4
---------- ---------- ---------- ----------
494801     8280955    8336297    494767
494801     8280955    8336297
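
The occurrence-based idea is not specific to Oracle. As a rough illustration only, the same lookup can be sketched in Python; the helper name `nth_number` is hypothetical and chosen just for this example.

```python
import re


def nth_number(text, occurrence):
    """Return the occurrence-th (1-based) run of digits in text, or None."""
    numbers = re.findall(r"\d+", text)
    if 1 <= occurrence <= len(numbers):
        return numbers[occurrence - 1]
    return None


rows = [
    "Reassigning definition: 494801 from: [8280955] to: [8336297], "
    "advancing dates. Target definition = 494767.",
    "Reassigning definition: 494801 from: [8280955] to: [8336297], "
    "advancing dates.",
]

for row in rows:
    # One "column" per occurrence, mirroring grp1..grp4 in the query.
    print([nth_number(row, n) for n in range(1, 5)])
```

Note the trade-off: counting digit runs ignores the surrounding text, so any extra number in the free-text part of the message would shift the later positions.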

Alternatively, here is an option that does not assume a fixed number of groups (although the layout differs from what you wanted):

with
  input_string as
  (
    select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates. Target definition = 494767.' as test_string from dual
    union all
    select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates.' as test_string from dual
   )
select column_value grp_rn,
       regexp_substr(test_string, '\d+', 1, column_value) grp
from input_string cross join
  table(cast(multiset(select level from dual
                      connect by level <= regexp_count(test_string, '\d+')
                     ) as sys.odcinumberlist));

 GRP_RN GRP
------- ----------
      1 494801
      2 8280955
      3 8336297
      4 494767
      1 494801
      2 8280955
      3 8336297

7 rows selected.

0 votes

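Since the POSIX-based regular expression implementation does not seem to support non-capturing groups, and the capturing groups of regexp_substr are not readily available as separate columns, I settled on the following, which essentially uses a separate regular expression for the optional group.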

with
  input_string as
  (
    select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates. Target definition = 494767.' as test_string from dual
    union all
    select 'Reassigning definition: 494767 from: [8336297] to: [8369944], advancing dates.' as test_string from dual
   ),
   pattern_string as
   (
     select 'Reassigning definition: (\d+) from: \[(\d+)\] to: \[(\d+)\]' as pattern_string from dual
   )
select
  regexp_substr(i.test_string, p.pattern_string, 1, 1, null, 1) as group_1,
  regexp_substr(i.test_string, p.pattern_string, 1, 1, null, 2) as group_2,
  regexp_substr(i.test_string, p.pattern_string, 1, 1, null, 3) as group_3,
  regexp_substr(i.test_string, 'Target definition = (\d+)', 1, 1, null, 1) as group_4
from
  input_string i, pattern_string p;
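
The same split can be sketched outside the database. The Python snippet below is only an illustration of the two-pattern idea under the assumption that the mandatory part and the optional sentence can be searched for independently; `HEAD`, `TARGET`, and `split_extract` are hypothetical names.

```python
import re

# Mandatory part: the three numbers that are always present.
HEAD = re.compile(r"Reassigning definition: (\d+) from: \[(\d+)\] to: \[(\d+)\]")
# Optional part: searched for on its own, so no optional group is needed.
TARGET = re.compile(r"Target definition = (\d+)")


def split_extract(text):
    """Return (group_1, group_2, group_3, group_4); missing values are None."""
    head = HEAD.search(text)
    target = TARGET.search(text)
    first_three = head.groups() if head else (None, None, None)
    fourth = target.group(1) if target else None
    return first_three + (fourth,)


print(split_extract(
    "Reassigning definition: 494801 from: [8280955] to: [8336297], "
    "advancing dates. Target definition = 494767."
))
print(split_extract(
    "Reassigning definition: 494767 from: [8336297] to: [8369944], "
    "advancing dates."
))
```

Because each pattern is matched separately, the approach does not rely on non-capturing or optional groups at all, at the cost of no longer checking that the optional sentence follows the mandatory part.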