The expression:
Reassigning definition: (\d+) from: \[(\d+)\] to: \[(\d+)\].+?\.(?: Target definition = (\d+))?.*
correctly produces the following matches:
Group 1. 24-30 494801
Group 2. 38-45 8280955
Group 3. 52-59 8336297
Group 4. 103-109 494767
for the input string:
Reassigning definition: 494801 from: [8280955] to: [8336297], advancing due dates. Target definition = 494767.
and the first 3 matches for the input string:
Reassigning definition: 494801 from: [8280955] to: [8336297], advancing due dates.
with the JavaScript, Python, PHP, and Golang flavors (see https://regex101.com/r/Br66wm/3), but not with SQL regexp_substr:
with
input_string as
(
select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates. Target definition = 494767.' as test_string from dual
union all
select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates.' as test_string from dual
),
pattern_string as
(
select 'Reassigning definition: (\d+) from: \[(\d+)\] to: \[(\d+)\].+?\.(?: Target definition = (\d+))?.*$' as pattern_string from dual
)
select
regexp_substr(i.test_string, p.pattern_string, 1, 1, null, 1) as group_1,
regexp_substr(i.test_string, p.pattern_string, 1, 1, null, 2) as group_2,
regexp_substr(i.test_string, p.pattern_string, 1, 1, null, 3) as group_3,
regexp_substr(i.test_string, p.pattern_string, 1, 1, null, 4) as group_4
from
input_string i, pattern_string p;
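The fourth group is always null. What is wrong with my use of the non-capturing group? Basically, the following sentence is optional in my input test strings: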
Target definition = 494767.
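For reference, the claim that the pattern behaves as intended in the Python flavor can be checked with a short sketch. It only exercises Python's re module and says nothing about how Oracle's engine treats the same pattern; the helper name extract is hypothetical and used for illustration only.

```python
import re

# Pattern copied unchanged from the question; "(?: ... )?" is the optional
# non-capturing group wrapped around the fourth capture group.
PATTERN = re.compile(
    r"Reassigning definition: (\d+) from: \[(\d+)\] to: \[(\d+)\].+?\."
    r"(?: Target definition = (\d+))?.*"
)


def extract(text):
    """Return the four capture groups as a tuple, or None if nothing matches.

    'extract' is a hypothetical helper name, not part of the original post.
    """
    m = PATTERN.match(text)
    return m.groups() if m else None


with_target = (
    "Reassigning definition: 494801 from: [8280955] to: [8336297], "
    "advancing due dates. Target definition = 494767."
)
without_target = (
    "Reassigning definition: 494801 from: [8280955] to: [8336297], "
    "advancing due dates."
)

print(extract(with_target))     # fourth group is captured
print(extract(without_target))  # fourth group is None
```

In Python, the optional sentence yields the fourth group when it is present and None when it is absent, which is the behavior expected from regexp_substr.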
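This is a bit too long for a comment, so I'll write it here. If it doesn't make sense, I'll delete it.
If you are always just looking for the digits in those strings (regardless of what surrounds them), it can be simplified to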
SQL> with
2 input_string as
3 (
4 select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates. Target definition = 494767.' as test_string from dual
5 union all
6 select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates.' as test_string from dual
7 )
8 select regexp_substr(test_string, '\d+', 1, 1) grp1,
9 regexp_substr(test_string, '\d+', 1, 2) grp2,
10 regexp_substr(test_string, '\d+', 1, 3) grp3,
11 regexp_substr(test_string, '\d+', 1, 4) grp4
12 from input_string;
GRP1       GRP2       GRP3       GRP4
---------- ---------- ---------- ----------
494801     8280955    8336297    494767
494801     8280955    8336297
SQL>
Or, an option that doesn't assume a fixed number of groups (although the layout differs from what you want):
SQL> with
2 input_string as
3 (
4 select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates. Target definition = 494767.' as test_string from dual
5 union all
6 select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates.' as test_string from dual
7 )
8 select column_value grp_rn,
9 regexp_substr(test_string, '\d+', 1, column_value) grp
10 from input_String cross join
11 table(cast(multiset(select level from dual
12 connect by level <= regexp_count(test_string, '\d+')
13 ) as sys.odcinumberlist));
 GRP_RN GRP
------- ----------
      1 494801
      2 8280955
      3 8336297
      4 494767
      1 494801
      2 8280955
      3 8336297
7 rows selected.
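Since the POSIX-based regular expression implementation does not appear to support non-capturing groups, and the capture groups of regexp_substr are not easy to expose as separate columns, I settled on the following, which basically uses a separate regular expression for the optional group.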
with
input_string as
(
select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates. Target definition = 494767.' as test_string from dual
union all
select 'Reassigning definition: 494767 from: [8336297] to: [8369944], advancing dates.' as test_string from dual
),
pattern_string as
(
select 'Reassigning definition: (\d+) from: \[(\d+)\] to: \[(\d+)\]' as pattern_string from dual
)
select
regexp_substr(i.test_string, p.pattern_string, 1, 1, null, 1) as group_1,
regexp_substr(i.test_string, p.pattern_string, 1, 1, null, 2) as group_2,
regexp_substr(i.test_string, p.pattern_string, 1, 1, null, 3) as group_3,
regexp_substr(i.test_string, 'Target definition = (\d+)', 1, 1, null, 1) as group_4
from
input_string i, pattern_string p;