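The regular expression (\d+)(\-)(\d+) (attempt one below), which should turn "4-5" into "4 to 5", is obviously wrong because it also matches dates such as "2024-12-26".
Using a negative lookahead, I came up with the regex (?!\d+\-\d+\-)(\d+)(\-)(\d+) (attempt two below), which correctly matches "4-5" while rejecting "2024-12-26".
However, attempt_two does not behave correctly in a re.subn() context: although it rejects "2024-12-26" as a whole, the search continues and matches (and replaces) the substring "12-26":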
import re
text = """
2024-12-26
4-5
78-79
"""
attempt_one = re.compile(r"(\d+)(\-)(\d+)")
attempt_two = re.compile(r"(?!\d+\-\d+\-)(\d+)(\-)(\d+)")
print("Attempt one:")
print(re.match(attempt_one, "4-5")) # Match: OK
print(re.match(attempt_one, "2024-12-26")) # Match: False positive
new_text, _ = re.subn(attempt_one, r"\1 to \3", text) # Incorrect substitution
print(new_text)
print("Attempt two:")
print(re.match(attempt_two, "4-5")) # Match: OK
print(re.match(attempt_two, "2024-12-26")) # Doesn't match: OK
new_text, _ = re.subn(attempt_two, r"\1 to \3", text) # Still incorrect
print(new_text)
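Output:
Attempt one:
<re.Match object; span=(0, 3), match='4-5'>
<re.Match object; span=(0, 7), match='2024-12'>
2024 to 12-26
4 to 5
78 to 79
Attempt two:
<re.Match object; span=(0, 3), match='4-5'>
None
2024-12 to 26
4 to 5
78 to 79
What regex can I use so that the substitution returns the following?
2024-12-26
4 to 5
78 to 79
(Since my goal is to understand regular expressions, I'm not interested in workarounds such as matching whitespace or newlines.)
You need both a negative lookbehind and a negative lookahead, to prohibit an additional hyphen before or after the match.
(?<![-\d])(\d+)-(\d+)(?![-\d])
The lookarounds also have to reject adjacent digits, so that the pattern does not match part of a date, such as "12-26".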
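As a quick check of the pattern above, here is a minimal sketch that runs it against the sample text from the question. The hyphen_only pattern is a hypothetical intermediate attempt, not part of the answer, included only to show why the digits are needed in the lookarounds. Note that the answer's pattern has two capture groups, so the replacement is assumed to be \1 to \2 rather than \1 to \3.

```python
import re

text = """
2024-12-26
4-5
78-79
"""

# Hypothetical variant for comparison: lookarounds that reject only hyphens.
hyphen_only = re.compile(r"(?<!-)(\d+)-(\d+)(?!-)")

# Pattern from the answer: lookarounds reject hyphens and digits.
full = re.compile(r"(?<![-\d])(\d+)-(\d+)(?![-\d])")

# Backtracking lets the hyphen-only variant match fragments of the date.
hyphen_only_matches = [m.group(0) for m in hyphen_only.finditer(text)]
print(hyphen_only_matches)  # ['2024-1', '2-26', '4-5', '78-79']

# With digits in the character classes, no fragment of the date can match.
full_matches = [m.group(0) for m in full.finditer(text)]
print(full_matches)  # ['4-5', '78-79']

# Two capture groups here, so the second number is \2.
new_text, count = full.subn(r"\1 to \2", text)
print(new_text)
```

With only (?<!-) and (?!-), the engine can shorten a run of digits until the neighbouring character is a digit instead of a hyphen, which is why "2024-1" and "2-26" slip through. Adding \d to both character classes forces each number to be matched in full, so the date is left untouched.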