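I want to match the closing quote of one string together with the opening quote of the next string, provided both are on the same line. The two strings may be separated by a single blank or by blank-plus-blank ( + ).
Regex engine: Python
For instance, from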
this is "some string" "; which should match" 234
"and this" + "should also match\"" "\"and this"
but not this: " " a " + "
I would like to see the following matches:
" " (closing some string, opening ; which...)
" + " (closing and this, opening should also match\")
" " (closing should also match\", opening \"and this)
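So in fact, I think it would be best to match the groups " " and " + " only if there is an odd number of quotes both before and after the group. Since lookbehind in Python's re module must be fixed-width, I have not found a good way to do this.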
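I tried
re.compile(r'(" \+ ")|(" ")(?!;|,)')
but this assumes that a string never starts with a semicolon or comma. I also tried
re.compile(r'"[^"]+"')
but this only finds the strings themselves, not the quotes between strings.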
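This is the character-by-character parsing approach I mentioned above. I track whether we are inside quotes, and I collect the characters that appear between quoted strings.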
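data = """\
this is "some string" "; which should match" 234
"and this" + "should also match\"" "\"and this"
but not this: " " a " + "
"""


def check(line):
    """Return True if two quoted strings on the line are separated by ' ' or ' + '."""
    in_quotes = False
    between = "xxxx"  # sentinel so text before the first quote can never match
    for c in line:
        if c == '"':
            # At an opening quote, test what was collected since the last closing quote.
            if not in_quotes and between in (' ', ' + '):
                return True
            between = ""
            in_quotes = not in_quotes
        elif not in_quotes:
            between += c  # only characters outside quotes are collected
    return False


for line in data.splitlines():
    print(line)
    if check(line):
        print("MATCHES")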
Output:
this is "some string" "; which should match" 234
MATCHES
"and this" + "should also match"" ""and this"
MATCHES
but not this: " " a " + "
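
If you need the positions of the matches rather than a yes/no answer, and want \" inside a string to be treated as an escaped quote, the same loop can be extended. The following is a minimal sketch, assuming backslash is the only escape character and that the lines are read as raw text (so the backslashes are still present). The names find_separators, allowed and close_pos are my own and purely illustrative.

```python
def find_separators(line, allowed=(" ", " + ")):
    """Return (start, end) index pairs, each spanning closing quote,
    separator and opening quote of two adjacent quoted strings."""
    spans = []
    in_quotes = False
    close_pos = None  # index of the most recent closing quote, if any
    i = 0
    while i < len(line):
        c = line[i]
        if in_quotes and c == "\\":
            i += 2  # assumption: a backslash escapes exactly one character
            continue
        if c == '"':
            if in_quotes:
                close_pos = i  # this quote closes a string
            elif close_pos is not None and line[close_pos + 1:i] in allowed:
                spans.append((close_pos, i + 1))  # this quote opens the next string
            in_quotes = not in_quotes
        i += 1
    return spans


lines = [
    r'this is "some string" "; which should match" 234',
    r'"and this" + "should also match\"" "\"and this"',
    r'but not this: " " a " + "',
]
for line in lines:
    print([line[s:e] for s, e in find_separators(line)])
```

For the three sample lines this prints ['" "'], then ['" + "', '" "'], then []. Slicing the line with each span gives the matched text, and the spans themselves can be used for replacing or highlighting.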