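It only gives `at` as output, but it should also give `building`. I call `captures = match.capturesdict()`, but I am unable to use it to extract the captures for other examples. When I use `location_pattern = 'at|outside\s\w+'` instead, it seems to work. Can someone explain what I am doing wrong?

The main problem here is that you need to put `{keywords}` inside a non-capturing group, `(?:{keywords})`. Here is a schematic example: `a|b|c\s+\w+` matches either `a`, or `b`, or `c` followed by whitespace(s) and word chars, because alternation has the lowest precedence in a regex. When you put the alternation list into a group, `(a|b|c)\s+\w+`, it matches either `a`, or `b`, or `c`, and only then tries to match whitespaces and then word chars.

Note that `location_pattern = 'at|outside\s\w+'` does not work for the same reason: `at` matches anywhere on its own, and only `outside` must be followed by a whitespace and word chars. You can fix it in the same way, by putting the keywords into a group: `(?:at|outside)\s\w+`.

See the updated code: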
import regex as re  # third-party `regex` module; provides Match.capturesdict()

def get_location(s):
    STRIP_CHARS = '*'
    s = s.strip(STRIP_CHARS)
    keywords = "at|outside|near"
    # Group the alternation so that \s+[A-Za-z]+ applies to every keyword
    location_pattern = r"(?P<location>((?P<place>(?:{keywords})\s+[A-Za-z]+)))".format(keywords=keywords)
    location_regex = re.compile(location_pattern, re.IGNORECASE | re.UNICODE)
    for match in location_regex.finditer(s):
        print("Match", match)
        match_str = match.group(0)   # full matched text
        indices = match.span(0)      # (start, end) of the match
        print(match_str)
        captures = match.capturesdict()  # all captures of each named group
        print(captures)

get_location("Im at building 3")
Output:

('Match', <regex.Match object; span=(3, 14), match='at building'>)
at building
{'place': ['at building'], 'location': ['at building']}
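The precedence issue with `at|outside\s\w+` can be reproduced in isolation. The following is only a minimal sketch: it uses the standard-library `re` module instead of `regex` (alternation behaves the same way in both), and the sample sentence and variable names are made up for illustration.

```python
import re

# Hypothetical sample text, chosen only to illustrate the difference.
text = "Im at building 3, then outside hall 2"

# Ungrouped: "|" has the lowest precedence, so this means
# either "at" on its own, OR "outside" followed by \s\w+.
ungrouped = re.compile(r"at|outside\s\w+")

# Grouped: the keyword list is a single unit, so \s\w+ is
# required after whichever keyword matched.
grouped = re.compile(r"(?:at|outside)\s\w+")

ungrouped_matches = ungrouped.findall(text)
grouped_matches = grouped.findall(text)

print(ungrouped_matches)  # ['at', 'outside hall']
print(grouped_matches)    # ['at building', 'outside hall']
```

With the ungrouped pattern only `outside` picks up the following word, which is why the original pattern returned just `at`.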