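I have a dictionary, and I want to find its keys in a string
and replace them with the corresponding values. Is there a better/faster way to do the replacement than going through the string token by token?
This is what I have been doing: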
segmenter = {'foobar':'foo bar', 'withoutspace':'without space', 'barbar': 'bar bar'}
sentence = "this is a foobar in a barbar withoutspace"
for i in sentence.split():
    if i in segmenter:
        sentence.replace(i, segmenter[i])
Strings are immutable in Python, so str.replace
returns a new string instead of modifying the original one. You can use str.join()
with a list comprehension here:
>>> segmenter = {'foobar':'foo bar', 'withoutspace':'without space', 'barbar': 'bar bar'}
>>> sentence = "this is a foobar in a barbar withoutspace"
>>> " ".join( [ segmenter.get(word,word) for word in sentence.split()] )
'this is a foo bar in a bar bar without space'
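If you would rather keep the loop from the question, it can also be made to work by holding on to the string that str.replace returns. This is only a minimal sketch; the helper name replace_tokens is made up for illustration and is not from the original post:

```python
segmenter = {'foobar': 'foo bar', 'withoutspace': 'without space', 'barbar': 'bar bar'}

def replace_tokens(sentence, mapping):
    # Hypothetical helper: str.replace returns a new string,
    # so the result has to be rebound on every iteration.
    result = sentence
    for word in sentence.split():
        if word in mapping:
            result = result.replace(word, mapping[word])
    return result

fixed = replace_tokens("this is a foobar in a barbar withoutspace", segmenter)
print(fixed)  # this is a foo bar in a bar bar without space
```

This rescans the string once per matching token, so the join-based version above is likely the simpler choice.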
Another problem with str.replace is that it matches substrings, so it would also turn a word like "abarbarb" into "abar barb".
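A small sketch of that pitfall, comparing str.replace with the whole-token lookup. The variable names substring_result and token_result are chosen here for illustration only:

```python
segmenter = {'foobar': 'foo bar', 'withoutspace': 'without space', 'barbar': 'bar bar'}

# str.replace matches the key anywhere inside the string, not just whole words.
substring_result = "abarbarb".replace('barbar', segmenter['barbar'])
print(substring_result)  # abar barb

# Looking up whole tokens leaves the unrelated word untouched.
token_result = " ".join(segmenter.get(word, word) for word in "abarbarb barbar".split())
print(token_result)  # abarbarb bar bar
```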
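re.sub can take a function as the replacement argument; the function is called for each match and returns the replacement string: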
segmenter = {'foobar':'foo bar', 'withoutspace':'without space', 'barbar': 'bar bar'}
sentence = "this is a foobar in a barbar withoutspace"
import re
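def fn(match):
    # Look up the replacement for whichever key matched.
    return segmenter[match.group()]

# Build one alternation pattern from the escaped keys and substitute in a single pass.
print(re.sub('|'.join(re.escape(k) for k in segmenter), fn, sentence))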
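Note that the pattern above, like str.replace, matches keys inside longer words. One possible variation, sketched here under the assumption that every key starts and ends with a word character (so that \b behaves as expected), anchors the alternation on word boundaries. The names pattern and segment are illustrative and not part of the original answer:

```python
import re

segmenter = {'foobar': 'foo bar', 'withoutspace': 'without space', 'barbar': 'bar bar'}

# Try longer keys first so a longer key wins over a shorter key that is its prefix.
keys = sorted(segmenter, key=len, reverse=True)

# \b restricts matches to whole words (assumes keys begin and end with word characters).
pattern = re.compile(r'\b(?:' + '|'.join(re.escape(k) for k in keys) + r')\b')

def segment(text):
    # Replace each whole-word key with its value in a single pass.
    return pattern.sub(lambda m: segmenter[m.group()], text)

print(segment("this is a foobar in a barbar withoutspace"))
print(segment("abarbarb"))
```

Compiling the pattern once is worthwhile if the same dictionary is applied to many sentences.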