I know I can use method chaining by simply having each method return
self
, e.g.
object.routine1().routine2().routine3()
But is it also possible to organize the methods into layers or groups when chaining? For example:
object.Layer1.routine1().routine2().Layer2.routine3()
The context is that I am building a text-analysis pipeline, where the different layers correspond to text-level, sentence-level, and token-level preprocessing steps:
text = "This is an example foo text with some special characters!!!! And some sentences"
pr = TextPreprocessor(text)
processed_text = (
    pr.text_level.lower_case()
    .sentence_level.split_sentences().remove_special_characters()
    .token_level.tokenize()
    .text
)
Here is the code that almost (!) makes the text-processing example work:
import re

class TextLevelPreprocessor:
    def __init__(self, parent):
        self.parent = parent

    def lower_case(self):
        self.parent.text = self.parent.text.lower()
        return self.parent

class SentenceLevelPreprocessor:
    def __init__(self, parent):
        self.parent = parent

    def split_sentences(self):
        self.parent.text = self.parent.text.split('. ')
        return self.parent

    def remove_special_characters(self):
        self.parent.text = [re.sub('[!@#$]', '', s) for s in self.parent.text]
        return self.parent

class TokenLevelPreprocessor:
    def __init__(self, parent):
        self.parent = parent

    def tokenize(self):
        self.parent.text = [t.split() for t in self.parent.text]
        return self.parent

class TextPreprocessor:
    def __init__(self, text):
        self.text = text
        self.text_level = TextLevelPreprocessor(self)
        self.sentence_level = SentenceLevelPreprocessor(self)
        self.token_level = TokenLevelPreprocessor(self)
However, only this syntax works here:
pr = TextPreprocessor(text)
processed_text = (
    pr.text_level.lower_case()
    .sentence_level.split_sentences()
    .sentence_level.remove_special_characters()
    .token_level.tokenize()
    .text
)
This means the "layer" or "group" prefix has to be repeated for every single method call, which seems verbose.
To organize the methods by category, I would put them in separate mixin classes and have the main class inherit from the mixins. Since all methods operate on the same instance, there is no need for a "parent" object; instead, the mixin classes can inherit from a base class that initializes all of the instance's attributes:
import re

class TextPreprocessorBase:
    def __init__(self, text):
        self.text = text

class TextLevelPreprocessor(TextPreprocessorBase):
    def lower_case(self):
        self.text = self.text.lower()
        return self

class SentenceLevelPreprocessor(TextPreprocessorBase):
    def split_sentences(self):
        self.text = self.text.split('. ')
        return self

    def remove_special_characters(self):
        self.text = [re.sub('[!@#$]', '', s) for s in self.text]
        return self

class TokenLevelPreprocessor(TextPreprocessorBase):
    def tokenize(self):
        self.text = [t.split() for t in self.text]
        return self

class TextPreprocessor(TextLevelPreprocessor, SentenceLevelPreprocessor, TokenLevelPreprocessor):
    pass
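With the mixin layout, every method returns the same `TextPreprocessor` instance, so the layer prefixes disappear from the chain entirely. A minimal end-to-end sketch (the classes are repeated from above so the snippet runs standalone):

```python
import re

class TextPreprocessorBase:
    def __init__(self, text):
        self.text = text

class TextLevelPreprocessor(TextPreprocessorBase):
    def lower_case(self):
        self.text = self.text.lower()
        return self

class SentenceLevelPreprocessor(TextPreprocessorBase):
    def split_sentences(self):
        self.text = self.text.split('. ')
        return self

    def remove_special_characters(self):
        self.text = [re.sub('[!@#$]', '', s) for s in self.text]
        return self

class TokenLevelPreprocessor(TextPreprocessorBase):
    def tokenize(self):
        self.text = [t.split() for t in self.text]
        return self

class TextPreprocessor(TextLevelPreprocessor, SentenceLevelPreprocessor, TokenLevelPreprocessor):
    pass

text = "This is an example foo text with some special characters!!!! And some sentences"
processed = (
    TextPreprocessor(text)
    .lower_case()                  # text-level step
    .split_sentences()             # sentence-level steps
    .remove_special_characters()
    .tokenize()                    # token-level step
    .text
)
print(processed)
```

Since all three mixins share `TextPreprocessorBase`, Python's MRO resolves the multiple inheritance cleanly, and `TextPreprocessor.__init__` comes from the common base.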