!pip install rouge_score
from datasets import load_metric
metric = load_metric("rouge")
pred_str = ['السلام عليكم كيف حالك']
label_str = ['السلام عليكم صديقي كيف حالك']
metric.add_batch(predictions=pred_str, references=label_str)
metric.compute()
Output:
{'rouge1': AggregateScore(low=Score(precision=0.0, recall=0.0, fmeasure=0.0), mid=Score(precision=0.0, recall=0.0, fmeasure=0.0), high=Score(precision=0.0, recall=0.0, fmeasure=0.0)),
'rouge2': AggregateScore(low=Score(precision=0.0, recall=0.0, fmeasure=0.0), mid=Score(precision=0.0, recall=0.0, fmeasure=0.0), high=Score(precision=0.0, recall=0.0, fmeasure=0.0)),
'rougeL': AggregateScore(low=Score(precision=0.0, recall=0.0, fmeasure=0.0), mid=Score(precision=0.0, recall=0.0, fmeasure=0.0), high=Score(precision=0.0, recall=0.0, fmeasure=0.0)),
'rougeLsum': AggregateScore(low=Score(precision=0.0, recall=0.0, fmeasure=0.0), mid=Score(precision=0.0, recall=0.0, fmeasure=0.0), high=Score(precision=0.0, recall=0.0, fmeasure=0.0))}
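Use the rouge_scorer package directly and pass it a tokenizer that supports Arabic. Also, be sure not to use the stemmer. The code is as follows: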
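from rouge_score import rouge_scorer
from transformers import AutoTokenizer

# Load the tokenizer first; the scorer below needs it at construction time.
model_name = 'arabert'  # as written in the original; substitute the full Hugging Face model id
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Pass the Arabic-aware tokenizer explicitly and keep the stemmer off.
r_scorer = rouge_scorer.RougeScorer(['rouge1', 'rouge2', 'rougeL'],
                                    use_stemmer=False, tokenizer=tokenizer)

pred_str = 'السلام عليكم كيف حالك'
label_str = 'السلام عليكم صديقي كيف حالك'
# score() takes the reference first, then the prediction.
ROU = r_scorer.score(label_str, pred_str)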
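
To see why the tokenizer matters, here is a minimal standard-library sketch that does not use rouge_score at all. It assumes the all-zero scores come from a tokenizer that keeps only ASCII letters and digits, and it computes ROUGE-1 by hand with plain whitespace splitting. The helper names ascii_only_tokens, whitespace_tokens and rouge1 are made up for this illustration.

```python
import re
from collections import Counter


def ascii_only_tokens(text):
    # Assumed behavior for illustration: lowercase, then drop everything
    # that is not an ASCII letter or digit. Arabic text yields no tokens.
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).split()


def whitespace_tokens(text):
    # Simplest tokenizer that keeps Arabic words intact.
    return text.split()


def rouge1(reference, prediction, tokenize):
    # Unigram overlap: precision over prediction tokens, recall over reference tokens.
    ref = Counter(tokenize(reference))
    pred = Counter(tokenize(prediction))
    overlap = sum((ref & pred).values())
    precision = overlap / sum(pred.values()) if pred else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    total = precision + recall
    fmeasure = 2 * precision * recall / total if total else 0.0
    return precision, recall, fmeasure


pred_str = 'السلام عليكم كيف حالك'
label_str = 'السلام عليكم صديقي كيف حالك'

print(ascii_only_tokens(pred_str))                      # no tokens survive
print(rouge1(label_str, pred_str, ascii_only_tokens))   # all zeros
print(rouge1(label_str, pred_str, whitespace_tokens))   # non-zero overlap
```

With whitespace tokens, all four predicted words appear among the five reference words, so precision is 4/4 and recall is 4/5. A model tokenizer such as the one loaded above may split words into subword pieces, so its numbers can differ from this word-level sketch.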