nlpaug version: 1.1.11
Mac OS
I am trying to use the nlpaug library for data augmentation: https://github.com/makcedward/nlpaug
Back translation gives me the following error:
import nlpaug.augmenter.word as naw
text = 'The quick brown fox jumped over the lazy dog'
back_translation_aug = naw.BackTranslationAug(from_model_name='facebook/wmt19-en-ru', to_model_name='facebook/wmt19-ru-en', batch_size=2, force_reload=True)
print(back_translation_aug.augment(text))
The error is:
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/nlpaug/base_augmenter.py", line 98, in augment
result = action_fx(clean_data)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/nlpaug/augmenter/word/back_translation.py", line 70, in substitute
augmented_text = self.model.predict(data)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/nlpaug/model/lang_models/machine_translation_transformers.py", line 39, in predict
src_translated_texts = self.translate_one_step_batched(texts, self.src_tokenizer, self.src_model)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/nlpaug/model/lang_models/machine_translation_transformers.py", line 58, in translate_one_step_batched
for batch in tokenized_dataloader:
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
data = self._next_data()
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1186, in _next_data
idx, data = self._get_data()
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1152, in _get_data
success, data = self._try_get_data()
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1003, in _try_get_data
raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
RuntimeError: DataLoader worker (pid(s) 3954) exited unexpectedly
The BackTranslationAug.augment function expects a list of strings as input. Try this:
import nlpaug.augmenter.word as naw
text = ['The quick brown fox jumped over the lazy dog', 'hello world']
back_translation_aug = naw.BackTranslationAug(
    from_model_name='facebook/wmt19-en-ru',
    to_model_name='facebook/wmt19-ru-en',
    batch_size=2,
    force_reload=True)
print(back_translation_aug.augment(text))
[out]:
['Fast brown fox leaps over lazy dog', 'Hello world']
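If the input sometimes arrives as a single string and sometimes as a list, a small wrapper can normalize it before calling augment. This is only a sketch: the helper names as_text_list and augment_texts are hypothetical and not part of nlpaug, and the only assumption about the augmenter is that it has an augment method that takes a list of strings.

```python
from typing import List, Sequence, Union


def as_text_list(data: Union[str, Sequence[str]]) -> List[str]:
    """Wrap a single string in a list; copy any other sequence into a list."""
    if isinstance(data, str):
        return [data]
    return list(data)


def augment_texts(augmenter, data: Union[str, Sequence[str]]):
    """Call augmenter.augment with a list of strings, whatever was passed in.

    `augmenter` is assumed to be any object with an `augment(list_of_str)`
    method, for example the BackTranslationAug instance created above.
    """
    return augmenter.augment(as_text_list(data))
```

With the augmenter from the snippet above, augment_texts(back_translation_aug, 'The quick brown fox jumped over the lazy dog') would hand augment a one-element list instead of a bare string.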