nlpaug BackTranslationAug raises RuntimeError: DataLoader worker (pid(s) 3954) exited unexpectedly

Problem description
nlpaug version: 1.1.11

Mac OS

I am trying to use the nlpaug library (https://github.com/makcedward/nlpaug) for data augmentation.

Back translation with the following code raises an error:

import nlpaug.augmenter.word as naw

text = 'The quick brown fox jumped over the lazy dog'
back_translation_aug = naw.BackTranslationAug(from_model_name='facebook/wmt19-en-ru', to_model_name='facebook/wmt19-ru-en',batch_size=2,force_reload=True)
print(back_translation_aug.augment(text))

The error is:

File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/nlpaug/base_augmenter.py", line 98, in augment
    result = action_fx(clean_data)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/nlpaug/augmenter/word/back_translation.py", line 70, in substitute
    augmented_text = self.model.predict(data)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/nlpaug/model/lang_models/machine_translation_transformers.py", line 39, in predict
    src_translated_texts = self.translate_one_step_batched(texts, self.src_tokenizer, self.src_model)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/nlpaug/model/lang_models/machine_translation_transformers.py", line 58, in translate_one_step_batched
    for batch in tokenized_dataloader:
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1186, in _next_data
    idx, data = self._get_data()
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1152, in _get_data
    success, data = self._try_get_data()
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1003, in _try_get_data
    raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
RuntimeError: DataLoader worker (pid(s) 3954) exited unexpectedly
python-3.x deep-learning nlp machine-translation
1 answer

The BackTranslationAug.augment function expects a list of strings as input. Try this:

import nlpaug.augmenter.word as naw

text = ['The quick brown fox jumped over the lazy dog', 'hello world']

back_translation_aug = naw.BackTranslationAug(
  from_model_name='facebook/wmt19-en-ru', 
  to_model_name='facebook/wmt19-ru-en',
  batch_size=2,
  force_reload=True)

print(back_translation_aug.augment(text))

[out]:

['Fast brown fox leaps over lazy dog', 'Hello world']
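
If the same augmenter is sometimes called with a single string and sometimes with a list, a small wrapper can normalize the input first. This is only a minimal sketch: `as_text_list` and `augment_texts` are hypothetical helper names, not part of nlpaug, and the only assumption about the augmenter is that it exposes `augment(list_of_strings)` as in the code above.

```python
from typing import Iterable, List, Union


def as_text_list(data: Union[str, Iterable[str]]) -> List[str]:
    """Return the input as a list of strings (hypothetical helper, not nlpaug API)."""
    if isinstance(data, str):
        # A bare string becomes a one-element batch.
        return [data]
    texts = list(data)
    for item in texts:
        if not isinstance(item, str):
            raise TypeError(f"expected str, got {type(item).__name__}")
    return texts


def augment_texts(augmenter, data):
    """Call augmenter.augment with a list of strings.

    `augmenter` is assumed to be any object with an augment(list_of_str)
    method, e.g. the back_translation_aug instance built above.
    """
    return augmenter.augment(as_text_list(data))
```

With this, `augment_texts(back_translation_aug, 'The quick brown fox jumped over the lazy dog')` passes a one-element list to `augment` instead of a bare string.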