Can't pickle local object 'get_tokenizer.<locals>.<lambda>'

Problem description (votes: 0, answers: 1)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
d:\abie\Coding\Tugas Akhir brow\VisualGPT-main\VisualGPT-main\VisualGPT.ipynb Cell 1 line 3
    296 dict_dataloader_test = DataLoader(dict_dataset_test, batch_size=args.batch_size // 5)
    299 if not use_rl:
--> 300     train_loss = train_xe(model, dataloader_train, text_field,gpt_optimizer,dataloader_val,args)

d:\abie\Coding\Tugas Akhir brow\VisualGPT-main\VisualGPT-main\VisualGPT.ipynb Cell 1 line 8
     84 running_loss = .0
     85 with tqdm(desc='Epoch %d - train' % e, unit='it', total=len(dataloader)) as pbar:
---> 86     for it, (detections, captions) in enumerate(dataloader):
     88         detections, captions = detections.to(device), captions.to(device)
     91         out,past= model(detections, captions)

File c:\Users\Axioo Pongo\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\data\dataloader.py:438, in DataLoader.__iter__(self)
    436     return self._iterator
    437 else:
--> 438     return self._get_iterator()

File c:\Users\Axioo Pongo\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\data\dataloader.py:386, in DataLoader._get_iterator(self)
    384 else:
    385     self.check_worker_number_rationality()
--> 386     return _MultiProcessingDataLoaderIter(self)

File c:\Users\Axioo Pongo\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\data\dataloader.py:1039, in _MultiProcessingDataLoaderIter.__init__(self, loader)
...
     58 def dump(obj, file, protocol=None):
     59     '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60     ForkingPickler(file, protocol).dump(obj)

AttributeError: Can't pickle local object 'get_tokenizer.<locals>.<lambda>'

How can I solve this? I wrote a get_tokenizer function and want to call it from another .py file, but this error occurs here.
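The traceback can be reproduced in isolation: pickle cannot serialize a function defined inside another function (a "local object"), which is exactly what the `"spacy"` and `"subword"` branches of the function below return. A minimal stand-in, unrelated to PyTorch:

```python
import pickle

def get_tokenizer():
    # Stand-in for the real function in the answer: it returns a lambda
    # defined inside the function body, i.e. a "local object".
    return lambda s: s.split()

tokenize = get_tokenizer()
try:
    pickle.dumps(tokenize)
except AttributeError as e:
    print(e)  # Can't pickle local object 'get_tokenizer.<locals>.<lambda>'
```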

python pytorch pickle torch
1 Answer

0 votes
def get_tokenizer(tokenizer):
    if callable(tokenizer):
        return tokenizer
    if tokenizer == "spacy":
        try:
            import spacy
            spacy_en = spacy.load('en_core_web_sm')
            print("now we are loading the spacy tokenizer")

            # This local lambda is the object the traceback complains about:
            # it cannot be pickled when the DataLoader spawns worker processes.
            return lambda s: [tok.text for tok in spacy_en.tokenizer(s)]
        except ImportError:
            print("Please install SpaCy and the SpaCy English tokenizer. "
                  "See the docs at https://spacy.io for more information.")
            raise
        except AttributeError:
            print("Please install SpaCy and the SpaCy English tokenizer. "
                  "See the docs at https://spacy.io for more information.")
            raise
    elif tokenizer == "moses":
        try:
            # Note: nltk.tokenize.moses was removed in NLTK 3.3; the
            # tokenizer now lives in the separate sacremoses package.
            from nltk.tokenize.moses import MosesTokenizer
            moses_tokenizer = MosesTokenizer()
            return moses_tokenizer.tokenize
        except ImportError:
            print("Please install NLTK. "
                  "See the docs at http://nltk.org for more information.")
            raise
        except LookupError:
            print("Please install the necessary NLTK corpora. "
                  "See the docs at http://nltk.org for more information.")
            raise
    elif tokenizer == 'revtok':
        try:
            import revtok
            return revtok.tokenize
        except ImportError:
            print("Please install revtok.")
            raise
    elif tokenizer == 'subword':
        try:
            import revtok
            # Another local lambda: it will hit the same pickling error
            # under a multi-worker DataLoader.
            return lambda x: revtok.tokenize(x, decap=True)
        except ImportError:
            print("Please install revtok.")
            raise

    raise ValueError("Requested tokenizer {}, valid choices are a "
                     "callable that takes a single string as input, "
                     "\"revtok\" for the revtok reversible tokenizer, "
                     "\"subword\" for the revtok caps-aware tokenizer, "
                     "\"spacy\" for the SpaCy English tokenizer, or "
                     "\"moses\" for the NLTK port of the Moses tokenization "
                     "script.".format(tokenizer))

This is the function.
