I am using the Llama 3 model from the Hugging Face library in a Kaggle notebook, and I hit this error when running the pipeline module. I have removed most of the stack trace, because otherwise the post would be almost entirely code with no description.
RuntimeError Traceback (most recent call last)
Cell In[19], line 17, in Llama_Chat(system_role, user_msg)
     12 def Llama_Chat(system_role,user_msg):
     13     messages = [
     14         {"role": "system", "content": system_role},
     15         {"role": "user", "content": user_msg},
     16     ]
---> 17     outputs = pipeline(
     18         messages,
     19         max_new_tokens=256,
     20         temperature = 0.1
     21
     22     )
     24     reply=outputs[0]["generated_text"][-1]["content"]
     25     return reply
File /opt/conda/lib/python3.10/site-packages/accelerate/hooks.py:169, in add_hook_to_module.<locals>.new_forward(module, *args, **kwargs)
167 output = module._old_forward(*args, **kwargs)
168 else:
--> 169 output = module._old_forward(*args, **kwargs)
170 return module._hf_hook.post_forward(module, output)
File /opt/conda/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py:603, in LlamaSdpaAttention.forward(self, hidden_states, attention_mask, position_ids, past_key_value, output_attentions, use_cache, cache_position, position_embeddings, **kwargs)
599 # We dispatch to SDPA's Flash Attention or Efficient kernels via this `is_causal` if statement instead of an inline conditional assignment
600 # in SDPA to support both torch.compile's dynamic shapes and full graph options. An inline conditional prevents dynamic shapes from compiling.
601 is_causal = True if causal_mask is None and q_len > 1 else False
--> 603 attn_output = torch.nn.functional.scaled_dot_product_attention(
604 query_states,
605 key_states,
606 value_states,
607 attn_mask=causal_mask,
608 dropout_p=self.attention_dropout if self.training else 0.0,
609 is_causal=is_causal,
610 )
612 attn_output = attn_output.transpose(1, 2).contiguous()
613 attn_output = attn_output.view(bsz, q_len, -1)
RuntimeError: cutlassF: no kernel found to launch!
This is the error I get when running a Hugging Face model with the transformers library on Kaggle. I have checked the CUDA and PyTorch versions and they look fine. ChatGPT, Claude, etc. all suggest a version mismatch, but that has not gotten me anywhere.
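For reference, this is roughly how I checked the environment (a minimal sketch; the exact GPU name depends on which Kaggle accelerator is selected):

import torch

print(torch.__version__)                     # PyTorch build, e.g. 2.x.x+cuXXX
print(torch.version.cuda)                    # CUDA version PyTorch was built against
print(torch.cuda.is_available())             # True if a GPU is visible
print(torch.cuda.get_device_name(0))         # e.g. Tesla T4 / P100 on Kaggle
print(torch.cuda.get_device_capability(0))   # compute capability, e.g. (7, 5) for T4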
Try disabling the following SDPA backends (set them to False) before running the pipeline:
torch.backends.cuda.enable_mem_efficient_sdp(False)
torch.backends.cuda.enable_flash_sdp(False)
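Disabling these two backends makes torch.nn.functional.scaled_dot_product_attention fall back to the plain math kernel instead of the cutlass/flash kernels that raise "cutlassF: no kernel found to launch!" on this GPU. Below is a minimal sketch of where the calls would go; the model id and dtype are assumptions (meta-llama/Meta-Llama-3-8B-Instruct in float16), not taken from your post, and the switches must run before the pipeline is first called:

import torch
from transformers import pipeline

# Force SDPA to fall back to the math kernel; the flash / memory-efficient
# kernels are the ones failing with "cutlassF: no kernel found to launch!".
torch.backends.cuda.enable_mem_efficient_sdp(False)
torch.backends.cuda.enable_flash_sdp(False)

pipe = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # assumed model id
    torch_dtype=torch.float16,                    # assumed dtype
    device_map="auto",
)

def Llama_Chat(system_role, user_msg):
    messages = [
        {"role": "system", "content": system_role},
        {"role": "user", "content": user_msg},
    ]
    outputs = pipe(
        messages,
        max_new_tokens=256,
        do_sample=True,      # temperature only has an effect when sampling
        temperature=0.1,
    )
    return outputs[0]["generated_text"][-1]["content"]

print(Llama_Chat("You are a helpful assistant.", "Hello!"))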