I am trying to train a model with PEFT QLoRA. The LoRA config and the PEFT training arguments are shown below:
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=[
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
        "lm_head",
    ],
    bias="none",
    lora_dropout=0.05,  # Conventional
    task_type="CAUSAL_LM",
)
peft_model = get_peft_model(original_model, lora_config)
output_dir = f'./peft-bn-mistral-training-{str(int(time.time()))}'
peft_training_args = TrainingArguments(
    output_dir=output_dir,
    auto_find_batch_size=True,
    learning_rate=1e-3,  # Higher learning rate than full fine-tuning.
    num_train_epochs=1,
    logging_steps=1,
    max_steps=1,
)
device = torch.device("cuda:0")
peft_trainer = Trainer(
    model=peft_model.to(device),
    args=peft_training_args,
    train_dataset=tokenized_datasets["train"],
)
peft_trainer.train()
The code produces the following error:
TypeError Traceback (most recent call last)
<ipython-input-46-b47531775ae7> in <cell line: 1>()
----> 1 peft_trainer.train()
2
3 peft_model_path="./peft-bn-mistral-checkpoint-local"
4
5 peft_trainer.model.save_pretrained(peft_model_path)
1326 current_device_index = current_device.index if isinstance(current_device, torch.device) else current_device
1327
-> 1328 if torch.device(current_device_index) != self.device:
1329 # if on the first device (GPU 0) we don't care
1330 if (self.device.index is not None) or (current_device_index != 0):
TypeError: Device() received an invalid combination of arguments - got (NoneType), but expected one of:
* (torch.device device)
didn't match because some of the arguments have invalid types: (!NoneType!)
* (str type, int index)
I have tried adjusting various settings and passing a device argument to the model, but it always results in the error above. Note that I am using BitsAndBytesConfig from the transformers module to quantize the model.
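For reference, the base model was loaded roughly like the sketch below; the model id and the exact quantization settings here are placeholders, not my actual values:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Illustrative 4-bit quantization config; the real settings may differ.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

original_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",  # placeholder model id
    quantization_config=bnb_config,
)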
TIA
To fix this, I had to move the model to the GPU before passing it into the Trainer, instead of calling .to(device) inline in the constructor.
Before:
peft_trainer = Trainer(
    model=peft_model.to(device),
    args=peft_training_args,
    train_dataset=tokenized_datasets["train"],
)
After:
peft_model = peft_model.to(device)
peft_trainer = Trainer(
    model=peft_model,
    args=peft_training_args,
    train_dataset=tokenized_datasets["train"],
)
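A related option that sidesteps manual .to(device) calls altogether is to let from_pretrained place the quantized model on the GPU at load time via device_map. This is only a sketch, and the model id is a placeholder:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# device_map={"": 0} puts the entire model on GPU 0 at load time,
# so no later .to(device) call is needed.
original_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",  # placeholder model id
    quantization_config=bnb_config,
    device_map={"": 0},
)
peft_model = get_peft_model(original_model, lora_config)
peft_trainer = Trainer(
    model=peft_model,
    args=peft_training_args,
    train_dataset=tokenized_datasets["train"],
)
peft_trainer.train()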