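This is the code for a text classification task I'm working on, and the problem seems to be here. It is a multi-class problem with 3 labels. I have tried a few things: I changed the label format to integers and tried looking into the loss function. I'm not sure which parameters I need to change.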
def evaluate(model, dataloader_val):
    model.eval()
    model.train(False)
    loss_val_total = 0
    predictions, true_vals = [], []
    for batch in dataloader_val:
        batch = tuple(b.to(device) for b in batch)
        inputs = {'input_ids': batch[0],
                  'attention_mask': batch[1],
                  'labels': batch[2],
                  }
        with torch.no_grad():
            outputs = model(**inputs)
        loss = outputs[0]
        logits = outputs[1]
        loss_val_total += loss.item()
        probs = torch.argmax(logits, dim = 1).detach().cpu().numpy()
        label_ids = inputs['labels'].cpu().numpy()
        predictions.append(probs)
        true_vals.append(label_ids)
    loss_val_avg = loss_val_total/len(dataloader_val)
    predictions = np.concatenate(predictions, axis=0)
    true_vals = np.concatenate(true_vals, axis=0)
    ### after evaluating we resume model training
    model.train(True)
    return loss_val_avg, predictions, true_vals
This is the error I get:
ValueError Traceback (most recent call last)
<ipython-input-55-a095c6ad8f10> in <module>
44 }
45
---> 46 outputs = model(**inputs)
ValueError: Expected input batch_size (2) to match target batch_size (4).
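For readers hitting the same message, here is a minimal sketch of one way it can arise. It assumes (the question does not show the tensor shapes) that the label tensor carries more than one value per example. Cross-entropy compares one row of logits with one class index, so a label tensor of shape (2, 2), once flattened, looks like 4 targets for a batch of 2. The shapes and the helper name `try_loss` are illustrative only, not taken from the asker's data.

```python
import torch
import torch.nn.functional as F

# Illustrative shapes only: a batch of 2 examples, 3 classes.
logits = torch.randn(2, 3)

# Hypothetical mis-shaped labels: two values per example instead of one.
bad_labels = torch.zeros(2, 2, dtype=torch.long)


def try_loss(logits, labels):
    """Return the loss value, or the error message if the shapes disagree."""
    try:
        # Labels are flattened before the loss, as classification heads commonly do.
        return F.cross_entropy(logits, labels.view(-1)).item()
    except (ValueError, RuntimeError) as err:
        return str(err)


message = try_loss(logits, bad_labels)       # a batch-size mismatch message
good_labels = torch.tensor([0, 2])           # one class index per example
loss_value = try_loss(logits, good_labels)   # a float

print(message)
print(loss_value)
```

If this is the cause, printing `batch[2].shape` inside the loop should show something other than `(batch_size,)`.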
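I'm not sure whether you did this on purpose, but passing the labels to the model as an input in
outputs = model(**inputs)
looks like a problem to me. You should remove labels from the inputs dictionary. Also, if your model's forward method does not accept labels, it will not compute the loss for you. (Hugging Face's BertForSequenceClassification does accept a labels argument and returns the loss as its first output when labels are passed, so with that class also check that the label tensor holds one integer class index per example.) To compute the loss yourself, try changing your code to: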
import numpy as np
import torch
from torch import nn

def evaluate(model, dataloader_val):
    # `device` is assumed to be defined elsewhere, as in the question
    model.eval()  # equivalent to model.train(False)
    loss_fn = nn.CrossEntropyLoss()
    loss_val_total = 0
    predictions, true_vals = [], []
    for batch in dataloader_val:
        batch = tuple(b.to(device) for b in batch)
        inputs = {
            'input_ids': batch[0],
            'attention_mask': batch[1]
        }
        # labels are expected as integer class indices of shape (batch_size,)
        labels = batch[2]
        with torch.no_grad():
            outputs = model(**inputs)
        # assumes the model returns the logits first when no labels are passed
        logits = outputs[0]
        loss = loss_fn(logits, labels)
        loss_val_total += loss.item()
        probs = torch.argmax(logits, dim=1).detach().cpu().numpy()
        label_ids = labels.cpu().numpy()
        predictions.append(probs)
        true_vals.append(label_ids)
    loss_val_avg = loss_val_total/len(dataloader_val)
    predictions = np.concatenate(predictions, axis=0)
    true_vals = np.concatenate(true_vals, axis=0)
    ### after evaluating we resume model training
    model.train(True)
    return loss_val_avg, predictions, true_vals
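By default, nn.CrossEntropyLoss takes one integer class index per example as its target (newer PyTorch versions also accept floating-point class probabilities). Below is a minimal sketch of normalizing labels before they go into the DataLoader, assuming they may arrive either as integers or as one-hot rows. `to_class_indices` is a hypothetical helper name, not part of PyTorch, and the label values are made up for illustration.

```python
import torch
from torch import nn


def to_class_indices(labels: torch.Tensor) -> torch.Tensor:
    """Return labels as a 1-D long tensor of class indices.

    Hypothetical helper: accepts integer labels of shape (N,) or (N, 1),
    or one-hot rows of shape (N, num_classes).
    """
    if labels.dim() == 2 and labels.size(1) > 1:
        # One-hot rows: the position of the 1 is the class index.
        labels = labels.argmax(dim=1)
    return labels.reshape(-1).long()


# Three classes, as in the question.
one_hot = torch.tensor([[1, 0, 0], [0, 0, 1], [0, 1, 0], [0, 0, 1]])
targets = to_class_indices(one_hot)

logits = torch.randn(4, 3)  # stand-in for the model's output
loss = nn.CrossEntropyLoss()(logits, targets)

print(targets.tolist(), tuple(targets.shape), loss.item())
```

With labels in this form, the target's batch dimension matches the logits' batch dimension, which is what the error message in the question is checking.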