Error in eval(predvars, data, env): object 'adapter' not found

Question

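I am trying to train a random forest classifier on a TF-IDF matrix whose columns are the words that occur in the reviews.

To give an idea of the data: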

label...1   actually    adapter
1         0 0.01495934 0.02880089
2         0 0.00000000 0.00000000
3         0 0.00000000 0.00000000

I train the model on train_data, where the label is 0 for negative and 1 for positive.

Here is the code:

set.seed(123)
random_forest_model <- train(label...1 ~ ., 
               data = train_data, 
               method = "rf", 
               trControl = trainControl(method = "cv", number = 10), 
               tuneGrid = expand.grid(mtry = 100),
               ntree = 500,
               importance = TRUE)

I want to use the trained model to predict whether the reviews in another matrix are positive or negative.

I use this code:

# Make predictions on the test set
y_pred <- predict(random_forest_model, newdata = test_data)

The problem is that I get this error:

Error in eval(predvars, data, env) : object 'adapter' not found

This happens because not every word (column) in train_data is also present in test_data. The reviews in test_data are different.
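The error can be reproduced on a toy example. The sketch below is only an illustration: it uses base R's lm as a stand-in for the caret model (an assumption, to keep it dependency-free), and the data values are invented. Like the formula-based model in the question, lm looks up every training variable in newdata at prediction time.

```r
# Toy training data with two word columns (values are made up)
toy_train <- data.frame(
  label    = c(0, 1, 0, 1),
  actually = c(0.01, 0.00, 0.02, 0.00),
  adapter  = c(0.03, 0.00, 0.00, 0.01)
)

# Toy test data that lacks the 'adapter' column
toy_test <- data.frame(actually = c(0.00, 0.02))

# lm stands in for the random forest here; the formula interface is the point
fit <- lm(label ~ ., data = toy_train)

# Capture the error message instead of stopping the script
msg <- tryCatch(
  {
    predict(fit, newdata = toy_test)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

The captured message names the column that is missing from the test data.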

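The whole point of the model is to predict whether a review is positive or negative in exactly this situation. It is not possible to obtain matrices that always contain the same words.

I tried feeding the RF model a data frame instead of a matrix, because I read that this works better, but it did not solve the problem.

How can I solve this?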

Answer 1

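The test data needs to have the same columns as the training data. That means dropping the words that exist in the test data but not in the training data, and supplying the training words that the test reviews lack with a weight of 0. I don't know whether the following is entirely correct, because I can't see your data, but hopefully it gives you an idea of how to proceed.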

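# Get the names of the predictor columns the model was trained on
# (every column of train_data except the label)
train_cols <- setdiff(names(train_data), "label...1")

# Make sure the test data is a data frame so columns can be added by name
test_data <- as.data.frame(test_data)

# Add every training column that is missing from the test data, filled with 0
# (a word that never occurs in the test reviews has a TF-IDF weight of 0)
for (col in setdiff(train_cols, names(test_data))) {
  test_data[[col]] <- 0
}

# Keep only the training columns, in the training order;
# this drops words that occur only in the test data
test_data_subset <- test_data[, train_cols, drop = FALSE]

# Make predictions on the aligned test data
y_pred <- predict(random_forest_model, newdata = test_data_subset)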
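
If you need to do this for several test sets, the alignment step can be wrapped in a small function. This is only a sketch in base R: align_columns is a hypothetical helper name, not part of caret or randomForest, and the toy data below is invented for illustration.

```r
# Hypothetical helper: return new_data with exactly the columns in train_cols.
align_columns <- function(new_data, train_cols, fill = 0) {
  new_data <- as.data.frame(new_data)
  # Words seen in training but absent here get the weight `fill`
  for (col in setdiff(train_cols, names(new_data))) {
    new_data[[col]] <- fill
  }
  # Drop words unseen in training and restore the training column order
  new_data[, train_cols, drop = FALSE]
}

# Toy example (made-up values): 'adapter' is missing, 'battery' is extra
toy_train_cols <- c("actually", "adapter")
toy_test <- data.frame(actually = c(0.02, 0.00), battery = c(0.00, 0.05))

aligned <- align_columns(toy_test, toy_train_cols)
print(aligned)
```

With the data from the question, the call would be align_columns(test_data, setdiff(names(train_data), "label...1")), and the result is what gets passed as newdata to predict.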
