Resampled data does not show any values for the target class after applying SMOTE


I work in ML, and I am trying to apply SMOTE to the PIDD dataset to predict diabetes.

from sklearn.model_selection import train_test_split
from imblearn.over_sampling import SMOTE
#os =  SMOTE()
X = exTrans.drop(['Outcome'], axis=1)
y = exTrans['Outcome']
sm = SMOTE(random_state=42, k_neighbors=5)
X_smote, y_smote = sm.fit_resample(X, y)

# Inspect the shapes of the resampled features and target
X_smote.shape, y_smote.shape

After resampling the data with the code above, I get the following output:

((1000, 8), (1000,))

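The first part of the output, X_smote.shape, looks fine. The y_smote.shape part looks incomplete, though: it shows only (1000,) and the second element is missing. I feel like I am missing something. Should there be a value there? If so, how do I get it?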

python machine-learning deep-learning artificial-intelligence smote
1 Answer

Test your data with the following code:

# SMOTE for imbalanced dataset
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE
# define dataset
X, y = make_classification(n_samples=10000, n_features=2, n_redundant=0,
                           n_clusters_per_class=1, weights=[0.99], flip_y=0,
                           random_state=1)
# summarize class distribution
counter = Counter(y)
print(counter)
# transform the dataset
oversample = SMOTE()
X_sm, y_sm = oversample.fit_resample(X, y)
# summarize the new class distribution
counter = Counter(y_sm)
print(counter)
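
On the shape itself: (1000,) is a complete shape, not a truncated one. A 1-D array or pandas Series has a single axis, so its shape is a one-element tuple, and the trailing comma is just how Python prints such a tuple. The sketch below illustrates this with a stand-in array; y_demo and the 500/500 class split are assumptions made for illustration, not values taken from the asker's data.

```python
from collections import Counter

import numpy as np

# Hypothetical stand-in for y_smote: 1000 labels, assumed to be 500 per class.
y_demo = np.array([0] * 500 + [1] * 500)

# A 1-D array has one axis, so its shape is a one-element tuple.
print(y_demo.shape)  # (1000,)
print(y_demo.ndim)   # 1

# The labels are all present; counting them shows the class balance.
class_counts = Counter(y_demo.tolist())
print(class_counts)

# Only needed if downstream code explicitly requires a 2-D column vector.
y_column = y_demo.reshape(-1, 1)
print(y_column.shape)  # (1000, 1)
```

If y_smote is a pandas Series, as is likely when y was taken from a DataFrame column, y_smote.value_counts() should give the same per-class counts without converting anything.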