I have a CSV dataset that I want to split into 5 parts so I can distribute it to 5 clients in a federated learning scenario. Here is my code:
train = train.iloc[:, 1:]
train = train.fillna(0)
train = train.rename(columns=lambda x: x.strip())
num_clients = 5
train_parts = np.array_split(train, num_clients)
train_data = []
for i in range(num_clients):
    # Convert the pandas dataframe into a PyTorch tensor
    data = torch.tensor(train_parts[i].iloc[:, :-1].values, dtype=torch.float32)
    labels = torch.tensor(train_parts[i].iloc[:, -1].values, dtype=torch.float32)
    scaler = MinMaxScaler()
    data = scaler.fit_transform(data.numpy())
    dataset = Dataset(data,labels)
    dataloader = DataLoader(dataset, batch_size=64, shuffle=True)
    train_data.append(dataloader)
I get this error message:
<ipython-input-10-5b003ffb66f7> in <module>
     35     # data = torch.from_numpy(data).float()
     36
---> 37     dataset = Dataset(data,labels)
     38     dataloader = DataLoader(dataset, batch_size=64, shuffle=True)
     39     train_data.append(dataloader)

/usr/lib/python3.8/typing.py in __new__(cls, *args, **kwds)
    873             obj = super().__new__(cls)
    874         else:
--> 875             obj = super().__new__(cls, *args, **kwds)
    876         return obj
    877

TypeError: object.__new__() takes exactly one argument (the type to instantiate)
Where is my mistake? Can I split the dataset and create data loaders like this, or is this approach completely wrong?
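
For reference, the traceback suggests that `Dataset` here is the abstract base class `torch.utils.data.Dataset`, which defines no constructor that accepts `(data, labels)`. The call therefore falls through to `object.__new__()` and fails. Below is a minimal sketch of the same split that wraps each shard in `torch.utils.data.TensorDataset` instead. Several things in it are assumptions rather than part of the original code: the helper name `make_client_loaders` is hypothetical, a synthetic NumPy array stands in for the CSV, and the min-max scaling is written out by hand in place of `MinMaxScaler` to keep the example self-contained.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset


def make_client_loaders(features, labels, num_clients=5, batch_size=64):
    """Split (features, labels) into num_clients shards, one DataLoader per shard.

    Hypothetical helper for illustration; not part of the original code.
    """
    loaders = []
    feature_parts = np.array_split(features, num_clients)
    label_parts = np.array_split(labels, num_clients)
    for x, y in zip(feature_parts, label_parts):
        # Min-max scale each shard to [0, 1] (hand-written stand-in for MinMaxScaler).
        col_min = x.min(axis=0)
        col_range = x.max(axis=0) - col_min
        col_range[col_range == 0] = 1.0  # avoid division by zero on constant columns
        x_scaled = (x - col_min) / col_range

        # TensorDataset is a concrete Dataset that accepts tensors directly.
        dataset = TensorDataset(
            torch.tensor(x_scaled, dtype=torch.float32),
            torch.tensor(y, dtype=torch.float32),
        )
        loaders.append(DataLoader(dataset, batch_size=batch_size, shuffle=True))
    return loaders


# Synthetic stand-in for the CSV: 103 rows, 4 feature columns, 1 label column.
rng = np.random.default_rng(0)
features = rng.normal(size=(103, 4))
labels = rng.integers(0, 2, size=103)

train_data = make_client_loaders(features, labels)
shard_sizes = [len(loader.dataset) for loader in train_data]
```

Splitting with `np.array_split` and building one `DataLoader` per client works as a general approach; the part that fails in the original is only the direct instantiation of the abstract `Dataset`. A custom subclass of `Dataset` that implements `__init__`, `__len__` and `__getitem__` would be an alternative to `TensorDataset`.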