PyG graph autoencoder loss is frozen, possibly a problem with how the Data object is assembled

Problem description

I am trying to use a graph autoencoder on a custom PyG Data object, but when I train it, the loss, AUC, and AP never change. The exact same autoencoder works when I use one of PyTorch Geometric's example Data objects, so I think I made a mistake somewhere while creating my custom Data object. I am using street network node/edge list data from Geoff Boeing, specifically the Aberdeen data from this example. Below is the process of building a Data object consisting of node features (x/y coordinates) and an edge index (source and destination nodes) from the node and edge CSVs (already loaded into the dataframes nodes_ab and edges_ab).

# (Imports added for completeness)
import torch
import torch_geometric.transforms as T
from torch_geometric.data import Data
from torch_geometric.nn import GAE, GCNConv

# Creating node feature tensors
node_features = nodes_ab[['x', 'y']].values
node_features = torch.tensor(node_features, dtype=torch.float)

# Creating edge index
edge_index = edges_ab[['source', 'dest']].values.T
edge_index = torch.tensor(edge_index, dtype=torch.long)

# Create data object
data = Data(x=node_features, edge_index=edge_index)

# Split data
transform = T.RandomLinkSplit(num_val=0.05,
                              num_test=0.1,
                              is_undirected=True,
                              add_negative_train_samples=True)
train_data, val_data, test_data = transform(data)

# Extract positive and negative edges for train, validation, and test sets
def get_pos_neg_edges(data):
    pos_edge_index = data.edge_label_index[:, data.edge_label == 1]
    neg_edge_index = data.edge_label_index[:, data.edge_label == 0]
    return pos_edge_index, neg_edge_index

train_pos_edge_index, train_neg_edge_index = get_pos_neg_edges(train_data)
val_pos_edge_index, val_neg_edge_index = get_pos_neg_edges(val_data)
test_pos_edge_index, test_neg_edge_index = get_pos_neg_edges(test_data)

# Add these to the data object
data.train_pos_edge_index = train_pos_edge_index
data.train_neg_edge_index = train_neg_edge_index
data.val_pos_edge_index = val_pos_edge_index
data.val_neg_edge_index = val_neg_edge_index
data.test_pos_edge_index = test_pos_edge_index
data.test_neg_edge_index = test_neg_edge_index


# Create encoder and autoencoder
class GCNEncoder(torch.nn.Module):
    def __init__(self, in_channels, out_channels):
        super(GCNEncoder, self).__init__()
        self.conv1 = GCNConv(in_channels, 2 * out_channels, cached=True) # cached only for transductive learning
        self.conv2 = GCNConv(2 * out_channels, out_channels, cached=True) # cached only for transductive learning

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index).relu()
        return self.conv2(x, edge_index)

# parameters
out_channels = 2
num_features = data.num_features
epochs = 100

# model
model = GAE(GCNEncoder(num_features, out_channels))

# move to GPU (if available)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)
x = data.x.to(device)
train_pos_edge_index = data.train_pos_edge_index.to(device)

# initialize the optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=0.03)

def train():
    model.train()
    optimizer.zero_grad()
    z = model.encode(x, train_pos_edge_index)
    loss = model.recon_loss(z, train_pos_edge_index)
    loss.backward()
    optimizer.step()
    print(f"Training loss: {loss.item()}")
    return float(loss)

def test(pos_edge_index, neg_edge_index):
    model.eval()
    with torch.no_grad():
        z = model.encode(x, train_pos_edge_index)
    auc, ap = model.test(z, pos_edge_index, neg_edge_index)
    return auc, ap

# Train the model
for epoch in range(1, epochs + 1):
    loss = train()

    auc, ap = test(data.test_pos_edge_index, data.test_neg_edge_index)
    print('Epoch: {:03d}, AUC: {:.4f}, AP: {:.4f}\n _________________________'.format(epoch, auc, ap))

Sorry for the huge wall of code! I don't know where the problem is occurring, so I wanted to include everything. This is the output from running it:

Training loss: 34.538780212402344
Epoch: 001, AUC: 0.5000, AP: 0.5000

Training loss: 34.538780212402344
Epoch: 002, AUC: 0.5000, AP: 0.5000

Training loss: 34.538780212402344
Epoch: 003, AUC: 0.5000, AP: 0.5000

and so on, for all 100 epochs.

I have tried incorporating node labels (y), using dummy variables for the node features, using node indices as node labels, and relabeling the index numbers. I still get the same problem. I found two other people who experienced this here, but their issue was resolved after incorporating (non-index) node features, which did not work for me. Thanks in advance, and please let me know if anything is unclear or if more information is needed; this is my first time posting on Stack Overflow.
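For reference, one quick diagnostic (a hypothetical check, not something from my code above) is to verify that every entry of the edge index is a valid 0-based row into the node feature matrix; OSM-derived edge lists often carry large, non-contiguous node ids:

```python
# Hypothetical diagnostic: every edge endpoint must be a valid
# 0-based row index into the node feature matrix.
def out_of_range_endpoints(edge_endpoints, num_nodes):
    """Return endpoints that are not valid row indices in [0, num_nodes)."""
    return [i for i in edge_endpoints if not 0 <= i < num_nodes]

# Toy example: 3 nodes, but one edge still carries a raw OSM-style id.
print(out_of_range_endpoints([0, 1, 12345678, 2], num_nodes=3))  # [12345678]
```

If this returns anything non-empty for the real edge_index, the convolution is gathering features for nodes that do not exist.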

autoencoder pytorch-geometric data-objects graph-neural-network
1 Answer

I tried to reproduce your problem. I got stuck on the dataset you referenced, which has no source or dest fields. I tried using u and v instead, but that caused an index-out-of-range error in the convolutional layer. Could you share how you preprocessed the data?
