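I am trying to implement a stacked autoencoder with TensorFlow. I use the MNIST dataset and try to reduce the dimensionality from 784 to 2. I have already done this with Keras, and the result is good (training error is about 0.04). However, the result of this TensorFlow code is poor (training error is about 0.4). I don't know why the results are so different. Can you tell me what I should change?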

from keras.datasets import mnist
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')

x_train = x_train.reshape(60000,28*28)
x_test = x_test.reshape(10000,28*28)

x_train  = x_train/255.0
x_test = x_test/255.0

x_ = tf.placeholder('float32', [None,784])

w1 = tf.Variable(tf.random_normal([784,512]))
b1 = tf.Variable(tf.random_normal([512]))
h1 = tf.nn.relu(tf.matmul(x_, w1) + b1)

w2 = tf.Variable(tf.random_normal([512,128]))
b2 = tf.Variable(tf.random_normal([128]))
h2 = tf.nn.relu(tf.matmul(h1, w2) + b2)

w0 = tf.Variable(tf.random_normal([128,2]))
b0 = tf.Variable(tf.random_normal([2]))
h0 = tf.matmul(h2, w0) + b0

w_1 = tf.Variable(tf.random_normal([2,128]))
b_1 = tf.Variable(tf.random_normal([128]))
h_1 = tf.nn.relu(tf.matmul(h0, w_1) + b_1)

w_2 = tf.Variable(tf.random_normal([128,512]))
b_2 = tf.Variable(tf.random_normal([512]))
h_2 = tf.nn.relu(tf.matmul(h_1, w_2) + b_2)

w_0 = tf.Variable(tf.random_normal([512,784]))
b_0 = tf.Variable(tf.random_normal([784]))
h_0 = tf.nn.sigmoid(tf.matmul(h_2, w_0) + b_0)

cost = tf.reduce_mean(tf.pow(x_-h_0,2))
train_step = tf.train.AdamOptimizer(0.05).minimize(cost)
init = tf.global_variables_initializer()

batch_count = int(x_train.shape[0]/256)

sess = tf.Session()
sess.run(init)  # variables must be initialized before the first training step
for i in range(2) :
    total_cost = 0
    for j in range(batch_count) :
        batch_xs = x_train[j*256:j*256 + 256, :]  # slice by batch index j, not epoch index i
        _, cost_val = sess.run([train_step, cost], feed_dict = {x_ : batch_xs})
        total_cost = total_cost + cost_val
        if j % 20 == 0 :
            print('epoch : %d ,%d / %d , loss : %f , average_loss : %f' %(i+1, j+1, batch_count, cost_val, total_cost/(j+1)))

hidden1, hidden2, encoder = sess.run([h1,h2,h0],feed_dict={x_ : x_test})
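The final encoding step has no activation (h0 = tf.matmul(h2, w0) + b0), so the loss stays at about 0.48. After applying a ReLU activation there, i.e. replacing

h0 = tf.matmul(h2, w0) + b0

with

h0 = tf.nn.relu(tf.matmul(h2, w0) + b0)

the loss drops to 0.06 in just two epochs.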

[Image: result after adding an activation in the last layer of the encoder]
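
Separately from the activation, it is worth checking that each step of the inner loop reads a different slice of the training set, which means indexing by the batch counter j rather than the epoch counter i. Below is a minimal sketch of that slicing in plain Python. The helper name batch_slices is hypothetical (it is not part of TensorFlow or Keras), and a short list stands in for x_train.

```python
def batch_slices(n_samples, batch_size):
    """Yield (start, stop) index pairs for consecutive full batches.

    Hypothetical helper for illustration only. Like int(n / 256) in the
    question, it drops the final partial batch.
    """
    n_batches = n_samples // batch_size
    for j in range(n_batches):
        start = j * batch_size  # indexed by the batch counter j
        yield start, start + batch_size


# Toy usage: a list of 10 items stands in for x_train, batch size 4.
data = list(range(10))
batches = [data[a:b] for a, b in batch_slices(len(data), 4)]
print(batches)  # [[0, 1, 2, 3], [4, 5, 6, 7]]
```

With the sizes from the question (60000 samples, batch size 256) this gives 234 distinct batches per epoch, each of which could be passed as feed_dict={x_: x_train[start:stop]}.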



