Using a softmax layer in the objective function itself

Question

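I have a regular CNN-like network with standard MLP layers on top. On top of the MLP I also have a softmax layer, but unlike in a conventional network it is not fully connected to the MLP below, and it is made up of subgroups.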

To describe the softmax further, it looks like this:

Neur1A Neur2A ... NeurNA      Neur1B Neur2B ... NeurNB   Neur1C Neur2C ...NeurNC
        Group A                           Group B                Group C

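There are more groups than shown. Each group has its own softmax, independent of the other groups. So in a way it is several independent classifications (although in reality it is not).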

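What I need is for the index of the activated neuron to increase monotonically across groups. For example, if Neuron5 is activated in Group A, I want the activated neuron in Group B to have index >= 5. The same holds between Group B and Group C, and so on.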
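To make the setup concrete, here is a minimal pure-Python sketch of independent per-group softmaxes and the monotonicity check on the winning indices. The helper names (`grouped_softmax`, `active_indices`, `is_monotonic`) and the example logits are hypothetical and only illustrate the idea; they are not taken from the network in this post.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a single group.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def grouped_softmax(logits, group_size):
    # Split a flat list of logits into consecutive groups and
    # normalise each group independently of the others.
    if len(logits) % group_size != 0:
        raise ValueError("number of logits must be a multiple of group_size")
    return [softmax(logits[start:start + group_size])
            for start in range(0, len(logits), group_size)]

def active_indices(groups):
    # Index of the most active neuron within each group.
    return [max(range(len(g)), key=g.__getitem__) for g in groups]

def is_monotonic(indices):
    # True when the winning index never decreases from one group to the next.
    return all(a <= b for a, b in zip(indices, indices[1:]))

# Three groups (A, B, C) of three neurons each; the values are made up.
logits = [0.1, 2.0, 0.3,   0.0, 0.5, 1.5,   3.0, 0.2, 0.1]
groups = grouped_softmax(logits, group_size=3)
active = active_indices(groups)
print(active, is_monotonic(active))
```

In this example Group C's winning index falls below Group B's, which is exactly the kind of violation the extra loss term is meant to penalize.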

This softmax layer, which contains all the neurons of all the groups, is not actually my last layer; interestingly, it is an intermediate layer.

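To achieve this monotonicity, I added another term to the loss function that penalizes non-monotonic indices of the activated neurons. Here is some code: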

Code for the softmax layer and its output:

def compute_image_estimate(layer2_input):
    # Collects one column of predicted indices per group (one group per pixel).
    estimated_yps= tf.zeros([FLAGS.batch_size,0],dtype=tf.int64)
    for pix in xrange(NUM_CLASSES):
        pixrow= int( pix/width)
        rowdata= image_pixels[:,  pixrow*width:(pixrow+1)*width]
    
        with tf.variable_scope('layer2_'+'_'+str(pix)) as scope:
            weights = _variable_with_weight_decay('weights', shape=[layer2_input.get_shape()[1], width],   stddev=0.04, wd=0.0000000)
            biases = _variable_on_cpu('biases', [width], tf.constant_initializer(0.1))
            # Independent softmax over the `width` neurons of this group.
            y = tf.nn.softmax(tf.matmul(layer2_input,weights) + biases)
            # Index of the most active neuron, counted from the far end.
            argyp=width-1-tf.argmax(y,1)
            argyp= tf.reshape(argyp,[FLAGS.batch_size,1])
        estimated_yps=tf.concat(1,[estimated_yps,argyp])

    # Return only after every group has been appended.
    return estimated_yps

estimated_yps is passed to a function that quantifies monotonicity:

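def compute_monotonicity(yp):
    # Per-sample penalty: squared negative part of the difference between
    # horizontally adjacent predictions, summed over every row.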
    sm= tf.zeros([FLAGS.batch_size])

    for curr_row in xrange(height):
        for curr_col in xrange(width-1):
            pix= curr_row *width + curr_col
            sm=sm+alpha * tf.to_float(tf.square(tf.minimum(0,tf.to_int32(yp[:,pix]-yp[:,pix+1]))))

    return sm
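
Written out for a single sample, the penalty computed by this function appears to be the following, where H and W stand for `height` and `width` and y_p for `yp[:, p]` (these symbols are introduced here only for illustration):

```latex
s \;=\; \alpha \sum_{r=0}^{H-1} \sum_{c=0}^{W-2}
      \Bigl[\min\bigl(0,\; y_{rW+c} - y_{rW+c+1}\bigr)\Bigr]^{2}
```

For instance, with H = 1, W = 5 and y = (5, 4, 4, 6, 2), the adjacent differences are 1, 0, -2 and 4. Only the -2 survives the min, so s = 4α.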

The loss function is:

def loss(estimated_yp, SOME_OTHER_THINGS):
    tf.add_to_collection('losses', SOME_OTHER_THINGS)

    monotonicity_metric= tf.reduce_mean( compute_monotonicity(estimated_yp) )
    tf.add_to_collection('losses', monotonicity_metric)
    return tf.add_n(tf.get_collection('losses'), name='total_loss')

Now my problem is that when I leave out SOME_OTHER_THINGS (the conventional loss terms) and use only the monotonicity metric, I get:

ValueError: No gradients provided for any variable

It seems that no gradient is defined when the softmax layer's output is used like this.

Am I doing something wrong?

Answer 1

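Sorry, I realized that the problem is that tf.argmax apparently has no gradient defined. It returns integer indices, and since the monotonicity term depends on the weights only through those indices, there is no path along which a gradient could reach any variable.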
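
One common workaround, which this post itself does not propose, is to replace the hard index with the expected index under the softmax (often called a soft-argmax), because that quantity varies smoothly with the logits. The pure-Python sketch below is only an illustration under that assumption: the function names, the `temperature` parameter and the example logits are hypothetical, and a finite difference stands in for the gradient.

```python
import math

def softmax(logits):
    # Numerically stable softmax over one group.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def hard_index(logits):
    # Plain argmax: piecewise constant in the logits.
    return max(range(len(logits)), key=lambda i: logits[i])

def soft_index(logits, temperature=1.0):
    # Expected index under the softmax: a smooth stand-in for argmax.
    probs = softmax([v / temperature for v in logits])
    return sum(i * p for i, p in enumerate(probs))

def monotonicity_penalty(group_logits, index_fn, alpha=1.0):
    # Penalise every adjacent pair of groups whose index decreases.
    idx = [index_fn(g) for g in group_logits]
    return alpha * sum(max(0.0, a - b) ** 2 for a, b in zip(idx, idx[1:]))

def finite_difference(fn, group_logits, g, k, eps=1e-4):
    # Forward-difference estimate of d fn / d group_logits[g][k].
    bumped = [list(row) for row in group_logits]
    bumped[g][k] += eps
    return (fn(bumped) - fn(group_logits)) / eps

# Group A peaks at index 2, Group B at index 0: a monotonicity violation.
groups = [[0.0, 0.0, 3.0], [2.0, 0.0, 0.0]]

hard_penalty = monotonicity_penalty(groups, hard_index)
hard_grad = finite_difference(
    lambda g: monotonicity_penalty(g, hard_index), groups, 0, 2)
soft_grad = finite_difference(
    lambda g: monotonicity_penalty(g, soft_index), groups, 0, 2)

print("hard penalty:", hard_penalty)
print("sensitivity with argmax:", hard_grad)
print("sensitivity with soft index:", soft_grad)
```

A small change in a logit leaves the argmax-based penalty unchanged, whereas the soft version responds to it, which is what an optimizer needs. In TensorFlow the same expected index could presumably be built from `tf.nn.softmax` and `tf.reduce_sum` over an index vector, at the cost of the penalty acting on a fractional index rather than the exact winner.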
