How to compute Spearman correlation in TensorFlow


Problem

I need to compute both the Pearson and Spearman correlations and use them as metrics in TensorFlow.

For Pearson, it's trivial:

tf.contrib.metrics.streaming_pearson_correlation(y_pred, y_true)

But for Spearman, I'm clueless!

What I tried:

From this answer:

    samples = 1
    predictions_rank = tf.nn.top_k(y_pred, k=samples, sorted=True, name='prediction_rank').indices
    real_rank = tf.nn.top_k(y_true, k=samples, sorted=True, name='real_rank').indices
    rank_diffs = predictions_rank - real_rank
    rank_diffs_squared_sum = tf.reduce_sum(rank_diffs * rank_diffs)
    six = tf.constant(6)
    one = tf.constant(1.0)
    numerator = tf.cast(six * rank_diffs_squared_sum, dtype=tf.float32)
    divider = tf.cast(samples * samples * samples - samples, dtype=tf.float32)
    spearman_batch = one - numerator / divider

But this returns NaN...
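A likely cause of the NaN: the snippet sets samples = 1, so the denominator samples³ - samples is 0, and with k=1 every index returned by top_k is 0, so the numerator is 0 as well. In float32 arithmetic 0/0 evaluates to NaN. Also, top_k(...).indices gives the positions of the largest values, which is not the same thing as the rank of each element. The pure-Python sketch below only illustrates the rank-difference formula with n set to the number of samples; the helper name spearman_from_ranks is hypothetical, and it assumes ranks without ties.

```python
def spearman_from_ranks(rank_a, rank_b):
    """Spearman's rho from two lists of distinct integer ranks (no ties)."""
    n = len(rank_a)
    denominator = n * n * n - n  # n(n^2 - 1); zero when n == 1
    if denominator == 0:
        # Mirrors the 0/0 that float arithmetic in TensorFlow evaluates to NaN.
        return float("nan")
    d_squared_sum = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - 6.0 * d_squared_sum / denominator


samples = 1
zero_denominator = samples * samples * samples - samples  # what the snippet divides by

same_order = spearman_from_ranks([0, 1, 2, 3], [0, 1, 2, 3])
reversed_order = spearman_from_ranks([0, 1, 2, 3], [3, 2, 1, 0])
single_sample = spearman_from_ranks([0], [0])
```

With four samples the formula gives 1.0 for identical rankings and -1.0 for fully reversed ones; with a single sample the result is undefined.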


Following the definition on Wikipedia (the Spearman coefficient is the Pearson correlation of the rank variables), I tried:

size = tf.size(y_pred)
indice_of_ranks_pred = tf.nn.top_k(y_pred, k=size)[1]
indice_of_ranks_label = tf.nn.top_k(y_true, k=size)[1]
rank_pred = tf.nn.top_k(-indice_of_ranks_pred, k=size)[1]
rank_label = tf.nn.top_k(-indice_of_ranks_label, k=size)[1]
rank_pred = tf.to_float(rank_pred)
rank_label = tf.to_float(rank_label)
spearman = tf.contrib.metrics.streaming_pearson_correlation(rank_pred, rank_label)

But running this, I get the following error:

tensorflow.python.framework.errors_impl.InvalidArgumentError: input must have at least k columns. Had 1, needed 32

[[{{node metrics/spearman/TopKV2}} = TopKV2[T=DT_FLOAT, sorted=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](lambda_1/add, metrics/pearson/pearson_r/variance_predictions/Size)]]
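The message "Had 1, needed 32" suggests that y_pred reaches the metric with shape (32, 1): top_k sorts along the last axis, which has only one column, while k is the total number of elements. Flattening the tensor first should avoid this. A minimal sketch, assuming a TensorFlow version with eager execution and tf.math.top_k; the sample values are made up for illustration:

```python
import tensorflow as tf

# Hypothetical batch shaped like a typical Keras regression output: (batch, 1).
y_pred = tf.constant([[0.3], [0.9], [0.1], [0.5]])

# Flatten to shape (4,) so that the last axis holds every sample.
flat = tf.reshape(y_pred, [-1])

# Indices of the samples ordered by descending value.
order = tf.math.top_k(flat, k=tf.size(flat)).indices
```

Without the reshape, the same top_k call on the (4, 1) tensor would fail with the error above, because the last axis has one column and k is 4.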

python python-3.x tensorflow metrics
1 Answer

One thing you can do is use TensorFlow's tf.py_function together with scipy.stats.spearmanr, defining the inputs and outputs as follows:

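import tensorflow as tf
from scipy.stats import spearmanr

def get_spearman_rankcor(y_true, y_pred):
    # spearmanr returns (correlation, p-value); keep only the correlation.
    return tf.py_function(
        lambda a, b: spearmanr(a.numpy().ravel(), b.numpy().ravel())[0],
        [tf.cast(y_pred, tf.float32), tf.cast(y_true, tf.float32)],
        Tout=tf.float32)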
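If you would rather stay inside TensorFlow ops, one possible alternative is to rank both tensors with a double argsort and take the Pearson correlation of the ranks. This is only a sketch under assumptions: the function name spearman_correlation is made up here, it assumes a TensorFlow version that provides tf.argsort and tf.math.divide_no_nan, and tied values are ranked by sort position rather than averaged, so results can differ from scipy.stats.spearmanr when ties occur.

```python
import tensorflow as tf


def spearman_correlation(y_true, y_pred):
    """Spearman's rho for one batch; ties are ranked by position, not averaged."""
    # Flatten so that sorting runs over all samples, not over a last axis of size 1.
    y_true = tf.reshape(tf.cast(y_true, tf.float32), [-1])
    y_pred = tf.reshape(tf.cast(y_pred, tf.float32), [-1])

    # argsort of argsort gives each element's rank (0 = smallest value).
    rank_true = tf.cast(tf.argsort(tf.argsort(y_true)), tf.float32)
    rank_pred = tf.cast(tf.argsort(tf.argsort(y_pred)), tf.float32)

    # Pearson correlation of the two rank vectors.
    dt = rank_true - tf.reduce_mean(rank_true)
    dp = rank_pred - tf.reduce_mean(rank_pred)
    denom = tf.sqrt(tf.reduce_sum(dt * dt) * tf.reduce_sum(dp * dp))
    # divide_no_nan returns 0 instead of NaN when one input is constant.
    return tf.math.divide_no_nan(tf.reduce_sum(dt * dp), denom)


# Made-up batch with shape (5, 1); only one adjacent pair is out of order.
y_true = tf.constant([[1.0], [2.0], [3.0], [4.0], [5.0]])
y_pred = tf.constant([[0.1], [0.4], [0.35], [0.8], [0.9]])
rho = spearman_correlation(y_true, y_pred)
```

Two caveats: ranking is not differentiable, so this is suitable as a metric but not as a loss, and a value computed per batch and then averaged is generally not the same as the Spearman correlation over the whole dataset.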