Error in the call() function of a custom subclassed TensorFlow MoE model


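I want to implement a custom mixture-of-experts (MoE) model by subclassing keras.Model. In the call() function, the model takes weights from another model (the gating model) and does arithmetic with them. I am having trouble with the weights in particular.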

class MoE(keras.Model):
    def __init__(self, gating_model, expert_models):
        super(MoE, self).__init__()
        self.experts = expert_models
        for expert in self.experts:
            expert.trainable = False
        self.gating = gating_model
    def call(self, x):
        outputs = tf.concat([tf.expand_dims(expert_model(tf.expand_dims(x, axis=0)), axis=2) for expert_model in expert_models], axis=2)
        gating_weights = [layer.get_weights() for layer in gating_model.layers]
        gating_weights = tf.reshape(gating_weights, tf.shape(outputs))
        return tf.reduce_sum(outputs * gating_weights, dim=2)
    def get_config(self):
        config = super().get_config()
        config.update({"experts": self.experts, "gating": self.gating_model})
        return config

This throws:

--> 214         gating_weights = [layer.get_weights() for layer in gating_model.layers]
    215         gating_weights = tf.reshape(gating_weights, tf.shape(outputs))

NotImplementedError: Exception encountered when calling MoE.call().

numpy() is only available when eager execution is enabled.

Arguments received by MoE.call():
  • x=tf.Tensor

This is not very self-explanatory, since I am not using numpy anywhere. I would be grateful if anyone could help solve this.

python tensorflow keras deep-learning tensorflow2.0
1 Answer

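The error comes from layer.get_weights(): it returns the layer's parameters as NumPy arrays, which needs eager execution, while Keras is running call() in graph mode here. More importantly, the gating model's parameters are not the mixture weights. What you want is the gating model's output for x, which is a tensor. The original call() also used the globals expert_models and gating_model instead of self.experts and self.gating, passed dim= instead of axis= to tf.reduce_sum, and get_config() referred to self.gating_model, which is never set.

Here is the corrected code: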

import tensorflow as tf
from tensorflow import keras

class MoE(keras.Model):
    def __init__(self, gating_model, expert_models):
        super(MoE, self).__init__()
        self.experts = expert_models
        for expert in self.experts:
            expert.trainable = False
        self.gating = gating_model

    def call(self, x):
        # Collect outputs from the expert models
        expert_outputs = tf.concat(
            [tf.expand_dims(expert_model(tf.expand_dims(x, axis=0)), axis=2) for expert_model in self.experts],
            axis=2
        )

        # Get gating model output which is a tensor, not weights
        gating_output = self.gating(x)
        gating_output = tf.expand_dims(gating_output, axis=-1)

        # Perform weighted sum using gating output as the weights
        return tf.reduce_sum(expert_outputs * gating_output, axis=2)

    def get_config(self):
        config = super().get_config()
        config.update({"experts": self.experts, "gating": self.gating})
        return config
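
To check the shapes involved in the weighted sum, here is a minimal NumPy sketch of the arithmetic only, not the Keras model itself. It assumes batched input, so that each expert returns an array of shape (batch, out_dim) and the gating model returns one weight per expert of shape (batch, num_experts), for example from a softmax layer. The function name mix_experts and the random data are made up for illustration.

```python
import numpy as np

def mix_experts(expert_outputs, gate_probs):
    """Weighted sum of expert outputs (hypothetical helper for illustration).

    expert_outputs: list of arrays, one per expert, each (batch, out_dim)
    gate_probs:     array (batch, num_experts), one weight per expert per sample
    """
    stacked = np.stack(expert_outputs, axis=1)   # (batch, num_experts, out_dim)
    weights = gate_probs[:, :, np.newaxis]       # (batch, num_experts, 1)
    # Broadcasting scales every expert's output by its gate weight,
    # then the expert axis is summed away.
    return np.sum(stacked * weights, axis=1)     # (batch, out_dim)

rng = np.random.default_rng(0)
batch, out_dim, num_experts = 4, 3, 2

# Stand-ins for the expert model outputs
expert_outputs = [rng.normal(size=(batch, out_dim)) for _ in range(num_experts)]

# Stand-in for the gating output: softmax over the expert axis,
# so each sample's weights sum to 1
logits = rng.normal(size=(batch, num_experts))
gate_probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

mixed = mix_experts(expert_outputs, gate_probs)
print(mixed.shape)  # (4, 3)
```

Inside call(), the same steps could be written with tf.stack and tf.reduce_sum. Whether the extra tf.expand_dims(x, axis=0) is needed depends on whether x already has a batch dimension.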