After the hidden-layer activation, I want to create an auxiliary matrix that better captures the temporal aspect of the data in the snippet below. The return variable currently has shape [out_channels, out_channels], but I would like it to have shape [input_channels, out_channels]. Which part of the code should I modify to get the desired output while keeping the idea/logic unchanged?
def my_fun(self, H: torch.FloatTensor) -> torch.FloatTensor:
    self.input_channels = 4
    self.out_channels = 16
    self.forgettingFactor = 0.92
    self.lamb = 0.01
    self.M = torch.inverse(self.lamb * torch.eye(self.out_channels))
    HH = self.calculateHiddenLayerActivation(H)  # [4, 16]
    Ht = HH.t()  # [16, 4]
    ###### Computation of auxiliary matrix
    initial_product = torch.mm((1 / self.forgettingFactor) * self.M, Ht)  # [16, 4]
    intermediate_matrix = torch.mm(HH, initial_product)  # [4, 4]
    sum_inside_pseudoinverse = torch.eye(self.input_channels) + intermediate_matrix  # [4, 4]
    pseudoinverse_sum = torch.pinverse(sum_inside_pseudoinverse)  # [4, 4]
    product_inside_expression = torch.mm(HH, (1 / self.forgettingFactor) * self.M)  # [4, 16]
    dot_product_pseudo = torch.mm(pseudoinverse_sum, product_inside_expression)  # [4, 16]
    dot_product_with_hidden_matrix = torch.mm(Ht, dot_product_pseudo)  # [16, 16]
    res = (1 / self.forgettingFactor) * self.M - torch.mm((1 / self.forgettingFactor) * self.M, dot_product_with_hidden_matrix)  # [16, 16]
    return res
Your current logic yields a [out_channels, out_channels] shape because of the matrix multiplications involved.
The shape of a matrix product depends on the shapes of the operands. Specifically, multiplying a matrix A of shape [m, n] by a matrix B of shape [n, p] produces a matrix C of shape [m, p] (this is exactly what torch.mm() does).
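As a quick sanity check of that rule (a minimal sketch with random tensors, using the same [4, 16] / [16, 4] shapes as the code below):

```python
import torch

# torch.mm: [m, n] @ [n, p] -> [m, p]
A = torch.randn(4, 16)   # [m, n] = [4, 16]
B = torch.randn(16, 4)   # [n, p] = [16, 4]
C = torch.mm(A, B)
print(C.shape)  # torch.Size([4, 4])
```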
initial_product = torch.mm((1 / self.forgettingFactor) * self.M, Ht)  # [16, 4]

self.M is [out_channels, out_channels] and Ht is [out_channels, input_channels], so initial_product is [out_channels, input_channels].
intermediate_matrix = torch.mm(HH, initial_product)  # [4, 4]

HH is [input_channels, out_channels] and initial_product is [out_channels, input_channels], so intermediate_matrix is [input_channels, input_channels].
dot_product_with_hidden_matrix = torch.mm(Ht, dot_product_pseudo)  # [16, 16]

Ht is [out_channels, input_channels] and dot_product_pseudo is [input_channels, out_channels], so dot_product_with_hidden_matrix is [out_channels, out_channels].
res = (1 / self.forgettingFactor) * self.M - torch.mm((1 / self.forgettingFactor) * self.M, dot_product_with_hidden_matrix)  # [16, 16]

The subtraction is between two matrices of shape [out_channels, out_channels], so the final res is also [out_channels, out_channels].
Therefore, to obtain a final output of shape [input_channels, out_channels], you need to adjust the matrix-multiplication steps so that the dimensions of the matrices in the final operation multiply out to [input_channels, out_channels].
def my_fun(self, H: torch.FloatTensor) -> torch.FloatTensor:
    # [Previous code remains the same]
    # Adjusted computation
    initial_product = torch.mm((1 / self.forgettingFactor) * self.M, Ht)  # [16, 4]
    intermediate_matrix = torch.mm(HH, initial_product)  # [4, 4]
    sum_inside_pseudoinverse = torch.eye(self.input_channels) + intermediate_matrix  # [4, 4]
    pseudoinverse_sum = torch.pinverse(sum_inside_pseudoinverse)  # [4, 4]
    product_inside_expression = torch.mm(HH, (1 / self.forgettingFactor) * self.M)  # [4, 16]
    # Modified part
    res = torch.mm(pseudoinverse_sum, product_inside_expression)  # [4, 16]
    return res
pseudoinverse_sum has shape [input_channels, input_channels] (i.e. [4, 4]). product_inside_expression is still [input_channels, out_channels] (i.e. [4, 16]). res is obtained by multiplying pseudoinverse_sum and product_inside_expression, giving a shape of [input_channels, out_channels] (i.e. [4, 16]). You drop the last few steps that produced the [out_channels, out_channels] shape and compute res directly as the product of pseudoinverse_sum and product_inside_expression.
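Putting it together, here is a runnable sketch of the modified function. Since calculateHiddenLayerActivation is not shown in the question, it is stubbed as the identity here, and the class name Sketch is purely illustrative:

```python
import torch

class Sketch:
    def __init__(self):
        self.input_channels = 4
        self.out_channels = 16
        self.forgettingFactor = 0.92
        self.lamb = 0.01
        self.M = torch.inverse(self.lamb * torch.eye(self.out_channels))

    def calculateHiddenLayerActivation(self, H):
        # Stub: assume H is already the [input_channels, out_channels] activation.
        return H

    def my_fun(self, H: torch.FloatTensor) -> torch.FloatTensor:
        HH = self.calculateHiddenLayerActivation(H)  # [4, 16]
        Ht = HH.t()                                  # [16, 4]
        initial_product = torch.mm((1 / self.forgettingFactor) * self.M, Ht)  # [16, 4]
        intermediate_matrix = torch.mm(HH, initial_product)                   # [4, 4]
        sum_inside_pseudoinverse = torch.eye(self.input_channels) + intermediate_matrix  # [4, 4]
        pseudoinverse_sum = torch.pinverse(sum_inside_pseudoinverse)          # [4, 4]
        product_inside_expression = torch.mm(HH, (1 / self.forgettingFactor) * self.M)   # [4, 16]
        # Modified part: stop here instead of multiplying back up to [16, 16].
        return torch.mm(pseudoinverse_sum, product_inside_expression)         # [4, 16]

res = Sketch().my_fun(torch.randn(4, 16))
print(res.shape)  # torch.Size([4, 16])
```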