Changing Conv2DTranspose output from (None, 39, 39, 1) to (None, 40, 40, 1)


I am implementing a decoder (a type of artificial neural network) with Keras:

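# Imports assumed; the original post does not show them.
from tensorflow import keras
from tensorflow.keras import layers

latent_dim = 25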
latent_inputs = keras.Input(shape=(latent_dim,))

x = layers.Dense(units=100, activation="relu")(latent_inputs)
x = layers.Dense(units=1024, activation="relu")(x)
x = layers.Dense(units=4096, activation="relu")(x)
x = layers.Reshape((4, 4, 256))(x)
x = layers.Conv2DTranspose(filters=256, kernel_size=3, activation="relu", strides=2, padding="same")(x)
x = layers.Conv2DTranspose(filters=128, kernel_size=3, activation="relu", strides=1, padding="same")(x)
x = layers.Conv2DTranspose(filters=128, kernel_size=3, activation="relu", strides=2, padding="same")(x)
x = layers.Conv2DTranspose(filters=64, kernel_size=3, activation="relu", strides=1, padding="same")(x)
x = layers.Conv2DTranspose(filters=64, kernel_size=3, activation="relu", strides=2, padding="same")(x)
decoder_outputs = layers.Conv2DTranspose(filters=1, kernel_size=3, activation="sigmoid", padding="same")(x)

decoder = keras.Model(latent_inputs, decoder_outputs, name="decoder")

decoder.summary()

Its output is:

Model: "decoder"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 25)]              0         
                                                                 
 dense (Dense)               (None, 100)               2600      
                                                                 
 dense_1 (Dense)             (None, 1024)              103424    
                                                                 
 dense_2 (Dense)             (None, 4096)              4198400   
                                                                 
 reshape (Reshape)           (None, 4, 4, 256)         0         
                                                                 
 conv2d_transpose (Conv2DTr  (None, 8, 8, 256)         590080    
 anspose)                                                        
                                                                 
 conv2d_transpose_1 (Conv2D  (None, 8, 8, 128)         295040    
 Transpose)                                                      
                                                                 
 conv2d_transpose_2 (Conv2D  (None, 16, 16, 128)       147584    
 Transpose)                                                      
                                                                 
 conv2d_transpose_3 (Conv2D  (None, 16, 16, 64)        73792     
 Transpose)                                                      
                                                                 
 conv2d_transpose_4 (Conv2D  (None, 32, 32, 64)        36928     
 Transpose)                                                      
                                                                 
 conv2d_transpose_5 (Conv2D  (None, 32, 32, 1)         577       
 Transpose)                                                      
                                                                 

I would like to adjust my model so that decoder_outputs has shape (None, 40, 40, 1) instead of (None, 32, 32, 1). This is what I tried:

latent_dim = 25
latent_inputs = keras.Input(shape=(latent_dim,))

x = layers.Dense(units=100, activation="relu")(latent_inputs)
x = layers.Dense(units=1024, activation="relu")(x)
x = layers.Dense(units=1600, activation="relu")(x)  # Adjusted units to match 40*40*1
x = layers.Reshape((40, 40, 1))(x)  # Reshaped to (40, 40, 1)
x = layers.Conv2DTranspose(filters=128, kernel_size=3, activation="relu", strides=2, padding="same")(x)
x = layers.Conv2DTranspose(filters=64, kernel_size=3, activation="relu", strides=1, padding="same")(x)
x = layers.Conv2DTranspose(filters=64, kernel_size=3, activation="relu", strides=2, padding="same")(x)
decoder_outputs = layers.Conv2DTranspose(filters=1, kernel_size=3, activation="sigmoid", padding="same")(x)

decoder = keras.Model(latent_inputs, decoder_outputs, name="decoder")

decoder.summary()

Unfortunately, the decoder_outputs shape is then (None, 160, 160, 1).

Can you help me?
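
The 160 follows from how padding="same" behaves in a transposed convolution: with the default settings (no output_padding), each such layer multiplies the spatial size by its stride, so the stack can only upsample. A minimal pure-Python sketch of that arithmetic, where trace_same is a hypothetical helper written for illustration and not part of Keras:

```python
def trace_same(start, strides):
    """Spatial size after each padding="same" Conv2DTranspose layer.

    Assumes Keras defaults (no output_padding), where a "same" layer
    outputs input_size * stride.
    """
    sizes = [start]
    for stride in strides:
        sizes.append(sizes[-1] * stride)
    return sizes


# Strides of the four transposed convolutions in the attempt above.
attempt = trace_same(40, [2, 1, 2, 1])
print(attempt)  # [40, 80, 80, 160, 160]

# Working backwards: to end at 40, the Reshape must start smaller.
print(40 // (2 * 2))      # 10, with two stride-2 layers
print(40 // (2 * 2 * 2))  # 5, with three stride-2 layers
```

Under that assumption, reshaping to 10x10 (or to 5x5 with three stride-2 layers) instead of 40x40 would land on 40 using only padding="same". The channel count and Dense width for such a variant are design choices the post does not specify.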

Edit

I tried the following solution:

latent_dim = 25
latent_inputs = keras.Input(shape=(latent_dim,))

x = layers.Dense(units=100, activation="relu")(latent_inputs)
x = layers.Dense(units=1024, activation="relu")(x)
x = layers.Dense(units=4096, activation="relu")(x)
x = layers.Reshape((4, 4, 256))(x)
x = layers.Conv2DTranspose(filters=256, kernel_size=3, activation="relu", strides=2, padding="same")(x)
x = layers.Conv2DTranspose(filters=128, kernel_size=3, activation="relu", strides=1, padding="same")(x)
x = layers.Conv2DTranspose(filters=128, kernel_size=3, activation="relu", strides=2, padding="valid")(x)
x = layers.Conv2DTranspose(filters=64, kernel_size=3, activation="relu", strides=1, padding="valid")(x)
x = layers.Conv2DTranspose(filters=64, kernel_size=3, activation="relu", strides=2, padding="valid")(x)
decoder_outputs = layers.Conv2DTranspose(filters=1, kernel_size=3, activation="sigmoid", padding="valid")(x)

decoder = keras.Model(latent_inputs, decoder_outputs, name="decoder")

decoder.summary()

Here some layers keep padding="same" and the others use padding="valid", but this is the output I get:

Model: "decoder"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 25)]              0         
                                                                 
 dense (Dense)               (None, 100)               2600      
                                                                 
 dense_1 (Dense)             (None, 1024)              103424    
                                                                 
 dense_2 (Dense)             (None, 4096)              4198400   
                                                                 
 reshape (Reshape)           (None, 4, 4, 256)         0         
                                                                 
 conv2d_transpose (Conv2DTr  (None, 8, 8, 256)         590080    
 anspose)                                                        
                                                                 
 conv2d_transpose_1 (Conv2D  (None, 8, 8, 128)         295040    
 Transpose)                                                      
                                                                 
 conv2d_transpose_2 (Conv2D  (None, 16, 16, 128)       147584    
 Transpose)                                                      
                                                                 
 conv2d_transpose_3 (Conv2D  (None, 18, 18, 64)        73792     
 Transpose)                                                      
                                                                 
 conv2d_transpose_4 (Conv2D  (None, 37, 37, 64)        36928     
 Transpose)                                                      
                                                                 
 conv2d_transpose_5 (Conv2D  (None, 39, 39, 1)         577       
 Transpose)

As you can see, the decoder_outputs shape is now (None, 39, 39, 1). I would like it to be (None, 40, 40, 1). How can I fix this?
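
The 39 can be reproduced by hand. With Keras defaults (no output_padding, dilation 1), padding="same" gives input * stride, while padding="valid" gives input * stride + max(kernel_size - stride, 0). The sketch below applies that rule to the last three layers, starting from the 16x16 feature map reported in the summary. The helper deconv_len is a hypothetical name written for illustration, not a Keras function.

```python
def deconv_len(size, kernel_size, stride, padding):
    """Output length of one Conv2DTranspose axis under Keras defaults."""
    if padding == "same":
        return size * stride
    if padding == "valid":
        return size * stride + max(kernel_size - stride, 0)
    raise ValueError(f"unsupported padding: {padding!r}")


size = 16  # conv2d_transpose_2 output as printed in the summary
for kernel_size, stride in [(3, 1), (3, 2), (3, 1)]:
    size = deconv_len(size, kernel_size, stride, "valid")
    print(size)  # 18, then 37, then 39
```

A layer with kernel_size=3, strides=1 and padding="valid" always adds 2, so from 37 the last layer can only reach 39. Reaching 40 requires a different kernel size or padding somewhere in the stack.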

python tensorflow keras deep-learning
1 Answer

I tried it like this:

x = layers.Dense(units=100, activation="relu")(latent_inputs)
x = layers.Dense(units=1024, activation="relu")(x)
x = layers.Dense(units=4096, activation="relu")(x)
x = layers.Reshape((4, 4, 256))(x)
x = layers.Conv2DTranspose(filters=256, kernel_size=3, activation="relu", strides=2, padding="same")(x)
x = layers.Conv2DTranspose(filters=128, kernel_size=3, activation="relu", strides=1, padding="same")(x)
x = layers.Conv2DTranspose(filters=128, kernel_size=3, activation="relu", strides=2, padding="valid")(x)
x = layers.Conv2DTranspose(filters=64, kernel_size=3, activation="relu", strides=1, padding="valid")(x)
x = layers.Conv2DTranspose(filters=64, kernel_size=3, activation="relu", strides=2, padding="valid")(x)
decoder_outputs = layers.Conv2DTranspose(filters=1, kernel_size=2, activation="sigmoid", padding="valid")(x)

That is, use padding="valid" for some layers and kernel_size=2 for the last layer.
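
As a check on this answer, the shape arithmetic can be traced through the whole stack. The sketch assumes Keras defaults (no output_padding, dilation 1) and follows the listing as written, so the third transposed convolution uses padding="valid" and yields 17 rather than the 16 printed in the question's summary. The names deconv_len and layers_spec are hypothetical and used only for this illustration.

```python
def deconv_len(size, kernel_size, stride, padding):
    """Output length of one Conv2DTranspose axis under Keras defaults."""
    if padding == "same":
        return size * stride
    if padding == "valid":
        return size * stride + max(kernel_size - stride, 0)
    raise ValueError(f"unsupported padding: {padding!r}")


# (kernel_size, stride, padding) for each Conv2DTranspose in the answer.
layers_spec = [
    (3, 2, "same"),
    (3, 1, "same"),
    (3, 2, "valid"),
    (3, 1, "valid"),
    (3, 2, "valid"),
    (2, 1, "valid"),  # final layer, kernel_size=2
]

sizes = [4]  # spatial size after Reshape((4, 4, 256))
for kernel_size, stride, padding in layers_spec:
    sizes.append(deconv_len(sizes[-1], kernel_size, stride, padding))

print(sizes)  # [4, 8, 8, 17, 19, 39, 40]
```

With kernel_size=2 and strides=1, the final padding="valid" layer adds 1 instead of 2, which is what turns 39 into 40.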
