Generating Images with CNN: Why is the Size Not Increasing in Each Layer?

In summary, the conversation discusses a problem with generating an image from a random vector using a model consisting of a fully connected layer and 5 deconvolution layers. The poster shares their model code and explains that the output size does not increase at each layer. They then derive the correct equation for the transposed-convolution output size and share updated code that solves the problem. The conversation ends with a note to write out acronyms explicitly for clarity in the future.
  • #1
BRN
Hello everybody,
I have this problem:
starting from a vector of 100 random values, I have to generate an image of size 128x128x3 using a model consisting of a fully connected layer followed by 5 deconvolution (Conv2DTranspose) layers.
This is my model:

Python:
import tensorflow as tf

def generator_model(noise_dim):

    n_layers = 5
    k_w, k_h = [8, 8]             # kernel size
    input_dim = (noise_dim,)
    i_w, i_h, i_d = [8, 8, 1024]  # starting feature map: width, height, depth
    strides = (1, 1)
    weight_initializer = None

    model = tf.keras.Sequential()

    # Project the noise vector to an 8 x 8 x 1024 feature map
    model.add(tf.keras.layers.Dense(i_w * i_h * i_d, input_shape=input_dim, kernel_initializer=weight_initializer))
    model.add(tf.keras.layers.BatchNormalization())
    model.add(tf.keras.layers.ReLU())

    model.add(tf.keras.layers.Reshape((i_w, i_h, i_d)))
    for i in range(n_layers - 1):
        print(k_w, k_h)  # debug: current kernel size
        model.add(tf.keras.layers.Conv2DTranspose(i_d, (k_w, k_h), strides, padding='same', use_bias=False))
        model.add(tf.keras.layers.BatchNormalization())
        model.add(tf.keras.layers.ReLU())
        i_d = int(i_d / 2)  # halve the filter count
        k_w = int(k_w * 2)  # double the kernel size
        k_h = int(k_h * 2)

    k_w = i_d
    k_h = i_d
    # Final layer: 3 output channels (RGB)
    model.add(tf.keras.layers.Conv2DTranspose(3, (k_w, k_h), strides, padding='same', use_bias=False))

    return model

Why do I always get an 8x8x3 image, with no increase in size at each layer?
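For reference, the final output shape can be checked like this (a quick sketch, assuming TensorFlow 2.x):

Python:
model = generator_model(100)
print(model.output_shape)  # (None, 8, 8, 3) instead of the expected (None, 128, 128, 3)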

Thanks!
 
  • #2
Ok, I have the solution.

The output size of a convolution layer ("conv") is given by the equation

$$o=\left\lfloor \frac{i - k + 2p}{s} \right\rfloor + 1,$$

where ##i## is the input size, ##k## the kernel size, ##p## the padding and ##s## the stride. But, as in my case, for a transposed convolution ("deconv") the output size is

$$o=s\left ( i - 1 \right ) + k - 2p$$
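In Keras, padding = 'same' on a Conv2DTranspose chooses ##p## so that the output is exactly ##o = s \cdot i##, meaning each stride-2 deconvolution doubles the spatial size. Four such layers take the ##8\times 8## starting map up to the target resolution:

$$8 \to 16 \to 32 \to 64 \to 128$$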

Then, with stride ##s=2##, the correct code is this:

Python:
import tensorflow as tf

def generator_model(noise_dim):

    n_layers = 4
    k_w, k_h = [16, 16]           # starting kernel size
    input_dim = (noise_dim,)
    i_w, i_h, i_d = [8, 8, 1024]  # starting feature map: width, height, depth
    strides = (2, 2)              # stride 2 doubles the spatial size at each deconv
    weight_initializer = None

    model = tf.keras.Sequential()

    # Project the noise vector to an 8 x 8 x 1024 feature map
    model.add(tf.keras.layers.Dense(i_w * i_h * i_d, input_shape=input_dim, kernel_initializer=weight_initializer))
    model.add(tf.keras.layers.BatchNormalization())
    model.add(tf.keras.layers.ReLU())

    model.add(tf.keras.layers.Reshape((i_w, i_h, i_d)))
    i_d = int(i_d / 2)
    for i in range(n_layers - 1):
        # print(i_d, k_w, k_h)  # debug output
        # Upsampling: 8x8 -> 16x16 -> 32x32 -> 64x64
        model.add(tf.keras.layers.Conv2DTranspose(i_d, (k_w, k_h), strides, padding='same', use_bias=False))
        model.add(tf.keras.layers.BatchNormalization())
        model.add(tf.keras.layers.ReLU())
        i_d = int(i_d / 2)  # halve the filter count
        k_w = int(k_w * 2)  # double the kernel size
        k_h = int(k_h * 2)

    # Final deconv: 64x64 -> 128x128 with 3 output channels (RGB)
    model.add(tf.keras.layers.Conv2DTranspose(3, (k_w, k_h), strides, padding='same', use_bias=False))

    return model
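A quick check of the result (sketch, assuming TensorFlow 2.x):

Python:
model = generator_model(100)
print(model.output_shape)  # (None, 128, 128, 3)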

And this solves my problem :smile:
 
  • #3
BRN said:
And this solves my problem :smile:
Great!
In the future, it would be helpful to readers to expand acronyms such as CNN, which might not be generally known. I'm assuming it has something to do with neural networks, but that's only a guess.
 
  • #4
Hello and thanks!
You're right: CNN stands for Convolutional Neural Network.
Next time I will write acronyms out explicitly :smile:
 

1. How does a Convolutional Neural Network (CNN) generate images?

When used for generation, as in this thread, a CNN typically starts with a fully connected layer that projects an input vector into a small spatial feature map, followed by a series of transposed convolution (deconvolution) layers that progressively upsample the feature map and refine its features until the final image size is reached.
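A minimal illustration of one upsampling step (hypothetical shapes, assuming TensorFlow/Keras):

Python:
import tensorflow as tf

x = tf.random.normal((1, 8, 8, 64))  # small feature map
# One transposed convolution with stride 2 doubles the spatial size
up = tf.keras.layers.Conv2DTranspose(32, (4, 4), strides=(2, 2), padding='same')(x)
print(up.shape)  # (1, 16, 16, 32)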

2. What is the difference between a traditional neural network and a CNN in terms of image generation?

A traditional neural network treats the input image as a flat vector of pixels, which can limit its ability to capture spatial relationships in the image. In contrast, a CNN preserves the spatial structure of the image by using convolutional layers, making it more effective for image generation tasks.
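A small sketch of the difference (hypothetical shapes, assuming TensorFlow/Keras):

Python:
import tensorflow as tf

img = tf.random.normal((1, 28, 28, 3))

# Treating the image as a flat vector discards the pixel grid
flat = tf.keras.layers.Flatten()(img)
print(flat.shape)  # (1, 2352)

# A convolution keeps the spatial layout intact
conv = tf.keras.layers.Conv2D(8, (3, 3), padding='same')(img)
print(conv.shape)  # (1, 28, 28, 8)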

3. How does a CNN learn to generate images?

A CNN learns to generate images through a process called backpropagation, where the network adjusts its parameters based on the error between the generated image and the target image. This process is repeated over many iterations until the network can accurately generate images.
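A minimal sketch of one such training step, reusing the generator_model defined above (the pixel-wise mean-squared-error loss is a placeholder assumption; generative models are usually trained with an adversarial or similar loss):

Python:
import tensorflow as tf

generator = generator_model(100)
optimizer = tf.keras.optimizers.Adam(1e-4)

noise = tf.random.normal((4, 100))           # batch of random input vectors
target = tf.random.normal((4, 128, 128, 3))  # placeholder target images

with tf.GradientTape() as tape:
    generated = generator(noise, training=True)
    # Error between the generated images and the targets
    loss = tf.reduce_mean(tf.square(generated - target))

# Backpropagation: adjust the generator's parameters to reduce the error
grads = tape.gradient(loss, generator.trainable_variables)
optimizer.apply_gradients(zip(grads, generator.trainable_variables))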

4. Can a CNN generate images of any type?

Yes, a CNN can generate images of any type as long as it is trained on a dataset that contains images of that type. For example, a CNN trained on images of cats can generate new images of cats, but it would not be able to generate images of dogs without additional training on a dataset of dog images.

5. Are there any limitations or challenges in generating images with CNNs?

One limitation of generating images with CNNs is that they require large amounts of training data to accurately generate new images. Additionally, generating highly complex or diverse images can be challenging for CNNs, as they may struggle to capture all the necessary features and details. This can lead to generated images that are blurry or lack certain characteristics.
