Python Generating Images with CNN: Why is the Size Not Increasing in Each Layer?

  • Thread starter: BRN
  • Tags: Images
AI Thread Summary
The discussion revolves around generating a 128x128x3 image from a vector of 100 random values using a generator model in TensorFlow. The initial issue was that the output consistently resulted in an 8x8x3 image due to incorrect configurations in the model's layers. The user identified that the output size of a transpose convolution (deconvolution) is determined by a specific formula, which was not applied correctly in the original model. The solution involved adjusting the number of layers and the kernel size, as well as changing the stride to 2, which allowed the model to produce the desired image size. The conversation also highlighted the importance of clarifying acronyms like CNN (Convolutional Neural Network) for better understanding among readers.
BRN
Hello everybody,
I have this problem:
starting from a vector of 100 random values, I have to generate an image of size 128x128x3 using a model consisting of a fully connected layer followed by 5 deconvolution (transpose convolution) layers.
This is my model:

Python:
import tensorflow as tf

def generator_model(noise_dim):

    n_layers = 5
    k_w, k_h = 8, 8
    input_dim = (noise_dim,)
    i_w, i_h, i_d = 8, 8, 1024  # starting spatial size and filters
    strides = (1, 1)
    weight_initializer = None

    model = tf.keras.Sequential()

    # project the noise vector to an 8 * 8 * 1024 tensor
    model.add(tf.keras.layers.Dense(i_w * i_h * i_d, input_shape=input_dim, kernel_initializer=weight_initializer))
    model.add(tf.keras.layers.BatchNormalization())
    model.add(tf.keras.layers.ReLU())

    model.add(tf.keras.layers.Reshape((i_w, i_h, i_d)))
    for i in range(n_layers - 1):
        print(k_w, k_h)
        model.add(tf.keras.layers.Conv2DTranspose(i_d, (k_w, k_h), strides, padding='same', use_bias=False))
        model.add(tf.keras.layers.BatchNormalization())
        model.add(tf.keras.layers.ReLU())
        i_d = int(i_d / 2)
        k_w = int(k_w * 2)
        k_h = int(k_h * 2)

    k_w = i_d
    k_h = i_d
    model.add(tf.keras.layers.Conv2DTranspose(3, (k_w, k_h), strides, padding='same', use_bias=False))

    return model

Why do I always get an 8x8x3 image, with no increase in size at any layer?

Thanks!
 
Ok, I have the solution.

The output size of a CNN convolution ("conv") layer is given by the equation

$$o=\left ( \frac{i - k + 2p}{s} \right )+1$$

but, as in my case, for a transpose convolution ("deconv") the output size is

$$o=s\left (i -1 \right )+ k - 2p$$

where ##i## is the input size, ##k## the kernel size, ##s## the stride, and ##p## the padding.
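These formulas can be checked with a few lines of plain Python (a sketch; the padding values below are illustrative choices, since with padding='same' Keras picks the padding for you):

```python
def conv_out(i, k, s, p):
    # convolution: o = (i - k + 2p) / s + 1
    return (i - k + 2 * p) // s + 1

def deconv_out(i, k, s, p):
    # transpose convolution: o = s * (i - 1) + k - 2p
    return s * (i - 1) + k - 2 * p

# With stride 1 and 'same'-style padding (p = (k - 1) / 2 for odd k),
# a transpose convolution leaves the size unchanged -- hence the 8x8 output:
print(deconv_out(8, 5, 1, 2))   # 8

# With stride 2 and suitable padding, each layer doubles the size:
print(deconv_out(8, 4, 2, 1))   # 16
print(deconv_out(16, 4, 2, 1))  # 32
```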

Then, with stride ##s=2##, the correct code is this

Python:
import tensorflow as tf

def generator_model(noise_dim):

    n_layers = 4
    k_w, k_h = 16, 16  # starting kernel size
    input_dim = (noise_dim,)
    i_w, i_h, i_d = 8, 8, 1024  # starting spatial size and filters
    strides = (2, 2)
    weight_initializer = None

    model = tf.keras.Sequential()

    # project the noise vector to an 8 * 8 * 1024 tensor
    model.add(tf.keras.layers.Dense(i_w * i_h * i_d, input_shape=input_dim, kernel_initializer=weight_initializer))
    model.add(tf.keras.layers.BatchNormalization())
    model.add(tf.keras.layers.ReLU())

    model.add(tf.keras.layers.Reshape((i_w, i_h, i_d)))
    i_d = int(i_d / 2)
    # each stride-2 transpose convolution doubles the spatial size
    for i in range(n_layers - 1):
        model.add(tf.keras.layers.Conv2DTranspose(i_d, (k_w, k_h), strides, padding='same', use_bias=False))
        model.add(tf.keras.layers.BatchNormalization())
        model.add(tf.keras.layers.ReLU())
        i_d = int(i_d / 2)
        k_w = int(k_w * 2)
        k_h = int(k_h * 2)

    # final layer maps to 3 channels: 64x64 -> 128x128x3
    model.add(tf.keras.layers.Conv2DTranspose(3, (k_w, k_h), strides, padding='same', use_bias=False))

    return model

And this solves my problem :smile:
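As a quick sanity check (pure arithmetic, no TensorFlow needed): with padding='same' and stride 2, a Keras Conv2DTranspose layer simply doubles the spatial size, so the four transpose layers take the 8x8 tensor up to 128x128:

```python
size = 8  # spatial size after the Dense + Reshape
n_transpose_layers = 4  # 3 in the loop plus the final output layer
for _ in range(n_transpose_layers):
    size *= 2  # stride-2 'same' transpose convolution doubles the size
print(size)  # 128
```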
 
BRN said:
And this solves my problem :smile:
Great!
In the future, it would be helpful to readers to expand acronyms such as CNN, which might not be generally known. I'm assuming it has something to do with neural networks, but that's only a guess.
 
Hello and thanks!
You're right, CNN stands for Convolutional Neural Network.
Next time I will write the acronyms out explicitly :smile:
 