Understanding the Basics of FFT for Image Processing

AI Thread Summary
The discussion focuses on the application of Fast Fourier Transform (FFT) in image processing, particularly regarding the sampling rate and handling images that are not sized to 2^N. It highlights that pixels are treated as individual samples to compute spatial frequencies, not temporal ones. Concerns are raised about zero-padding potentially adding noise, and alternatives like wrapping the image or using window functions are suggested to mitigate this. The conversation also clarifies that Fourier transforms can apply to both time and spatial domains, emphasizing the need for proper conversion of 2D images into 1D arrays for FFT processing. Overall, the complexities of FFT implementation in image processing are underscored, particularly in relation to pixel dimensions and sampling rates.
btb4198
Messages
570
Reaction score
10
if you watch this Video :
The Julia Programming Language
at time marker 28:00, You will see that Grant takes an FFT of an Image.
In order to do a FFT, you need to know the sampling rate, but what is the sampling rate of one image?
And what if your image is not size 2^N ? Does the program just pad with zero? but that would add noise right ?
He conveniently did not explain any of this, I am very disappointed in MIT Right now.
 
Technology news on Phys.org
btb4198 said:
In order to do a FFT, you need to know the sampling rate, but what is the sampling rate of one image?
When you take the FT of an image, you treat the pixels as individual samples.
You are computing spatial frequencies, not temporal frequencies.

btb4198 said:
And what if your image is not size 2^N ? Does the program just pad with zero? but that would add noise right ?
You could pad with zeros or wrap the image around, before multiplying by a window function. The window function removes the high frequency noise due to the edge of the image.
There are also FFT algorithms that factorise the pixel dimensions, then do different radix FFTs for those factors. But they usually take longer to generate the result than an image cut down to 2n, or padded out to 2n.
 
  • Like
Likes .Scott and FactChecker
Baluncore said:
When you take the FT of an image, you treat the pixels as individual samples.
You are computing spatial frequencies, not temporal frequencies.You could pad with zeros or wrap the image around, before multiplying by a window function. The window function removes the high frequency noise due to the edge of the image.
There are also FFT algorithms that factorise the pixel dimensions, then do different radix FFTs for those factors. But they usually take longer to generate the result than an image cut down to 2n, or padded out to 2n.

Baluncore thanks.
question, so the sampling frequency Fs would be 1hz then ?
 
btb4198 said:
so the sampling frequency Fs would be 1hz then ?
No, one pixel. What is the image being projected on? You can use units of pixel size in mm or um or whatever the relevant spatial sampling rate is.
 
btb4198 said:
You will see that Grant takes an FFT of an Image.
No, Grant is doing a Fourier Transform, not a Fast Fourier Transform (FFT).
 
btb4198 said:
And what if your image is not size 2^N ?
I don't believe that any of the images were of size 2^N pixels. The image of the cat was 500 X 399 pixels.
btb4198 said:
He conveniently did not explain any of this, I am very disappointed in MIT Right now.
This was a 30+ minute lecture. You can't expect him to put in all the details in such a short timeframe.
 
btb4198 said:
Baluncore thanks.
question, so the sampling frequency Fs would be 1hz then ?
Balucore another question,

A FFT is done on a 1D-array, like a graph or a Spectrum.
So would convert the 2-D image into a 1D-array by doing :
(this is pseudocode)
image[0 ,0] = image [0] -> image[width, height] = image[size] where size is = to width * height

and then run FFT get corresponding frequencies and then place them back into the image in there correct location with respect to original indices ?

I guess that is what Julia is doing...

Baluncore said:
You could pad with zeros or wrap the image around, before multiplying by a window function. The window function removes the high frequency noise due to the edge of the image.
When you say this, do you mean, if Size is not equate to 2^N you add image[0,0] to image [size +1] untel image.lenght == 2^N?
or did I miss understand that ?
 
berkeman said:
No, one pixel. What is the image being projected on? You can use units of pixel size in mm or um or whatever the relevant spatial sampling rate is.
I believe Grant is just using pixels. Fs has to be in Hz right ? and 1/1 = 1Hz. I am not saying this right ?
Fs
fs = 1/T Hz
 
btb4198 said:
Fs has to be in Hz right ?
No! Fourier transforms apply to either time domain or spatial domain samples.
 
  • #10
berkeman said:
No! Fourier transforms apply to either time domain or spatial domain samples.
Sorry, I did not know
 
  • #11
Mark44 said:
I don't believe that any of the images were of size 2^N pixels. The image of the cat was 500 X 399 pixels.

This was a 30+ minute lecture. You can't expect him to put in all the details in such a short timeframe.
Oh, I have been trying to watch all the videos in this playlist and so far they never explain this.
 
  • #12
btb4198 said:
A FFT is done on a 1D-array, like a graph or a Spectrum.
So would convert the 2-D image into a 1D-array by doing :
The 2D transform is done by, say;
Replace each row of pixels with its FFT spatial frequency coefficients.
Then replace each column of coefficients with its FFT.
You then have 2D spatial frequencies.
Multiply the 2D spatial freq image by a coefficient mask to convolve or filter the picture.
Inverse transform the columns, then the rows.
Look at the modified image.
 
  • Like
Likes pbuk
Back
Top