# Autofocusing system for Control Systems Project determining defocus

## Main Question or Discussion Point

Hi

For a control systems project this semester, my friends and I are working on a closed-loop image focusing system. A computer program acquires an image of a black-and-white checkerboard through a webcam. The amount of defocus (blur) in the image is then measured and quantified as a scalar, which is sent (via a parallel port interface) to a stepper motor controller. The stepper motor rotates the focusing knob of the webcam, changing the focal length until the captured image is properly focused.

So far, we have made the parallel port interface, the stepper motor controller and a fairly decent mounting arrangement. We have also theoretically analyzed the images captured by the camera. However, we are not able to get a very good estimator of the blur or defocus in the image.

The first approach we used was to convert all the images to grayscale and use a Sobel Edge detection filter on them. Then, we generate a scalar $x_{i}$ for each image $i = 1,\ldots,n$, given by

$$x_{i} = \sum_{j = 1}^{j_{max}}|E_{ij}|^2$$

where $E_{ij}$ is the $j$-th entry of the edge vector returned for image $i$ by the edge detector (in MATLAB).

We assume that this scalar is a good estimator of the sharpness of the image: if the scalar has a large value, the blur is presumably low, and vice versa.

This approach worked for a bunch of images, and we were able to plot the values to confirm that the scalar attained its maximum for the best focused image. But for another bunch of images, the scalar peaked at an image that was clearly not the best focused image.
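For concreteness, here is a minimal sketch of the gradient-energy metric in plain Python (the project itself uses MATLAB's edge detector; the Sobel kernels and the tiny synthetic images below are illustrative only):

```python
# Sketch: sum of squared Sobel gradient magnitudes as the sharpness scalar x_i.
# "image" is a small synthetic grayscale array (list of lists); in the project
# it would be a frame grabbed from the webcam.

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sharpness(image):
    """Return the sum over interior pixels of |gradient|^2."""
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = sum(SOBEL_X[i][j] * image[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * image[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            total += gx * gx + gy * gy
    return total

# A crisp step edge scores higher than a blurred ramp across the same region.
sharp   = [[0, 0, 255, 255]] * 4
blurred = [[0, 85, 170, 255]] * 4
assert sharpness(sharp) > sharpness(blurred)
```

The metric is maximized when intensity transitions are steepest, which is why a peak at a visibly defocused image (as described above) is surprising.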

The other approach we thought of was to inject Gaussian Blur into the images and then do a correlation analysis of the images to determine what values of mean and variance for the Gaussian estimate the "actual" blur nicely. Then we would generate a scalar based on these parameters to actuate the motor suitably.

While our hardware work is mostly completed, we're struggling to come up with a simple solution to the algorithm. Any suggestions and advice would be greatly appreciated! We have less than a week for our presentation :-(


NoTime
Homework Helper
What is the distinction between the images your idea worked with and the images it did not?
A link to the images with an explanation of what happened might help.

In general it is not possible for all objects in a scene to be in focus at the same time.
A pertinent parameter here is depth of field.
You need to limit the focus computation to a small segment of the viewable scene with the expectation that the remainder will not be in focus.

Hi NoTime, thanks for your reply. Ok, I will try and upload the images here. Meanwhile, the problem is to focus the lens so that the most focused image of the checkerboard is captured. The scene does not consist of anything else. We are focusing on a small region of interest on the checkerboard and we want to be able to mechanically change the focal length so as to reduce the blur.

The distinction is that for two images A and B, the scalar I described in my first post peaks for image A, whereas image B is more focused. This is contrary to the known mathematical property of a gradient: it should be high at sharp intensity changes. A blurred image (A) has a more gradual change in intensity across the edge between neighboring black and white squares.

The problem is...how do we characterize this defocus and determine a scalar $x$ such that some function $f(x)$ is the actuation signal for the motor?

Since the checkerboard is just a matrix of alternating white and black boxes, we're assuming that the scene is isotropic, i.e. each portion of the scene is equally focused (or defocused). I understand your point that all objects in the scene cannot be simultaneously in focus, but since the idea here is to demonstrate closed-loop correction of defocus, we're trying to keep it simple for ourselves.

NoTime
Homework Helper
You confused me with "images".
If I understand now, you only have one object the checkerboard, the images are scans of this object with different focus settings.
Is this correct?

Your problem may be the nature of a digitized scan.
For a perfect focus scan you are going to end up with a series of pixels at the junction of the light/dark squares that are half illuminated.
So the scan values might be 255, 128 and 0 for three adjoining pixels with perfect focus.
Does that help?
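NoTime's point can be illustrated with a small sketch (the pixel model here, averaging the ideal scene intensity over each unit-width pixel, is an assumption for illustration):

```python
# Sketch: even a perfectly focused black/white edge produces an intermediate
# pixel value when the boundary falls inside a pixel, because the sensor
# averages the scene intensity over the pixel area.

def sample_edge(edge_pos, n_pixels=6, white=255):
    """Sample a step edge (white left of edge_pos, black right) into unit-width
    pixels, returning the average intensity of each pixel."""
    values = []
    for p in range(n_pixels):
        left, right = p, p + 1
        overlap = max(0.0, min(right, edge_pos) - left)  # white portion of pixel
        white_fraction = max(0.0, min(1.0, overlap / (right - left)))
        values.append(round(white * white_fraction))
    return values

print(sample_edge(3.0))  # edge exactly on a pixel boundary: clean 255 -> 0 step
print(sample_edge(3.5))  # edge mid-pixel: one half-illuminated pixel appears
```

So even the "perfect focus" scan contains a half-gray pixel whenever the square boundary does not align with the pixel grid, which puts a floor on any sharpness metric.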

Do you have a multiple-element lens, where moving one element changes the focal length?

This may be helpful. Check the resolution of the pixels, FOV of interest etc
http://www.edmundoptics.com/TechSupport/DisplayArticle.cfm?articleid=288 [Broken]

We're using a Logitech Quickcam Express web camera. There's a rubber knob which changes the focal length, but I am not sure how exactly the focal length of the combination is changed.

By "images", I meant a sequence of images of the checkerboard. Our job is to determine the defocus and "tell" the motor how much to rotate the knob to get a better (focused) image.

The images of the checkerboard alternate black-white-black-white...

Are you changing the size of the checkerboard pattern across the image sequence? Or when you say an image is focused, does the center look focused while the sides look a bit blurred? If you could post some images, it would help.

Here are two images of the checkerboard. Picture 41 is better focused than 31. This is the kind of output I want the system to generate. Doesn't have to be the best, but an improvement would be nice.

#### Attachments

- Picture 31 (checkerboard image)
- Picture 41 (checkerboard image)
What are the main differences between the two figures? Did you change the distance from your camera to the object?

NoTime
Homework Helper
I'm curious what you think of the link edmondng supplied.
Basically, it's a more advanced explanation of the topic I mentioned.

> What are the main differences between the two figures? Did you change the distance from your camera to the object?
The distance between the object and the camera was not changed. Only the focusing knob was adjusted.

> I'm curious what you think of the link edmondng supplied. Basically, it's a more advanced explanation of the topic I mentioned.
Yes, I am going through it. This image sums up what we want to be able to achieve with the software program: http://www.edmundoptics.com/imagelib/techsup/ResCont1final.gif [Broken].

So here's what I've done so far...

The system acquires about 900-1000 images of the checkerboard at different focal length settings. The histogram of a focused checkerboard should have two peaks, corresponding to black and white, and no peaks at any other gray level. (ideal_checkerboard.jpg)

For each image, a smooth curve is interpolated through the histogram and plotted. The area under this curve should (intuitively) be an estimator of the degree of defocus. We compute the area under each such curve and plot it versus the image number. The value of $i$ at which the area for image $i$ is minimum corresponds to the most focused image.

But I am not sure if this is a good estimator.

Any other suggestions/ideas are welcome.
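One way to make the bimodality idea concrete is to measure the histogram mass in the mid-gray band, which blur fills in; a small Python sketch (the band limits 32 and 224 are arbitrary illustrative choices, not from the project):

```python
# Sketch: for a focused checkerboard the gray-level histogram is bimodal
# (peaks at black and white), so the fraction of pixels in the mid-gray band
# is a candidate defocus scalar -- smaller should mean better focused.

def histogram(pixels, levels=256):
    counts = [0] * levels
    for v in pixels:
        counts[v] += 1
    return counts

def midband_mass(pixels, lo=32, hi=224):
    """Fraction of pixels whose gray level lies strictly between lo and hi."""
    counts = histogram(pixels)
    return sum(counts[lo + 1:hi]) / len(pixels)

focused   = [0] * 50 + [255] * 50                # crisp black/white only
defocused = [0] * 30 + [128] * 40 + [255] * 30   # blur fills in mid-grays
assert midband_mass(focused) < midband_mass(defocused)
```

Note that the area under the raw histogram itself is always the total pixel count, so any area-based estimator has to single out a sub-band like this rather than integrate the whole curve.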

#### Attachments

- ideal_checkerboard.jpg
chroot
Staff Emeritus
Gold Member
Normal focusing for astronomical subjects is done by simply turning the focus until an image with the brightest (numerical value) pixel is obtained.

- Warren

Warren, we have to develop a computer program to control a closed-loop autofocusing system. Starting from a relatively blurred/defocused image, a stepper motor has to turn the focusing knob of the camera until a fairly focused image of the object is obtained (after possible overshoots).

Could someone please confirm whether the following reasoning is correct:

For a perfectly focused image of a checkerboard, we have two dominant frequency components in the frequency domain. As the image becomes blurred, the Fourier Transform has contributions from frequencies between these two major frequencies, with the additional contributions arising due to defocus.

Thus, by computing the signal energy (or more precisely the power, which by Parseval's theorem is just the sum of squared magnitudes of the Fourier coefficients), we can get a scalar that is representative of the focus.

But the problem lies in the interpretation.

Suppose I have a set of N grayscale images and I compute the signal power for each of these images. Thus I have a vector of size N which stores the signal power of the N images. Now, if an image is perfectly focused, it has exactly two frequency components and thus in the transform, there are two impulses at those frequencies and the transform is zero everywhere else.

Should an "almost focused" or "nearly focused" image have a high signal power or a low signal power?
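The Parseval computation described above can be sketched in plain Python (a naive O(N^2) DFT on a 1-D intensity profile for illustration; the project would presumably use MATLAB's fft2 on the whole image):

```python
# Sketch: the signal power can be computed either directly from the pixel
# values or from the DFT coefficients -- Parseval's theorem says the two agree.

import cmath

def dft(x):
    """Naive discrete Fourier transform of a real sequence."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def power_spatial(x):
    return sum(v * v for v in x)

def power_spectral(x):
    X = dft(x)
    return sum(abs(c) ** 2 for c in X) / len(x)  # Parseval: note the 1/N factor

profile = [0, 0, 1, 1, 0, 0, 1, 1]  # one row of an ideal checkerboard
assert abs(power_spatial(profile) - power_spectral(profile)) < 1e-9
```

One caveat worth checking in the interpretation: for this ideal binary profile the spatial power is fixed by how many pixels are white, so the total power alone may not separate focused from blurred frames; the *distribution* of power across frequencies may matter more than its sum.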

chroot
Staff Emeritus
Gold Member
You can certainly use a frequency-based approach or a histogram-based approach. The histogram-based approach is simpler. The more focused the image, the larger the count of pixels near the minimum and maximum of the histogram's range. That should be monotonic and thus appropriate as a control signal.

- Warren
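A minimal sketch of Warren's suggestion, assuming an illustrative threshold of 32 gray levels from each end (the threshold is not from the thread):

```python
# Sketch: fraction of pixels near the black and white extremes of the
# histogram as the focus control signal -- it grows as focus improves.

def extreme_fraction(pixels, margin=32, levels=256):
    """Fraction of pixels within `margin` gray levels of either extreme."""
    near_ends = sum(1 for v in pixels if v < margin or v >= levels - margin)
    return near_ends / len(pixels)

# As blur spreads edge pixels into mid-gray values, the fraction drops.
focused   = [0] * 48 + [255] * 48 + [128] * 4
defocused = [0] * 30 + [64] * 20 + [192] * 20 + [255] * 30
assert extreme_fraction(focused) > extreme_fraction(defocused)
```

Because this scalar should rise monotonically as focus improves, a simple control law such as stepping the motor while the value keeps increasing (and reversing on a decrease) could use it directly as $f(x)$.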

Are these two methods equivalent? And can we say that color (or shade of grayscale) and frequency are related?
