Category : Science and Education
Archive   : EYESIG.ZIP
Filename : VISION.DOC


A Quick Introduction to Computer Vision

Loren Heiny
Robots, Etc
P.O. Box 122
Tempe, AZ 85280

(A version of this article first appeared in the Winter
1989 issue of Robot Review, Vol 2, No. 1, pp. 1-5.,
950 Trout Brook Drive, West Hartford, CT 06119.)

Vision is amazing. When you sit back and think about how

much detail and information is almost instantly available to

us as we gaze around, you begin to appreciate the real

complexity and amazing power of our vision system.

As a robot experimenter, I look forward to the day when

my robot can see what I can. Unfortunately, we are far from

this goal. Part of the problem is the lightning-fast

computing speed that is needed, but what it really comes

down to is that we simply don't know yet how to build such a

vision system.

Despite this, researchers have developed volumes of

computer vision algorithms, ideas, and strategies. Some are

even useful. In this article we'll take a quick tour of two

computer vision techniques which are often used to implement

vision systems that recognize objects -- region analysis and

edge detection. These two techniques represent only a small

portion of the vision field, but they will introduce you to

some of the basic concepts involved in computer vision so that

you can begin thinking about a vision system for your own

computer or robot.

What's in an Image?

A computer image is made up of (usually) tens of thousands

of tiny elements, called pixels, which are much like the tiny

dots that make up the picture on your TV set. In computer

vision analysis, it is up to the computer to scan through

these thousands of pixels and figure out *what objects are where*.

Unfortunately, each pixel, taken by itself, does not

really provide much information. In fact, all it

indicates is the average intensity of light at its location

in the scene. Converting these intensity values into real-

world objects is no easy chore!
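To make this concrete, here is a tiny sketch in Python; the 4x4 image and its intensity values are made up for illustration. No single pixel says much, but the pattern across pixels does.

```python
# A hypothetical 4x4 grayscale image: each pixel is just the average
# light intensity at its location (0 = black, 255 = white).
image = [
    [ 12,  10, 200, 210],
    [ 11,  13, 205, 198],
    [  9,  12, 202, 207],
    [ 10,  11, 199, 203],
]

# One pixel by itself tells us almost nothing...
print(image[0][2])                       # just a number: 200

# ...but the pattern across pixels hints at structure: the left
# half is dark, the right half is bright -- probably two regions.
left_avg = sum(row[c] for row in image for c in (0, 1)) / 8
right_avg = sum(row[c] for row in image for c in (2, 3)) / 8
print(left_avg, right_avg)               # 11.0 203.0
```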

Object Recognition

Recognizing objects is an extremely important part of

computer vision. There are two techniques commonly used --

region analysis and edge detection. Although both are used

in recognizing objects, they each go about it from two

seemingly opposite directions. Region analysis, on the one

hand, attempts to recognize objects by grouping regions of

like pixels and then trying to match these regions with the

shapes and sizes of known objects. In contrast, edge

detection attempts to recognize objects by locating their

edges, looking for unlike pixels, and then basing its

recognition upon the contours of these edge points. Neither

technique is foolproof, but each has found acceptance in

many areas. Let's begin by looking at region analysis more closely.


Region Analysis

Region analysis, in principle, is very simple -- regions of

like pixels are grouped together to form objects. Usually,

an image is scanned from top to bottom, comparing each pixel

intensity with its neighbors. If adjacent pixels are

similar, then they are assumed to belong to the same object,

otherwise the pixels are assumed to be located on different

objects or part of the background.

Probably the most successful form of region analysis

is called binary region analysis. Here, images of objects

lying on a highly contrasting background are first

converted so that they only contain two intensity values (or

gray-levels as they are more commonly called) -- black and

white. This process is called thresholding because all

intensity values above a particular value, the threshold,

are set to one gray level (white, for instance) and all

others to a second value (in this case, black). As long as

all objects contrast with their background, they will

become one intensity and the background the other.
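As a rough sketch of the idea (not EyeSight's actual code), thresholding can be written in a few lines of Python; the image and threshold value below are invented for illustration.

```python
def threshold(image, t):
    """Convert a gray-level image (rows of 0-255 intensities) into a
    binary image: pixels above the threshold t become white (255),
    all others become black (0)."""
    return [[255 if p > t else 0 for p in row] for row in image]

# A made-up 3x3 image: a bright object against a dark background.
image = [
    [ 40,  45, 220],
    [ 42, 230, 225],
    [ 38,  41,  44],
]
binary = threshold(image, 128)
print(binary)   # [[0, 0, 255], [0, 255, 255], [0, 0, 0]]
```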

Once a binary image has been generated, it is a matter of

grouping all connected pixels that are similar into regions

so that they can be matched with known objects. The matching

process is usually based on such things as the area,

"color," perimeter, and number of holes of the regions.

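The grouping step is essentially what is known as connected-component labeling. The Python sketch below shows one common way to do it, a flood fill over 4-connected neighbors, and computes only the simplest statistic (area); it is an illustration of the technique, not EyeSight's implementation.

```python
from collections import deque

def label_regions(binary):
    """Group 4-connected pixels of equal value into regions.
    Returns a label image and the area of each region."""
    h, w = len(binary), len(binary[0])
    labels = [[None] * w for _ in range(h)]
    areas = {}
    next_label = 0
    for r in range(h):
        for c in range(w):
            if labels[r][c] is not None:
                continue                      # already part of a region
            value = binary[r][c]
            labels[r][c] = next_label
            queue = deque([(r, c)])
            area = 0
            while queue:                      # flood fill this region
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and labels[ny][nx] is None
                            and binary[ny][nx] == value):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            areas[next_label] = area
            next_label += 1
    return labels, areas

# A made-up binary image: one bright object on a dark background.
binary = [[0, 0, 255],
          [0, 255, 255],
          [0, 0, 0]]
labels, areas = label_regions(binary)
print(areas)   # two regions: background (area 6) and object (area 3)
```

Real systems accumulate more statistics per region (perimeter, centroid, hole count) during the same scan, then compare them against the known objects.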
To make the technique work well, however, several

conditions usually must be met. First, there must be good

contrast between the objects and the background. Second, in

order to maintain this contrast, the lighting often must

be known or at least controlled. And third,

recognition of objects usually requires that they are always

viewed from the same orientation and perspective.

Although each of these conditions can become very

restrictive, they aren't always a problem. For instance, on

assembly lines with overhead cameras and controlled

lighting, binary region analysis can be very reliable and effective.


One of the reasons binary imagery is used is because it

is simple to implement. Therefore, it is worthwhile to try

to find some uses for it in our household robots. One

possibility is to use binary region analysis to recognize

highly contrasting objects on a wall. For instance, your

robot might be able to recognize a dark door handle on a

white door by recognizing its circular contour and general

height above the floor. Another possibility is to "read"

special black-and-white markers that you have placed around

the house so your robot can see where it is.


EyeSight includes a binary region analysis function, called

Contour, which is located in the Exec pull-down menu. To run

this routine select the Contour command as outlined in the

EyeSight manual. It will display a slide-bar from which you

can select a threshold value which will be used to convert

the current image to binary form. (Use the left and right

arrow keys to select a threshold value and then press the

Enter key to accept it and begin processing.) Once the

threshold is chosen the binary version of the image will be

displayed followed by the outlines of the regions that were

found in the image.

You can examine several of the statistics accumulated

on each region by pressing the ALT-D key combination. This

will cause a pop-up window to appear with the region

information for the region currently marked by a crosshair.

Use the up and down arrow keys to sequence through the

various regions in the image. Press ESC when you are done

and want to continue with EyeSight.


Edge Detection

There are many computer vision programs that base

their analysis not on the regions found in an image but

rather on its edges. The basic idea is to scan the image

looking for significant changes in neighboring pixels.

This approach assumes that changes in adjacent pixels

will occur at the edges of objects. Therefore, once these

edge points are located, they can be chained together

and used to recognize what object they come from. Once

again, this is no easy chore.
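As a bare-bones illustration of "looking for significant changes in neighboring pixels," this Python sketch scans a single row of intensities and flags any jump larger than a chosen amount; the row of values and the change limit are invented for the example.

```python
def row_edges(row, min_change):
    """Scan one row of pixel intensities and report the positions
    where adjacent pixels differ by at least min_change -- these are
    candidate edge points."""
    return [i for i in range(1, len(row))
            if abs(row[i] - row[i - 1]) >= min_change]

# A made-up scan line: dark, then a bright object, then dark again.
row = [10, 12, 11, 200, 205, 40, 38]
print(row_edges(row, 50))   # edges at the two jumps: [3, 5]
```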

First, it is generally not easy to locate all of the

object edges in an image. In addition, it turns out that

many image edge points may not correspond to actual object

edges at all. For instance, shadows, texture, and lighting

variations can all create strong edges in an image that

obviously do not correspond to any real object edges.

Secondly, it is often very difficult to link edge points

correctly. Many times not all the edge points of an

object are strong enough to be detected. Therefore the

contour of the object is broken. Consequently, it is often

necessary to link small chains of non-connecting

edge points together to produce a complete chain of the

object's contour. This can be difficult because it is often

a challenge to decide if two chains should really be

connected or not.



EyeSight includes three edge detectors, each of which is

named after its developer: Roberts, Sobel, and Prewitt.

Each edge detector uses a slightly different method to compare

neighboring pixels, although they all generate an edge

gradient image which indicates how strong edges are at

each location in an image.
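For the curious, here is a textbook sketch of the Sobel operator in plain Python; the other detectors differ mainly in the neighborhood weights they use. This is an illustration of the technique, not EyeSight's code, and the test image is invented.

```python
def sobel_gradient(image):
    """Approximate edge strength at each interior pixel with the
    Sobel operator: two weighted sums estimate the horizontal and
    vertical intensity changes, combined into a gradient magnitude."""
    h, w = len(image), len(image[0])
    grad = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # gx responds to vertical edges, gy to horizontal ones
            gx = (image[y-1][x+1] + 2*image[y][x+1] + image[y+1][x+1]) \
               - (image[y-1][x-1] + 2*image[y][x-1] + image[y+1][x-1])
            gy = (image[y+1][x-1] + 2*image[y+1][x] + image[y+1][x+1]) \
               - (image[y-1][x-1] + 2*image[y-1][x] + image[y-1][x+1])
            grad[y][x] = (gx * gx + gy * gy) ** 0.5
    return grad

# A vertical step edge: dark left half, bright right half.
img = [[0, 0, 255, 255] for _ in range(4)]
grad = sobel_gradient(img)
print(grad[1][1], grad[2][2])   # strong response along the step
```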

The Roberts edge detector is one of EyeSight's standard

functions and appears under the Exec pull-down menu. The Sobel

and Prewitt edge detectors, however, must be added to the

Exec menu before they can be used. This is accomplished by

selecting them from the Add Function command contained in

the Option menu. (Refer to the EyeSight manual for more

information on how to do this.)

Each of the edge detectors, when selected, will display

a gradient image for the current image in EyeSight as

described in the EyeSight manual. In general, the stronger

the edges are in the gradient image, the brighter the pixels

will be. You may want to compare the results of each of the edge

detectors as described in the section EXPERIMENTING WITH EYESIGHT

in the EyeSight manual.


Line Finding

One common extension to edge detection is to combine

chains of edge points into lines and then base the image

analysis on these lines. This process, called line finding,

is quite attractive because it often simplifies object

recognition since there are far fewer lines in an image to

match or identify than edge points. In addition, line

finding often leads to fewer mismatches and faster recognition.
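A full line finder is beyond a short example, but its core step, fitting a line to a chain of edge points, can be sketched with an ordinary least-squares fit. The edge points below are invented for illustration.

```python
def fit_line(points):
    """Fit y = m*x + b to a chain of (x, y) edge points by least
    squares. Assumes the chain is not vertical -- fine for a sketch."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - m * sx) / n
    return m, b

# Edge points that lie on the line y = 2x + 1.
chain = [(0, 1), (1, 3), (2, 5), (3, 7)]
print(fit_line(chain))   # -> (2.0, 1.0)
```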




EyeSight includes the Burns line finding algorithm that

finds lines in an image by grouping regions of like *edge*

pixels and fitting lines to these regions. The Burns function

is located under the Exec pull-down menu. To find lines in

an image all you need to do is select the Burns command

(assuming you've already read in an image to the EyeSight

environment). Try experimenting with several different images

and note the number and quality of the lines found in each.

Alternatively, try guessing what is in the original image

based on the lines you see on the screen. This can be a

real challenge!


Other Techniques and Closing Comments

So far we have briefly discussed some of the issues involved

in object recognition, but there are many more computer

vision techniques that have been developed over the years.

For example, researchers have used texture, shading, color,

focus, modeling, multiple image analysis (motion and

stereo), light striping, and much, much more to attempt to

interpret images.

However, despite all of this effort, no one can yet

claim to have a vision system that can even remotely rival

our own. This is a challenge yet to be met. Of course,

it does need to be said that a robot (or your computer) does

not necessarily have to be able to see the same way we do.

For example, it's probably not desirable to intentionally

program our robots to experience the same optical illusions

that we do--or is this unavoidable in a vision system that

equals our own?
