# Category : Science and Education

Archive : SORT-OUT.ZIP

Filename : PAPER.ASC

Henning Hansen and Niels Jrgen Christensen,

Technical University of Denmark

Abstract

We present a Forth program for analysis and visualization of sorting

algorithms. The program includes a comprehensive collection of implemented

sorting algorithms, and two new algorithms has been developed. Trade-offs

in the selection of sorting algorithms are illustrated with an example from

computer graphics.

Introduction

Sorting is a classical discipline in computer science, and the study of

sorting algorithms and implementations illustrates different approaches and

strategies for solving a well defined problem. Knowledge of sorting

algorithms may provide inspiration for problem solving in other branches of

computer science, as well as be the basis for selecting the right method for

a specific task. At the end of the paper, this point will be illustrated

with an example from computer graphics, which is the field from where this

work originated.

The paper describes a program, that has been developed for the purpose of

analysis and visualization of sorting algorithms. The program serves a

number of purposes, some of which are described in the following sections.

The program is placed in the public domain.

Algorithm library

About every well known general purpose sorting algorithm has been

implemented as part of the program. It may therefore serve as a source from

which to select working implementations of sorting algorithms, that are in

choice for an application. This makes it easy to try out a number of

different algorithms in the application before deciding which one

to actually use.

All of the algorithms are implemented, so that they sort data in-place, and

only interact with the data through two operations: compare two items to

test which is the greater one, and exchange two items to change the order

of the data. Some algorithms make use of an optional move operation for

better performance, but this can easily be implemented in terms of exchange

operations. These operations make it very easy to interface the sorting

program to new data representations.

Sorting analysis

The algorithms can be tested for a selected number of data, and for

different types of initial order, made up with a random number generator.

Statistics for the number of compare, exchange and move operations are

calculated. One measure of the "sortedness" of data is the number of

inversions in the file, and the program has a procedure, that generates

data that approximates any percentage of inversions selected by the user.

The data generation procedure can also define data with multiple key

values (more items with the same key value).

These methods allow you to test, whether the algorithms perform as they are

theoretically expected to do for differently ordered data, and to see which

one will make a good choice for a possible application, if data are known to

have a certain distribution.

Development of new algorithms

It is very easy to add a new algorithm to the program, and the program will

serve as a good environment for debugging and testing of the implementation.

The simplest way to test a sorting algorithm is by sorting a string into

ascii order, while printing out information that illustrates the execution

of compare and exchange operations, and lists the actual string content

during the sorting process. The program lets you input a string of your own

choice, and either single step through the sort or let it run to an end.

Every compare, exchange or move operation performed on the data is

automatically traced on the screen, since tracing procedures as well as

single stepping are embedded the operations. If the word TRACE is placed

inside a major loop of the program, it will show the contents of the string

every time it is executed. This is the only modification of the

implementation needed to fully support the integration of the sorting

algorithm into the program. This type of analysis is very good for

debugging of a new sorting implementation, because you can follow every

step of the algorithm, and choose examples that test special cases.

The user will find two new-developed sorting algorithms with the program,

called Stacksort (Forth programmers are expected to like that name) and

Splicesort. Stacksort is a recursive variation of Selectionsort, and

Splicesort is based on ideas from Mergesort and Shellsort.

Unlike in other languages, recursion has no execution overhead in Forth.

This makes Stacksort especially well suited for implementation in Forth, and

it is probably the fastest algorithm for sorting almost presorted data in

Forth. The main problem is the depth of recursive calls, which may overflow

either the data or return stack. For almost sorted data, which is the

best-case for Stacksort, the recursion depth comes close to the number of

data elements.

Splicesort works much like Mergesort, except that it creates sorted

subfiles, that are interweaved in a way similar to a Shellsort with

increments equal to the powers of two. Two variations of Splicesort can be

constructed by using local insertion or selection, but implementation gets

rather complicated.

We have not focused on optimization, but some algorithms may be more

efficient than their statistically behaviour indicates, because the code

is simpler compared to others. For example, if some data can be sorted with

either Stacksort or Splicesort with about the same number of operations,

Stacksort will probably run faster because its code is simpler. However, if

implemented in a language that does not allow recursion without much

execution overhead, this may not be true.

Visualization of sorting

When your sorting algorithm has been succesfully implemented, its behaviour

can best be studied and explained by the use of graphics. While the ascii

order of a string of characters does not show any special graphics effect,

it is very easy to tell if a series of colours are ordered into spectral

order. The whole sorting process can be illustrated in one picture by a

sequence of rows, that shows the order of the colours at different times

during the process.

A variation of this visialization technique is to show only colours, when

they are compared or exchanged, so the picture will show which part of the

data are accessed by the algorithm at different stages of the process.

The third type of graphics that are provided by the program, shows a series

of columns, while they are interactively being sorted by height. This does

not depend on colours, and is well suited for black and white illustrations.

A special feature of the column sort is, that columns of equal height can

be colored from left to right before sorting. If the sorting algorithm is

stable, the order of the colours will be maintained within all sets of

columns of equal height. The column sort can be stopped for inspection

at any stage with the single stepping facility, which is also active

during the graphics visualizations.

Example from computer graphics

To illustrate some trade-offs in selecting a sorting algorithm for a real

application, we have chosen an example from computer graphics. Algorithms

for hidden line elimination from the visualization of a three-dimensional

scene often make use of sorted lists of polygons. The scene itself consists

of an unordered (ie. randomly ordered) set of polygons, and during hidden

line elimination an algorithm may proceed back to front through space

overpainting polygons by closer ones, or front to back while clipping

polygons against closer visible ones. In both cases, two sorted lists of

polygons will be useful: one list of the polygons sorted with respect to

the distance from the eyepoint to the nearest point on the polygon, and

another list of the same polygons sorted with respect to the distance to

their furthest point. When proceeding front to back through space, the

hidden line algorithm can now maintain a list of polygons at the current

depth simply by adding new polygons in the order of the first list, when

passing their nearest point, and deleting polygons in the order of the

second list, when the furthest point is reached, and vice versa for back to

front processing of the polygons. Our problem is which sorting algorithms

to choose for creation of the two lists.

The first thing to consider is which operation is slower, comparing two

polygons or interchanging them. Since we do not want to move polygons

around in space, the items to be sorted are pointers to the polygons, and

therefore exchange operations are fast. However, when two polygons need be

compared, the values of their minimal/maximal distance from the eye must be

accessed and compared. Since these values will often be real numbers,

comparing two polygons will be significantly slower than exchanging two

polygon pointers. This excludes all sorting algorithms based on selection,

while good choices would be Quicksort, Shellsort or Splicesort. If the lists

are represented as linked lists, a move operation will be fast, and Binary

Insertionsort or Mergesort are better.

The above considerations were for randomly sorted data. However, when one

of the two lists has been calculated, the other will probably be very much

identical, especially if the sizes of the polygons are small compared to the

distances between them. So if the second list is produced from the already

made first list, we can take advantage of the fact, that the polygons are

almost presorted. Straight Insertionsort will perform very well in this

case, and if the depth of recursion is acceptable, try out Stacksort. Do

not use Quicksort.

Conclusion

The program has proved an efficient workbench for the development and

testing of the sorting algorithms described in this paper. It is well suited

for self study, and the program has been used as a supplementary tool for

teaching about sorting methods at the Technical University of Denmark.

A longer version of this paper, including source code for the sorting

algorithms, will be submitted to the Journal of Forth Applications and

Research. The paper is based on work done under a grant from the Danish

National Research Counsil.

Bibliography

Sorting algorithms (except Splicesort and Stacksort) are described in the

following books:

D. E. Knuth, The Art of Computer Programming, volume 3: Sorting and

Searching, Adddison-Wesley, 1975.

H. Lorin, Sorting and Sort Systems, Adddison-Wesley, 1975.

N. Wirth, Algorithms + Data Structures = Programs, Prentice-Hall 1976.

R. Sedgewick, Algorithms, Adddison-Wesley, 1988.

Very nice! Thank you for this wonderful archive. I wonder why I found it only now. Long live the BBS file archives!

This is so awesome! 😀 I’d be cool if you could download an entire archive of this at once, though.

But one thing that puzzles me is the “mtswslnkmcjklsdlsbdmMICROSOFT” string. There is an article about it here. It is definitely worth a read: http://www.os2museum.com/wp/mtswslnk/