Dec 052017
 
A little math ... primes via the sieve algorithm.
File PSIEVE.ZIP from The Programmer’s Corner in
Category Recently Uploaded Files
A little math … primes via the sieve algorithm.
File Name File Size Zip Size Zip Type
INCR.H 26958 2780 deflated
ISPRIME.C 26309 5674 deflated
ISPRIME.EXE 10238 7421 deflated
PSIEVE.C 14377 4592 deflated
PSIEVE.DOC 7310 3027 deflated
PSIEVE.EXE 15266 8783 deflated
TPCREAD.ME 199 165 deflated

Download File PSIEVE.ZIP Here

Contents of the PSIEVE.DOC file


Psieve, a prime number sifter.
by John M. Gamble
email to: [email protected]

Copyright 1991 by John M. Gamble

Throughout this document, exponentiation is symobolized by "**". "Psieve"
may be pronounced any way you like.

psieve [many options] [upto]

The value of defaults to 2**16 - 1. It may be any number large
enough to be contained in a 32 bit unsigned integer.

-bCreate a single binary file made up of the bitmaps used to
perform the sieve. The name of the binary file is
"primes.map".

-cSplit the text files by multiples of . The
default chapter size is 65536 (2**16).

-dInstead of printing out prime numbers, print out the
differences. That is, instead of printing out 2, 3, 5, 7,
et cetera, print 1, 2, 2, 4, 2, ...

-fNumbers are printed out in fields wide. The default
size is 8.

-iPrint out an index file, listing the number of primes in
each chapter, and a running total. The index file name is
"primes.idx".

-mA debugging aid. Allocate the memory needed, then exit.

-sOnly sift with prime numbers less than .

-tReport the time in seconds it took to complete the sift.

-vVerbose reporting. Tell the version number, the map style
with which the bits are being stored, the size of the maps,
and (the most time consuming) the prime number being sifted.

-wNumbers are printed out up to columns. The default
size is 80.

-xPrint out the numbers in hexadecimal.

The text files of prime numbers are named 00000000.ptx, 00000001.ptx, et
cetera. Files of prime number differences have a ".pdf" extention, instead
of ".ptx". Each file is considered to be a separate "chapter".



Why another prime number sieve?

No, this is not best described with "Those who do not learn from history are
condemned to repeat it." I was well aware of history, but I also like to
screw around with numbers. I also needed more prime numbers than most sieve
programs were about to give me.

The standard Sieve of Erastosthenes program has an integer array, all
elements of which have been set to zero. The program loops through the
array, setting the elements that are multiples of the prime number in
question. It may print out the prime number while it is doing so, or it may
print the primes out in a separate loop.

The problem with using such arrays is that one runs out of computer memory
relatively quickly. Fortunately, the programmer's bag of tricks has a
solution, known as the bitmap. Using a bitmap gets us an eight to one
compression over a simple array of integers.

Since calloc() can only allocate as much memory at a time as specified by
it's signed integer argument (which on many machines is 16 bits, or 32675),
we will still need to allocate an array of bitmaps. But even with this
limitation, memory is used far more efficiently.

Is there any way we can improve upon this? The bitmap is organized as in
this diagram:

____________________________________________________________________________
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 0 | 1 | 2 | 3 | 4 |
----------------------------------------------------------------------------
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14...

In a pure sieve all numbers are checked for their primality, but in the
interests of time it is usual to skip the even numbers. But if we are going
to skip even numbers, then we don't have to store them.

____________________________________________________________________________
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 0 | 1 | 2 | 3 | 4 |
----------------------------------------------------------------------------
1 3 5 7 9 11 13 15 17 19 21 23 25 27 29...

Using this scheme, we get twice the storage capacity as before, and a faster
sieve, since we don't have to sift out 2. The mapping is simple - the
number tested is divided by 2 (actually, shifted down one) - and fast. The
time spent performing the mapping is more than made up for by not having to
sift 2. Obviously, only odd numbers are tested.

This is good, but can we do better? Surprisingly, yes. I performed a
similar experiment with 3, and found that the mapping scheme works there
too.

____________________________________________________________________________
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 0 | 1 | 2 | 3 | 4 |
----------------------------------------------------------------------------
1 5 7 11 13 17 19 23 25 29 31 35 37 41 43...

Obviously you cannot test numbers divisible by 2 or 3, so these numbers must
be skipped. Fortunately, this is trivial. Starting with 5, you increment
by 2, then by 4, then by 2, and so forth. The mapping is also simple, since
it merely requires an integer division by 3.

The next logical step would be to try all this with 5, and it is there that
the whole thing breaks down. Sadly, a number pair that differ only by 2 can
map to the same element, such as 47 and 49, which blows the mapping scheme.

It would be nice to claim that this sprung full-blown from my head, but that
would be a lie. Bitmaps were involved from the beginning, as were the
sequence of increments that skipped multiples of 2 and 3, but compressing
multiples of 3 from the map didn't occur to me until much later.

Sequences to skip testing multiples of higher primes exist (see the file
incr.h), but being miserly about memory, I decided not to trade the speed
gain for the memory loss, no matter how small. This may change with future
enhancements, particularly with memory swapping.

Future enhancments:

o At the moment, all sifting is done in memory, which is limiting. A method
of swapping to disk should be in the works.

o A better, or an additional, compression method should be found.

o Use of an array of increments to skip multiples of 5, 7, 11, etc. could be
used (see incr.h).

o New options to the program, or new programs, when thought up.


Compile lines:

Turbo C++ v1.0:tcc -mc -I\tc\include -DMAPSTYLE=3 psieve.c

Psieve.c is written in ANSI C. The program was started on a Macintosh,
using Lightspeed C, but wound up on a PC and compiled with Turbo C. The
source code assumes that a "short int" is 16 bits, and a "long int" is 32.
So far this has held true on Macs, PCs, and VAXen, but you never know what
the next generation of machines will bring.

The preprocessor constant MAPSTYLE may be set to 1, 2, or 3, which
correspond respectively to the bitmaps with no compression, even number
compression, or 2-3 factor compression. Without the -D option on the
command line, MAPSTYLE defaults to 3. Unless you need to generate a special
bitmap file, this should be the method you compile with. MAPSTYLE = 3 is
both faster and more efficient in memory space. On a 4.77 mhz PC with a V20
chip, for example, the command "psieve -t -v 655360" brought in times of
2138, 854, and 659 seconds for MAPSTYLES of 1, 2, and 3.

A bitmap file, generated with the -b option of psieve, was used in the
creation of isprime.c


 December 5, 2017  Add comments

Leave a Reply