Dec 052017

A little math … primes via the sieve algorithm. | |||
---|---|---|---|

File Name | File Size | Zip Size | Zip Type |

INCR.H | 26958 | 2780 | deflated |

ISPRIME.C | 26309 | 5674 | deflated |

ISPRIME.EXE | 10238 | 7421 | deflated |

PSIEVE.C | 14377 | 4592 | deflated |

PSIEVE.DOC | 7310 | 3027 | deflated |

PSIEVE.EXE | 15266 | 8783 | deflated |

TPCREAD.ME | 199 | 165 | deflated |

# Download File PSIEVE.ZIP Here

## Contents of the PSIEVE.DOC file

Psieve, a prime number sifter.

by John M. Gamble

email to: [email protected]

Copyright 1991 by John M. Gamble

Throughout this document, exponentiation is symobolized by "**". "Psieve"

may be pronounced any way you like.

psieve [many options] [upto]

The value of defaults to 2**16 - 1. It may be any number large

enough to be contained in a 32 bit unsigned integer.

-bCreate a single binary file made up of the bitmaps used to

perform the sieve. The name of the binary file is

"primes.map".

-cSplit the text files by multiples of . The

default chapter size is 65536 (2**16).

-dInstead of printing out prime numbers, print out the

differences. That is, instead of printing out 2, 3, 5, 7,

et cetera, print 1, 2, 2, 4, 2, ...

-fNumbers are printed out in fields wide. The default

size is 8.

-iPrint out an index file, listing the number of primes in

each chapter, and a running total. The index file name is

"primes.idx".

-mA debugging aid. Allocate the memory needed, then exit.

-sOnly sift with prime numbers less than .

-tReport the time in seconds it took to complete the sift.

-vVerbose reporting. Tell the version number, the map style

with which the bits are being stored, the size of the maps,

and (the most time consuming) the prime number being sifted.

-wNumbers are printed out up to columns. The default

size is 80.

-xPrint out the numbers in hexadecimal.

The text files of prime numbers are named 00000000.ptx, 00000001.ptx, et

cetera. Files of prime number differences have a ".pdf" extention, instead

of ".ptx". Each file is considered to be a separate "chapter".

Why another prime number sieve?

No, this is not best described with "Those who do not learn from history are

condemned to repeat it." I was well aware of history, but I also like to

screw around with numbers. I also needed more prime numbers than most sieve

programs were about to give me.

The standard Sieve of Erastosthenes program has an integer array, all

elements of which have been set to zero. The program loops through the

array, setting the elements that are multiples of the prime number in

question. It may print out the prime number while it is doing so, or it may

print the primes out in a separate loop.

The problem with using such arrays is that one runs out of computer memory

relatively quickly. Fortunately, the programmer's bag of tricks has a

solution, known as the bitmap. Using a bitmap gets us an eight to one

compression over a simple array of integers.

Since calloc() can only allocate as much memory at a time as specified by

it's signed integer argument (which on many machines is 16 bits, or 32675),

we will still need to allocate an array of bitmaps. But even with this

limitation, memory is used far more efficiently.

Is there any way we can improve upon this? The bitmap is organized as in

this diagram:

____________________________________________________________________________

| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 0 | 1 | 2 | 3 | 4 |

----------------------------------------------------------------------------

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14...

In a pure sieve all numbers are checked for their primality, but in the

interests of time it is usual to skip the even numbers. But if we are going

to skip even numbers, then we don't have to store them.

____________________________________________________________________________

| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 0 | 1 | 2 | 3 | 4 |

----------------------------------------------------------------------------

1 3 5 7 9 11 13 15 17 19 21 23 25 27 29...

Using this scheme, we get twice the storage capacity as before, and a faster

sieve, since we don't have to sift out 2. The mapping is simple - the

number tested is divided by 2 (actually, shifted down one) - and fast. The

time spent performing the mapping is more than made up for by not having to

sift 2. Obviously, only odd numbers are tested.

This is good, but can we do better? Surprisingly, yes. I performed a

similar experiment with 3, and found that the mapping scheme works there

too.

____________________________________________________________________________

| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 0 | 1 | 2 | 3 | 4 |

----------------------------------------------------------------------------

1 5 7 11 13 17 19 23 25 29 31 35 37 41 43...

Obviously you cannot test numbers divisible by 2 or 3, so these numbers must

be skipped. Fortunately, this is trivial. Starting with 5, you increment

by 2, then by 4, then by 2, and so forth. The mapping is also simple, since

it merely requires an integer division by 3.

The next logical step would be to try all this with 5, and it is there that

the whole thing breaks down. Sadly, a number pair that differ only by 2 can

map to the same element, such as 47 and 49, which blows the mapping scheme.

It would be nice to claim that this sprung full-blown from my head, but that

would be a lie. Bitmaps were involved from the beginning, as were the

sequence of increments that skipped multiples of 2 and 3, but compressing

multiples of 3 from the map didn't occur to me until much later.

Sequences to skip testing multiples of higher primes exist (see the file

incr.h), but being miserly about memory, I decided not to trade the speed

gain for the memory loss, no matter how small. This may change with future

enhancements, particularly with memory swapping.

Future enhancments:

o At the moment, all sifting is done in memory, which is limiting. A method

of swapping to disk should be in the works.

o A better, or an additional, compression method should be found.

o Use of an array of increments to skip multiples of 5, 7, 11, etc. could be

used (see incr.h).

o New options to the program, or new programs, when thought up.

Compile lines:

Turbo C++ v1.0:tcc -mc -I\tc\include -DMAPSTYLE=3 psieve.c

Psieve.c is written in ANSI C. The program was started on a Macintosh,

using Lightspeed C, but wound up on a PC and compiled with Turbo C. The

source code assumes that a "short int" is 16 bits, and a "long int" is 32.

So far this has held true on Macs, PCs, and VAXen, but you never know what

the next generation of machines will bring.

The preprocessor constant MAPSTYLE may be set to 1, 2, or 3, which

correspond respectively to the bitmaps with no compression, even number

compression, or 2-3 factor compression. Without the -D option on the

command line, MAPSTYLE defaults to 3. Unless you need to generate a special

bitmap file, this should be the method you compile with. MAPSTYLE = 3 is

both faster and more efficient in memory space. On a 4.77 mhz PC with a V20

chip, for example, the command "psieve -t -v 655360" brought in times of

2138, 854, and 659 seconds for MAPSTYLES of 1, 2, and 3.

A bitmap file, generated with the -b option of psieve, was used in the

creation of isprime.c

by John M. Gamble

email to: [email protected]

Copyright 1991 by John M. Gamble

Throughout this document, exponentiation is symobolized by "**". "Psieve"

may be pronounced any way you like.

psieve [many options] [upto]

The value of

enough to be contained in a 32 bit unsigned integer.

-bCreate a single binary file made up of the bitmaps used to

perform the sieve. The name of the binary file is

"primes.map".

-c

default chapter size is 65536 (2**16).

-dInstead of printing out prime numbers, print out the

differences. That is, instead of printing out 2, 3, 5, 7,

et cetera, print 1, 2, 2, 4, 2, ...

-f

size is 8.

-iPrint out an index file, listing the number of primes in

each chapter, and a running total. The index file name is

"primes.idx".

-mA debugging aid. Allocate the memory needed, then exit.

-s

-tReport the time in seconds it took to complete the sift.

-vVerbose reporting. Tell the version number, the map style

with which the bits are being stored, the size of the maps,

and (the most time consuming) the prime number being sifted.

-w

size is 80.

-xPrint out the numbers in hexadecimal.

The text files of prime numbers are named 00000000.ptx, 00000001.ptx, et

cetera. Files of prime number differences have a ".pdf" extention, instead

of ".ptx". Each file is considered to be a separate "chapter".

Why another prime number sieve?

No, this is not best described with "Those who do not learn from history are

condemned to repeat it." I was well aware of history, but I also like to

screw around with numbers. I also needed more prime numbers than most sieve

programs were about to give me.

The standard Sieve of Erastosthenes program has an integer array, all

elements of which have been set to zero. The program loops through the

array, setting the elements that are multiples of the prime number in

question. It may print out the prime number while it is doing so, or it may

print the primes out in a separate loop.

The problem with using such arrays is that one runs out of computer memory

relatively quickly. Fortunately, the programmer's bag of tricks has a

solution, known as the bitmap. Using a bitmap gets us an eight to one

compression over a simple array of integers.

Since calloc() can only allocate as much memory at a time as specified by

it's signed integer argument (which on many machines is 16 bits, or 32675),

we will still need to allocate an array of bitmaps. But even with this

limitation, memory is used far more efficiently.

Is there any way we can improve upon this? The bitmap is organized as in

this diagram:

____________________________________________________________________________

| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 0 | 1 | 2 | 3 | 4 |

----------------------------------------------------------------------------

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14...

In a pure sieve all numbers are checked for their primality, but in the

interests of time it is usual to skip the even numbers. But if we are going

to skip even numbers, then we don't have to store them.

____________________________________________________________________________

| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 0 | 1 | 2 | 3 | 4 |

----------------------------------------------------------------------------

1 3 5 7 9 11 13 15 17 19 21 23 25 27 29...

Using this scheme, we get twice the storage capacity as before, and a faster

sieve, since we don't have to sift out 2. The mapping is simple - the

number tested is divided by 2 (actually, shifted down one) - and fast. The

time spent performing the mapping is more than made up for by not having to

sift 2. Obviously, only odd numbers are tested.

This is good, but can we do better? Surprisingly, yes. I performed a

similar experiment with 3, and found that the mapping scheme works there

too.

____________________________________________________________________________

| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 0 | 1 | 2 | 3 | 4 |

----------------------------------------------------------------------------

1 5 7 11 13 17 19 23 25 29 31 35 37 41 43...

Obviously you cannot test numbers divisible by 2 or 3, so these numbers must

be skipped. Fortunately, this is trivial. Starting with 5, you increment

by 2, then by 4, then by 2, and so forth. The mapping is also simple, since

it merely requires an integer division by 3.

The next logical step would be to try all this with 5, and it is there that

the whole thing breaks down. Sadly, a number pair that differ only by 2 can

map to the same element, such as 47 and 49, which blows the mapping scheme.

It would be nice to claim that this sprung full-blown from my head, but that

would be a lie. Bitmaps were involved from the beginning, as were the

sequence of increments that skipped multiples of 2 and 3, but compressing

multiples of 3 from the map didn't occur to me until much later.

Sequences to skip testing multiples of higher primes exist (see the file

incr.h), but being miserly about memory, I decided not to trade the speed

gain for the memory loss, no matter how small. This may change with future

enhancements, particularly with memory swapping.

Future enhancments:

o At the moment, all sifting is done in memory, which is limiting. A method

of swapping to disk should be in the works.

o A better, or an additional, compression method should be found.

o Use of an array of increments to skip multiples of 5, 7, 11, etc. could be

used (see incr.h).

o New options to the program, or new programs, when thought up.

Compile lines:

Turbo C++ v1.0:tcc -mc -I\tc\include -DMAPSTYLE=3 psieve.c

Psieve.c is written in ANSI C. The program was started on a Macintosh,

using Lightspeed C, but wound up on a PC and compiled with Turbo C. The

source code assumes that a "short int" is 16 bits, and a "long int" is 32.

So far this has held true on Macs, PCs, and VAXen, but you never know what

the next generation of machines will bring.

The preprocessor constant MAPSTYLE may be set to 1, 2, or 3, which

correspond respectively to the bitmaps with no compression, even number

compression, or 2-3 factor compression. Without the -D option on the

command line, MAPSTYLE defaults to 3. Unless you need to generate a special

bitmap file, this should be the method you compile with. MAPSTYLE = 3 is

both faster and more efficient in memory space. On a 4.77 mhz PC with a V20

chip, for example, the command "psieve -t -v 655360" brought in times of

2138, 854, and 659 seconds for MAPSTYLES of 1, 2, and 3.

A bitmap file, generated with the -b option of psieve, was used in the

creation of isprime.c

December 5, 2017
Add comments