## Contents of the CURVE.DOC file

CURVE FITTER

version 1.1

by

David C. Young

For use with MSDOS computers.

Copyright 1991 David C. Young

PROGRAM FUNCTION:

This program finds both coefficients and exponents for a curve of best

fit by the least squares definition of best fit.

The program contains functions for creating and revising its data

input files. There is also a function titled "Display any ASCII file", which

can be used to display the output files.

Each data field is read from a data file. Each data point in the file

is accompanied by a key piece of information, which uniquely identifies that

data point.

The program will read in any number of data files and only use those

data points containing information for every field.
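
The way keys select usable data points can be sketched in a few lines of Python. This is only an illustration of the idea, not the program's own code: the dictionaries stand in for data files, and the names `stock`, `gnp`, `tea` and `complete_rows` are made up for the example.

```python
# Hypothetical in-memory stand-ins for three keyed data files.
# Each dict maps a key (here a month) to one data value.
stock = {"1990-01": 10.0, "1990-02": 11.0, "1990-03": 12.5}
gnp = {"1990-01": 5.0, "1990-02": 5.1, "1990-03": 5.3}
tea = {"1990-01": 2.0, "1990-03": 2.2}  # February is missing


def complete_rows(*fields):
    """Return (key, values) pairs for keys that appear in every field."""
    common = set(fields[0])
    for field in fields[1:]:
        common &= set(field)
    return [(key, tuple(field[key] for field in fields))
            for key in sorted(common)]


rows = complete_rows(stock, gnp, tea)
for key, values in rows:
    print(key, values)
```

Only January and March survive, because February has no tea price; a fit would then use just those two data points.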

STARTING THE PROGRAM:

Start the program by changing to the directory or drive containing the

program and typing CURVE.

The program can be installed on a hard drive by using the DOS copy

command to copy all of the files to the desired location on the hard drive.

USING THE PROGRAM:

The basic process of using this program is:

1. Create data files. The data in the data files is keyed.

An example of using keys: suppose you want to find an equation to

predict the price of your favorite stock. You might enter various

items that could be used as predictors, such as Gross National Product or

the price of tea in China. Each item goes in its own file, keyed by the

date that the value is for. Thus you would have a file with the price of

tea in China during various months, with the month as the key value. The

price of your stock would also be in a file keyed by month. Once you had

several files, all keyed by month, you might guess that the price of your

stock varies in the same way as the GNP times the price of tea in China.

If you don't have all of the necessary data (due to irregular

delivery of the Beijing Times), the program will still work. It will

take into account only those months for which you have a complete set of

data, without bothering you with the details.

2. Type in an equation. For the example above, the equation

might be S=A*G*T+B or S=A*G^C+B*T^D+E where S is the stock price, G is

the GNP, T is the price of tea in China and A through E are numbers that

you want the computer to calculate.

3. Specify which letters represent known data points to be

read from data files. S, G & T in our example are read from files.

C & D are computed exponents. A, B & E are computed coefficients.

4. Input file names for summary and analysis files.

A summary file contains the stuff put on screen at the end of the calculation,

the values for computed numbers and the average & maximum deviations (measures

of how well the equation fits the data). An analysis file contains the

key values, actual values (for your stock) and computed values. Analysis

files can be read into many popular spreadsheets and graphing programs,

so that you can graph the data to better see how well the calculation

works.

5. For computed exponents, input starting values and the

number of decimal places to calculate.

The equation to be input consists of single upper or lower case

letters to represent both known pieces of data and numbers to be calculated,

along with an equals sign "=" and four mathematical operators:

"+" - addition

"*" - multiplication

"/" - division

"^" - exponentiation

Some examples of valid equations are:

a = b * c + d

a * b^c + d = e

a = b * C^d * E^e + f * g^h + i

A=B*C^D+E*F^G+H*I^J+K*L^M+N
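
The equation rules above can be approximated by a short check. This is a rough sketch based only on the description and examples in this file; the real program may apply further rules, and the name `is_valid_equation` is made up for the example.

```python
import re

# One side of an equation: single letters separated by one of + * / ^
SIDE = re.compile(r"[A-Za-z](?:[+*/^][A-Za-z])*")


def is_valid_equation(text):
    """Check text against the equation rules described above."""
    compact = "".join(text.split())  # the examples allow spaces
    sides = compact.split("=")
    if len(sides) != 2:  # exactly one equals sign is expected
        return False
    return all(SIDE.fullmatch(side) for side in sides)


for equation in ["a = b * c + d", "a * b^c + d = e", "a = (b + c) * d", "a = 2 * b"]:
    print(equation, "->", is_valid_equation(equation))
```

The last two equations are rejected because they use parentheses and a numerical constant, neither of which this version accepts.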

DATA FILE FORMAT:

The data file is an ASCII file consisting of a header and up to 65000

key and data values.

The numbers identifying what type of key and data values are present

are as follows:

1 - string

2 - real

3 - character

4 - integer

The key field can be of any of these types. The data field can be

used for curve fitting only if it is of type real or integer. Integer data

fields are treated by curve fitter as real values.
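
The type numbers and the rule about which data fields can be fitted might be modeled as follows. This is a sketch for illustration; `FieldType`, `usable_for_fitting` and `as_fit_value` are hypothetical names, and nothing here reads the actual file header.

```python
from enum import IntEnum


class FieldType(IntEnum):
    """Type numbers as listed above."""
    STRING = 1
    REAL = 2
    CHARACTER = 3
    INTEGER = 4


def usable_for_fitting(data_type):
    """Only real and integer data fields can be used for curve fitting."""
    return FieldType(data_type) in (FieldType.REAL, FieldType.INTEGER)


def as_fit_value(raw, data_type):
    """Convert a raw text value to a float; integers are treated as reals."""
    if not usable_for_fitting(data_type):
        raise ValueError(f"type {FieldType(data_type).name} cannot be fitted")
    return float(raw)


print(as_fit_value("7", FieldType.INTEGER))
print(usable_for_fitting(FieldType.STRING))
```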

The data file format is:

.

.

.

SUMMARY FILE FORMAT:

The summary file is an ASCII file that contains exactly the same

information that is displayed on the screen at the end of a calculation.

The summary file format is:

Average Deviation =

Maximum Deviation =

Known data points

Known

(same as line above for each data file used)

Computed Exponents (if any)

Exponent

(same as line above for each exponent computed)

Computed Coefficients

Coefficients

(same as line above for each coefficient computed)

ANALYSIS FILE FORMAT:

The analysis file is an ASCII file containing information for

comparing the known values to the calculated values for the property being

modeled.

The analysis file format is:

.

.

.

LIMITATIONS OF THE PROGRAM:

Some known deficiencies with the program that I hope to get around to

fixing in the future are:

1. Parentheses are not allowed in the equations.

2. Constants are not allowed, although you can create a data

file with the same number for every data point if necessary.

3. The list of functions which are not supported is massive.

It starts with the trigonometric functions.

4. The exponents that are found represent local minima

only, so pick your initial values wisely, or try a few that you think might

be in the right range.

However, to the best of my knowledge, what this program does do,

it does correctly.

ABOUT CURVE FITTING:

The program generally finds the coefficients of best fit for each term

in an equation and finds the exponent of best fit for any variable desired.

The exponents are found through a multivariable simplex minimization routine.

The coefficients are found at every step of the way through the matrix

algebra least squares method (mathematically equivalent to linear regression).
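
The two-level scheme described above can be sketched in plain Python. This is not the program's source: it assumes the simplex routine is of the Nelder-Mead kind, it solves the least squares step through the normal equations, and every function name is made up for the example. It also assumes positive data values, since a negative number raised to a fractional exponent is not a real number.

```python
def solve(matrix, rhs):
    """Solve a small square linear system by Gaussian elimination."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def coefficients_and_sse(exponents, variables, y):
    """For fixed exponents, fit y = sum(coef_i * var_i^exp_i) + constant."""
    columns = [[v ** e for v in var] for var, e in zip(variables, exponents)]
    columns.append([1.0] * len(y))  # column for the constant term
    k = len(columns)
    # Normal equations (X^T X) b = X^T y, the matrix form of least squares.
    xtx = [[sum(p * q for p, q in zip(columns[i], columns[j]))
            for j in range(k)] for i in range(k)]
    xty = [sum(p * t for p, t in zip(columns[i], y)) for i in range(k)]
    coefs = solve(xtx, xty)
    sse = sum((sum(c * col[r] for c, col in zip(coefs, columns)) - y[r]) ** 2
              for r in range(len(y)))
    return coefs, sse


def nelder_mead(f, start, step=0.5, iterations=200):
    """Minimal Nelder-Mead simplex search; returns the best point found."""
    n = len(start)
    simplex = [list(start)] + [
        [s + (step if i == j else 0.0) for j, s in enumerate(start)]
        for i in range(n)]
    for _ in range(iterations):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]
        reflect = [c + (c - w) for c, w in zip(centroid, worst)]
        if f(reflect) < f(best):
            expand = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            simplex[-1] = expand if f(expand) < f(reflect) else reflect
        elif f(reflect) < f(simplex[-2]):
            simplex[-1] = reflect
        else:
            contract = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            if f(contract) < f(worst):
                simplex[-1] = contract
            else:  # shrink every point toward the best one
                simplex = [best] + [
                    [b + 0.5 * (p - b) for b, p in zip(best, point)]
                    for point in simplex[1:]]
    return min(simplex, key=f)


def fit(variables, y, start_exponents):
    """Outer simplex search over exponents, inner least squares for coefficients."""
    def objective(exps):
        return coefficients_and_sse(exps, variables, y)[1]
    exps = nelder_mead(objective, start_exponents)
    coefs, sse = coefficients_and_sse(exps, variables, y)
    return exps, coefs, sse


# Made-up data generated from y = 3 * x^2 + 1, fitted as y = a * x^c + b.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [3.0 * v ** 2 + 1.0 for v in x]
exps, coefs, sse = fit([x], y, [1.0])
print("exponent:", exps, "coefficients:", coefs, "sum of squares:", sse)
```

Because the coefficients are re-solved for every trial set of exponents, the simplex only has to search over the exponents. The result still depends on the starting exponent, which is why a few different starting values are worth trying.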

If your theory shows that an equation should have a particular form,

it is best to work with that form. However, if you don't know what form to

use and want to fit your data by a brute force method, here are some

suggestions:

1. Have the program find a coefficient for every term in the

equation.

2. Have the program find the constant term.

3. The most powerful generic fit is one in which every

term consists of a fitted coefficient and a single variable with a fitted

exponent, and then the constant term is added on. This is often the best

fit because the most parameters are being fitted.

4. Remember that you can fit anything if you are fitting more

parameters than there are items in your data set; however, this fit may be

useless when applied to new data points falling between or past the original

data points. For best results always make sure that you have considerably

more data points than the number of parameters that are being fitted,

otherwise you may be fooling yourself.
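
The warning in suggestion 4 can be seen with a small experiment. The sketch below uses a polynomial as a stand-in for an equation with as many fitted parameters as data points (the program itself does not fit polynomials this way), and the data are made up: a straight line y = 2x + 1 with a small alternating error added.

```python
def lagrange(xs, ys, x):
    """Evaluate the polynomial that passes through every (xs, ys) point."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def line_fit(xs, ys):
    """Ordinary least squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
noise = [0.3, -0.3, 0.3, -0.3, 0.3, -0.3]
ys = [2.0 * x + 1.0 + e for x, e in zip(xs, noise)]

slope, intercept = line_fit(xs, ys)
line_prediction = slope * 6.0 + intercept  # 2 parameters, 6 data points
poly_prediction = lagrange(xs, ys, 6.0)    # 6 parameters, 6 data points

print("true value at x=6:", 2.0 * 6.0 + 1.0)
print("line prediction:  ", line_prediction)
print("poly prediction:  ", poly_prediction)
```

The six-parameter curve reproduces every data point exactly, yet its prediction one step past the data is far from the true value of 13, while the two-parameter line stays close.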

LEGAL STUFF:

Version 1.1 of this program is being offered FREE to the world with no

guarantees expressed or implied ... etc, etc, etc. Version 1.1 may be freely

distributed to anyone and everyone, as long as these instructions are

kept with it.

SALES PITCH:

I have completed version 2.1 of this program. The new features

present in version 2.1 are:

1. It allows multiple character identifiers.

2. It allows user-inserted parentheses.

3. It allows the use of numerical constants.

4. It has twenty new functions covering trigonometry,

logarithms and a few other items.

If you would like to buy version 2.1, send $15.00 to:

David Young

4485 Fairlane

Okemos, MI 48864

Please specify what size and density of disk for me to send. I will send

you one disk containing version 2.1 (or whatever the most recent version is)

with the program and documentation files. This price does not include

any updates past the version that I send you, but I will try to keep you

informed of future versions. Please send cash, check or money order. I

cannot accept credit card orders. Version 2.1 is sold as is with no

guarantees expressed or implied.

If you have any questions, you can also reach me by e-mail at:

internet: [email protected]

bitnet: YOUNGDC@MSUCEM

