Category : Science and Education
Archive   : DNA.ZIP
Filename : MATRIX.HLP

 
Output of file : MATRIX.HLP contained in archive : DNA.ZIP
MATRIX Help file:

Matrix 2.0 is a high resolution dot matrix
DNA sequence analysis program. Matrix 2.0
accepts sequence data from either Keyboard
or disk files and analyzes it by a variety
of user selected methods. Very detailed
dot matrixes can be produced with Epson
"Graftrax" compatible printers such as the
MX-80, MX-100, FX-80, FX-100 and the standard
IBM graphics printer. Matrix 2.0 is an
extensive revision of Matrix 1.0,
originally published in Nucleic Acids
Research 12: 767-776 (1984). Matrix 2.0
supports an improved type of noise
reduction algorithm for low homology
sequence analysis and additionally has an
on-screen interactive graphics capability
for users with standard IBM graphics cards.

Using MATRIX 2.0:

The Matrix Prompt line commands are:

E(nter - DNA sequence data entry.
V(iew - Displays entered DNA sequences.
M(atrix - Starts dot matrix computation.
R(eplay - Prints matrixes stored on disk.
F(ilter - Sets computation algorithms.
H(elp - Displays on line help file.
D(ir - Shows current disk directory.
Q(uit - Exits Matrix 2.0 program.

To use Matrix 2.0, sequence data should
first be entered using the E(nter option.
Correct data entry can be verified using
the V(iew option. Analysis of the data
can be started using the M(atrix option.
The F(ilter option changes analysis methods.

E(nter data option:

Matrix 2.0 expects to find its sequence
data stored in text files on disk. These
files must follow a set format:
                   ______________________
Sequence Title -> |pBR322 sequence       |
Blank line     -> |                      |
                  |ATGCATCGATGCATGAGGATGG|
Sequence data  -> |GGACTGGGATACATTATATGAT|
Upper case, no    |GGAATGATGTAATCCATGTTAT|
spaces.           |GGTTAATGCCTATCTATATATA|
                   ----------------------
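
The layout above can be read with a few
lines of code. This is a minimal sketch in
Python, not taken from Matrix 2.0 itself.
The function name is hypothetical, and
accepting U (for RNA) and skipping empty
trailing lines are assumptions.

```python
def parse_matrix_sequence(text):
    """Parse text laid out as: title line, blank line, sequence lines."""
    lines = text.splitlines()
    if len(lines) < 3:
        raise ValueError("expected a title, a blank line and sequence data")
    title = lines[0].strip()
    if lines[1].strip():
        raise ValueError("second line must be blank")
    sequence = ""
    for number, line in enumerate(lines[2:], start=3):
        if not line:
            continue  # assumption: empty lines after the data are ignored
        # Upper case only, no spaces; U is allowed here as an assumption.
        bad = set(line) - set("ATGCU")
        if bad:
            raise ValueError(f"line {number}: unexpected characters {sorted(bad)}")
        sequence += line
    return title, sequence


sample = "pBR322 sequence\n\nATGCATCGATGCATGAGGATGG\nGGACTGGGATACATTATATGAT\n"
title, sequence = parse_matrix_sequence(sample)
print(title, len(sequence))
```

Lower case letters and embedded spaces are
rejected rather than repaired, so a badly
formatted file is reported instead of being
silently changed.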

These files can be generated using a word
processor or the Edlin text editor. Matrix
2.0 also contains a simple text editor for
entering and storing short sequences.
This is selected by the K(eyboard option.

V(iew data option:

This option is useful for reviewing data
to be sure that it has been entered
properly. When started, it will ask:

View "A" or "B" array?

These correspond to the sequences that
were loaded into main memory using the
E(nter option. If no sequences were
loaded, then this display will be empty.

Note that if sequences have been entered
using the K(eyboard activated text editor,
the files containing these sequences must
be re-read from disk using the E(nter,
D(isk, (filename) commands to proceed.

M(atrix option:

The M(atrix option has five suboptions:

W(hole - Computes entire sequence ranges.
P(artial - Computes subset of sequences.
C(lose - Extreme close up showing bases.
S(can - Shows detail of sequence match.
Q(uit - Exits to main part of program.

The W(hole and P(artial commands produce
detailed dot matrixes to be viewed on dot
matrix printers or graphics mode screens.
C(lose mode shows individual bases in the
matrix. S(can shows actual alignments.

Additional information will be requested.
The computer may request coordinate data.

M(atrix option (continued):

Starting at "A" location: >
Starting at "B" location: >

WIDTH of matrix ("A" sequence):
LENGTH of matrix ("B" sequence):

            "A" location
 _________________|____________________
|                 V_________________   |
|<-"B" location ->|<-- Width ----->|   |
|                 |                |   |
|                 | Partial Matrix |   |
| Whole matrix    +----------------+   |
+--------------------------------------+

The chart shows the coordinate system used
for the P(artial, C(lose and S(can options.

M(atrix option (continued):

There is a choice of output options.

P(rinter - Output to dot matrix printer.
D(isk - Output to disk file only.
S(creen - (Graphics only) output to CRT.
A(bort - Cancel further processing.

The W(hole and P(artial matrixes can only
be output on Epson compatible dot matrix
printers. C(lose and S(can data can be
printed on a wide variety of printers.
The S(creen option switches the CRT to
graphics mode and allows the user to move
a flashing cursor over the on-screen matrix
to interesting sequence alignments. These
are displayed as the cursor moves. This
works on graphics capable systems only.

R(eplay option:

This option prints out a previously
computed dot matrix that has been stored
to disk using the M(atrix to D(isk command.
Output must be to an Epson compatible dot
matrix printer. This option is for use
with W(hole and P(artial matrixes only.
It is useful for computing matrixes on
systems without printers for later
printing on systems with printers, for
multiple copies of a matrix, and for
situations in which printer noise is
unwanted.

F(ilter option:

This controls the algorithms used to
construct the dot matrix. The options are:

Homology - exact A-A,T-T,G-G,C-C matchups:

H(igh   - highly (>95%) related sequences
M(edium - medium (70-100%) related sequences
L(ow    - lower (<70%) related sequences

Secondary Structure - A-T,G-C basepairing:

S(econdary - basepairing alone (does A-U)
B(oth - combination H(igh+S(econdary

Both options work with RNA as well as DNA.
RNA-DNA secondary structure matches are
allowed.

H(igh homology algorithm:

    Current matrix coordinate (Apos, )
                  |
Sequence A: ATGCATGCATGCATGCATGCATGCATGC
                  ---->
Sequence B: ATGCATGCATGCATGCATGCATGCATGC
                  |
    Current matrix coordinate ( ,Bpos)

The high homology algorithm scans forward
(increasing) from location (Apos,Bpos) to
(Apos+[basepairs],Bpos+[basepairs]) looking
for [matchups] number of correct matchups.
This is fast but will miss the end parts
at high homology/low homology boundaries.
Recommended basepair/matches settings are:
(3/3),(4/4),(5/5),(4/3),(5/4),(6/5).
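
The forward scan described above can be
sketched in Python as follows. This is an
illustration only, not the program's own
code. The function name, the 0-based
positions, and the treatment of windows
that run past the end of a sequence are
all assumptions.

```python
def high_homology_hit(seq_a, seq_b, apos, bpos, basepairs, matches):
    """Forward-only test: is there a dot at (apos, bpos)?

    Positions are 0-based here, which is an assumption; the help
    text does not say how Matrix 2.0 numbers its coordinates.
    """
    # Assumption: a window that runs past either sequence end is a miss.
    if apos + basepairs > len(seq_a) or bpos + basepairs > len(seq_b):
        return False
    # Count identical bases along the diagonal starting at (apos, bpos).
    hits = sum(
        1 for x in range(basepairs) if seq_a[apos + x] == seq_b[bpos + x]
    )
    return hits >= matches


a = "ATGCATGC"
b = "TTGCATGA"
print(high_homology_hit(a, b, 0, 0, 4, 3))  # ATGC vs TTGC, 3 of 4 agree
print(high_homology_hit(a, b, 0, 0, 4, 4))  # same window, stricter setting
```

With the (4/3) setting one mismatch in the
window is tolerated; with (4/4) it is not.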

M(edium homology algorithm:

                      |
Sequence A: ATGCATGCATGCATGCATGCATGCATGC
                  ---->
                   ---->
                     :
                     ---->
                      ---->
Sequence B: ATGCATGCATGCATGCATGCATGCATGC
                      |

To catch all boundary conditions, the
M(edium homology option scans all regions
around (Apos,Bpos) in both directions.
Since more searches are done, this is slow
relative to the H(igh homology option.
Recommended basepair/matches settings are:
(3/3),(4/4),(5/5),(4/3),(5/4),(6/5).
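
One way to read "scans all regions around
(Apos,Bpos)" is to try every window of
[basepairs] length that contains the
current coordinate. The Python sketch
below takes that reading, which is an
assumption; the help text does not give
the exact search pattern. Function names
and 0-based positions are also assumed.

```python
def window_matches(seq_a, seq_b, a_start, b_start, basepairs):
    """Count identical bases in one diagonal window, or None if out of range."""
    if a_start < 0 or b_start < 0:
        return None
    if a_start + basepairs > len(seq_a) or b_start + basepairs > len(seq_b):
        return None
    return sum(seq_a[a_start + x] == seq_b[b_start + x] for x in range(basepairs))


def medium_homology_hit(seq_a, seq_b, apos, bpos, basepairs, matches):
    """Hit if any window containing (apos, bpos) has enough matches."""
    # shift = 0 is the forward window; larger shifts start further back.
    for shift in range(basepairs):
        count = window_matches(seq_a, seq_b, apos - shift, bpos - shift, basepairs)
        if count is not None and count >= matches:
            return True
    return False


a = "GGATGCAAAA"
b = "CCATGCTTTT"
# (5,5) is the last base of the shared ATGC run. The forward window
# CAAA vs CTTT fails (4/4), but the window starting at (2,2) passes.
print(medium_homology_hit(a, b, 5, 5, 4, 4))
```

Each coordinate costs up to [basepairs]
window comparisons instead of one, which
matches the remark that this mode is slow
relative to H(igh.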
L(ow homology algorithm:

For very low homology sequences, a new
statistical test is used to vary the user
selected [basepair] & [matches] parameters.
It increases the test stringency wherever
the local [A,T,G,C] frequency is biased
away from the overall frequency
distribution [a,t,g,c]. For each coordinate:

param = param + SQRT[(A-a)^2 + (G-g)^2 + (T-t)^2 + (C-c)^2]

This reduces the frequency of false hits
due to random AAAA-AAAA and ATAT-ATAT
matches while still being sensitive to
ATGC-ATGC type matches. The process is
similar to Dolby & DBX signal processing.
To get good local statistics, use larger
basepair/matchup values such as (20/12).
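
The bias term can be worked through in a
short Python sketch. The help text does
not say whether the frequencies are counts
or fractions, or how the result is scaled
before it is added to the parameter, so
this example assumes fractions between 0
and 1 and adds the term unscaled. All
names are hypothetical.

```python
import math

BASES = "ATGC"


def base_frequencies(seq):
    """Fraction of each base in seq (0.0 for every base if seq is empty)."""
    total = len(seq)
    return {b: (seq.count(b) / total if total else 0.0) for b in BASES}


def adjusted_param(param, local_window, overall_freq):
    """param + SQRT of the summed squared local-vs-overall differences."""
    local = base_frequencies(local_window)
    bias = math.sqrt(sum((local[b] - overall_freq[b]) ** 2 for b in BASES))
    return param + bias


# Assumed overall composition: all four bases equally common.
overall = {"A": 0.25, "T": 0.25, "G": 0.25, "C": 0.25}
for window in ("AAAA", "ATAT", "ATGC"):
    print(window, round(adjusted_param(12, window, overall), 3))
```

Under these assumptions an AAAA window
raises the parameter the most, ATAT raises
it less, and ATGC leaves it unchanged,
which is the ordering described above.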
S(econdary Structure:

Secondary structure searches search in the
opposite direction from homology searches:

    current matrix coordinate (Apos, )
                  |
Sequence A: ATGCATGCATGCATGCATGCATGCATGC
                  ----->
                    |/
                    /|   (A-T,G-C pairing)
                  <-----
Sequence B: ATGCATGCATGCATGCATGCATGCATGC
                       |
    current matrix coordinate ( ,Bpos)

Apos+X is compared to Bpos-X in this test.
Since there are often gaps in basepairing,
try (3/2),(3/3),(4/3) parameters here.
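
The Apos+X against Bpos-X comparison can be
sketched in Python as below. This is an
illustration under assumptions: positions
are 0-based, a window that leaves either
sequence counts as a miss, and only A-T,
A-U and G-C pairs are accepted. The names
are hypothetical.

```python
# Pairs counted as basepairing; A-U is included because the help text
# says the S(econdary option "does A-U".
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def secondary_hit(seq_a, seq_b, apos, bpos, basepairs, matches):
    """Compare seq_a[apos + x] with seq_b[bpos - x] for x in 0..basepairs-1."""
    if apos < 0 or apos + basepairs > len(seq_a):
        return False
    if bpos >= len(seq_b) or bpos - (basepairs - 1) < 0:
        return False
    hits = sum(
        (seq_a[apos + x], seq_b[bpos - x]) in PAIRS for x in range(basepairs)
    )
    return hits >= matches


a = "GGGAAA"
b = "TTTCCC"
print(secondary_hit(a, b, 0, 5, 3, 3))  # GGG read forward, CCC read backward
print(secondary_hit(a, b, 0, 2, 3, 3))  # GGG against TTT, no pairing
```

Because sequence B is read backward, a
stem shows up on the opposite diagonal
from a homology match.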
B(oth Homology and Secondary Structure:

A mixed mode test combining the H(igh
homology test and the S(econdary Structure
test. This produces an all-in-one matrix
which lets you see all the sequence
relationships at once. Because two
different tests are being done, the noise
level/sensitivity tradeoffs are a major
concern here. A good compromise value
for (basepairs/matches) is (3/3).

Use of Compression:

Compression is normally set to 1. However,
printed matrixes with a width over 960 and
screen displayed matrixes with a width over
300 must be computed using compression > 1.
___________________
| . . . . . . . . |          __________
|                 |          |........|
| . . . . . . . . |  ----->  |........|
|                 |          |........|
| . . . . . . . . |          +--------+
|                 |       compressed matrix
+-----------------+       (2X compression)
   Normal matrix

For compression X, the program tests in
increments of (Apos+X,Bpos+X). F(ilter
parameters may need to be readjusted.
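
The stepping can be sketched in Python as
follows. The 960 and 300 limits come from
the text above; the function names, the
0-based coordinates and the rule for
picking the smallest usable compression
are assumptions made for illustration.

```python
import math


def compressed_coordinates(width, length, compression):
    """Yield the (apos, bpos) pairs tested when stepping by `compression`."""
    if compression < 1:
        raise ValueError("compression must be 1 or greater")
    for bpos in range(0, length, compression):
        for apos in range(0, width, compression):
            yield apos, bpos


def minimum_compression(width, limit):
    """Smallest whole-number compression that fits `width` into `limit` dots."""
    return max(1, math.ceil(width / limit))


print(list(compressed_coordinates(4, 4, 2)))
print(minimum_compression(2000, 960))  # printed output, limit of 960
print(minimum_compression(2000, 300))  # screen output, limit of 300
```

Stepping by X tests only one coordinate in
every X along each axis, so short matches
can fall between tested coordinates. This
may be why the F(ilter parameters can need
readjusting.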
H(elp option:

Displays the help file that you are
reading.

D(ir option:

Displays the directory of the currently
logged disk drive. It also displays the
dates on which the disk files were created
and the number of kilobytes of space that
they take up.

Q(uit option:

Exits Matrix 2.0 to DOS. All data not
saved on disk will be lost.

General Tips on Program Use: