Category : Science and Education
Archive : DNA.ZIP
Filename : MATRIX.HLP
Matrix 2.0 is a high resolution dot matrix
DNA sequence analysis program. Matrix 2.0
accepts sequence data from either Keyboard
or disk files and analyzes it by a variety
of user selected methods. Very detailed
dot matrixes can be produced with Epson
"Graftrax" compatible printers such as the
MX80, MX100, FX80, FX100 and the standard
IBM graphics printer. Matrix 2.0 is an
extensive revision of Matrix 1.0,
originally published in Nucleic Acids
Research 12: 767-776 (1984). Matrix 2.0
supports an improved type of noise
reduction algorithm for low homology
sequence analysis and additionally has an
on-screen interactive graphics capability
for users with standard IBM graphics cards.
Using MATRIX 2.0:
The Matrix Prompt line commands are:
E(nter  DNA sequence data entry.
V(iew  Displays entered DNA sequences.
M(atrix  Starts dot matrix computation.
R(eplay  Prints matrixes stored on disk.
F(ilter  Sets computation algorithms.
H(elp  Displays on line help file.
D(ir  Shows current disk directory.
Q(uit  Exits Matrix 2.0 program.
To use Matrix 2.0, sequence data should
first be entered using the E(nter option.
Correct data entry can be verified using
the V(iew option. Analysis of the data
can be started using the M(atrix option.
F(ilter option changes analysis methods.
E(nter data option:
Matrix 2.0 expects to find its sequence
data stored in text files on disk. These
files must follow a set format:
                  ______________________
Sequence Title -> pBR322 sequence
Blank line     ->
                  ATGCATCGATGCATGAGGATGG
Sequence data  -> GGACTGGGATACATTATATGAT
Upper case, no    GGAATGATGTAATCCATGTTAT
spaces.           GGTTAATGCCTATCTATATATA

These files can be generated using a word
processor or Edlin text editor. Matrix
2.0 also contains a simple text editor for
entering and storing short sequences.
This is selected by the K(eyboard option.
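
The file layout described above (title
line, blank line, then upper case sequence
lines without spaces) can be read with a
few lines of code. The following is only a
sketch and not part of Matrix 2.0: the
function name parse_sequence_file is
hypothetical, and accepting U alongside
A, T, G and C is an assumption based on
the RNA support mentioned for F(ilter.

```python
def parse_sequence_file(text):
    """Split a Matrix-style sequence file into (title, sequence).

    Assumed layout: line 1 is the title, line 2 is blank, and all
    remaining lines hold upper case sequence data without spaces.
    """
    lines = text.splitlines()
    if len(lines) < 3:
        raise ValueError("expected a title, a blank line, and sequence data")
    title = lines[0].strip()
    if lines[1].strip():
        raise ValueError("second line must be blank")
    # Join the data lines into one continuous sequence string.
    sequence = "".join(line.strip() for line in lines[2:])
    # Assumption: only A, T, G, C (and U for RNA) are legal characters.
    bad = set(sequence) - set("ATGCU")
    if bad:
        raise ValueError("unexpected characters: " + "".join(sorted(bad)))
    return title, sequence


example = (
    "pBR322 sequence\n"
    "\n"
    "ATGCATCGATGCATGAGGATGG\n"
    "GGACTGGGATACATTATATGAT\n"
)
title, sequence = parse_sequence_file(example)
print(title, len(sequence))
```

Lower case letters or embedded spaces
raise an error here, which mirrors the
"Upper case, no spaces" rule.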
V(iew data option:
This option is useful for reviewing data
to be sure that it has been entered
properly. When started, it will ask:
View "A" or "B" array?
These correspond to the sequences that
were loaded into main memory using the
E(nter option earlier. If no
sequences were loaded, then this display
will be empty.
Note that if sequences have been entered
using the K(eyboard activated text editor,
the files containing these sequences must
be reread from disk using the E(nter,
D(isk, (filename) commands to proceed.
M(atrix option:
The M(atrix option has five sub options:
W(hole  Computes entire sequence ranges.
P(artial  Computes subset of sequences.
C(lose  Extreme close up showing bases.
S(can  Shows detail of sequence match.
Q(uit  Exits to main part of program.
The W(hole and P(artial commands produce
detailed dot matrixes to be viewed on dot
matrix printers or graphics mode screens.
C(lose mode shows individual bases in the
matrix. S(can shows actual alignments.
Additional information will be requested.
The computer may request coordinate data.
M(atrix option (continued):
Starting at "A" location: >
Starting at "B" location: >
WIDTH of matrix ("A" sequence):
LENGTH of matrix ("B" sequence):
              "A" location
  ________________|_____________________
 |                V_________________    |
 |<-"B" location->|<---- Width ---->|   |
 |                |                 |   |
 |                | Partial Matrix  |   |
 | Whole matrix   +-----------------+   |
 +--------------------------------------+
The chart shows the coordinate system used
for the P(artial, C(lose and S(can options.
M(atrix option (continued):
There is a choice of output options.
P(rinter  Output to dot matrix printer.
D(isk  Output to disk file only.
S(creen  (Graphics only) output to CRT.
A(bort  Cancel further processing.
The W(hole and P(artial matrixes can only
be output on Epson compatible dot matrix
printers. C(lose and S(can data can be
printed on a wide variety of printers.
The S(creen option switches the CRT to
graphics mode and allows the user to move
a flashing cursor over the on-screen matrix
to interesting sequence alignments. These
are displayed as the cursor moves. This
works on graphics capable systems only.
R(eplay option:
This option prints out a previously
computed dot matrix that has been stored
to disk using the M(atrix to D(isk
command. Output must be to an Epson
compatible dot matrix printer. This
option is for use with W(hole and P(artial
matrixes only. It is useful for computing
matrixes on systems without printers for
later printing on systems with printers,
for multiple copies of a matrix, and for
situations in which printer noise is
unwanted.
F(ilter option:
This controls the algorithms used to
construct the dot matrix. The options are
Homology  exact AA,TT,GG,CC matchups:
H(igh  highly (>95%) related sequences
M(edium  medium (100-70%) related sequences
L(ow  lower (<70%) related sequences
Secondary Structure  AT,GC basepairing:
S(econdary  basepairing alone (does AU)
B(oth  combination H(igh+S(econdary
Both options work with RNA as well as DNA.
RNA-DNA secondary structure matches are
allowed.
H(igh homology algorithm:
Current matrix coordinate (Apos, )
            |
Sequence A: ATGCATGCATGCATGCATGCATGCATGC
            ---->
Sequence B: ATGCATGCATGCATGCATGCATGCATGC
            |
Current matrix coordinate ( ,Bpos)
The high homology algorithm scans forward
(increasing) from location (Apos,Bpos) to
(Apos+[basepairs],Bpos+[basepairs]) looking
for [matchups] number of correct matchups.
This is fast but will miss the end parts
at high homology/low homology boundaries.
Recommended basepair/matches settings are:
(3/3),(4/4),(5/5),(4/3),(5/4),(6/5).
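
The forward scan described above can be
sketched as follows. This is an
illustration under stated assumptions, not
the program's own code: the function names
are hypothetical, positions are counted
from 0, and a window that runs off the end
of either sequence is assumed to give no
dot.

```python
def high_homology_hit(seq_a, seq_b, apos, bpos, basepairs, matches):
    """Return True if at least `matches` of the next `basepairs`
    positions, scanning forward from (apos, bpos), are identical."""
    if apos + basepairs > len(seq_a) or bpos + basepairs > len(seq_b):
        return False  # assumption: incomplete windows never score
    agree = sum(
        seq_a[apos + k] == seq_b[bpos + k] for k in range(basepairs)
    )
    return agree >= matches


def dot_matrix(seq_a, seq_b, basepairs=4, matches=3):
    """Build a matrix of booleans: one row per B position,
    one column per A position."""
    return [
        [high_homology_hit(seq_a, seq_b, a, b, basepairs, matches)
         for a in range(len(seq_a))]
        for b in range(len(seq_b))
    ]


matrix = dot_matrix("ATGCATGC", "ATGCATGC", basepairs=4, matches=4)
for row in matrix:
    print("".join("*" if hit else "." for hit in row))
```

Because the scan only looks forward, the
last few positions before a mismatched
region fail the test even though they are
part of a matching run, which is the
boundary loss mentioned above.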
M(edium homology algorithm:

Sequence A: ATGCATGCATGCATGCATGCATGCATGC
>
>
:
>
>
Sequence B: ATGCATGCATGCATGCATGCATGCATGC

To catch all boundary conditions, the
M(edium homology option scans all regions
around (Apos,Bpos) in both directions.
Since more searches are done, this is slow
relative to the H(igh homology option.
Recommended basepair/matches settings are:
(3/3),(4/4),(5/5),(4/3),(5/4),(6/5).
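
One way to read "scans all regions around
(Apos,Bpos) in both directions" is to test
every window of [basepairs] length along
the diagonal that contains the current
coordinate. The sketch below follows that
reading, which is an assumption; the
function names are hypothetical and
positions are counted from 0.

```python
def window_agreement(seq_a, seq_b, a_start, b_start, basepairs):
    """Count identical positions in one diagonal window, or return
    None if the window does not fit inside both sequences."""
    if a_start < 0 or b_start < 0:
        return None
    if a_start + basepairs > len(seq_a) or b_start + basepairs > len(seq_b):
        return None
    return sum(
        seq_a[a_start + k] == seq_b[b_start + k] for k in range(basepairs)
    )


def medium_homology_hit(seq_a, seq_b, apos, bpos, basepairs, matches):
    """Return True if any window covering (apos, bpos) has at least
    `matches` identical positions."""
    for shift in range(basepairs):
        # shift = 0 is the forward window; larger shifts slide the
        # window backward so the coordinate sits later inside it.
        count = window_agreement(
            seq_a, seq_b, apos - shift, bpos - shift, basepairs
        )
        if count is not None and count >= matches:
            return True
    return False


seq_a = "ATGCTTTT"
seq_b = "ATGCAAAA"
# (3, 3) is the last matching base before the sequences diverge.
print(medium_homology_hit(seq_a, seq_b, 3, 3, 4, 4))
```

Up to [basepairs] windows are tested per
coordinate instead of one, which is
consistent with the slower speed noted
above.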
L(ow homology algorithm:
For very low homology sequences, a new
statistical test is used to vary the user
selected [basepair] & [matches] parameters
to increase the test stringency for local
[A,T,G,C] frequencies biased away from the
overall frequency distribution [a,t,g,c].
For each coordinate:
param = param + SQRT[(A-a)^2 + (G-g)^2
                     + (T-t)^2 + (C-c)^2]
This reduces the frequency of false hits
due to random AAAAAAAA and ATATATAT
matches while still being sensitive to
ATGCATGC type matches. The process is
similar to Dolby & DBX signal processing.
To get good local statistics, use larger
basepair/matchup values such as (20/12).
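
The adjustment can be illustrated with a
short sketch. The help text does not say
how the frequencies are scaled or which
window they are measured over, so the
choices here are assumptions: frequencies
are fractions between 0 and 1, the local
window is passed in by the caller, and all
function names are hypothetical.

```python
import math

BASES = "ATGC"


def base_frequencies(seq):
    """Fraction of each base in `seq` (0.0 for an empty sequence)."""
    n = len(seq)
    return {b: (seq.count(b) / n if n else 0.0) for b in BASES}


def composition_bias(window, overall):
    """Distance between local [A,T,G,C] and overall [a,t,g,c]
    frequencies: SQRT of the summed squared differences."""
    local = base_frequencies(window)
    return math.sqrt(sum((local[b] - overall[b]) ** 2 for b in BASES))


def adjusted_param(param, window, overall):
    """Raise a stringency parameter by the local composition bias."""
    return param + composition_bias(window, overall)


overall = base_frequencies("ATGC" * 5)  # 25% of each base
print(adjusted_param(12, "ATGCATGC", overall))  # balanced window
print(adjusted_param(12, "AAAAAAAA", overall))  # biased window
```

A balanced window leaves the parameter
unchanged, while a run of a single base
raises it, making a hit harder to score
there.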
S(econdary Structure:
Secondary structure searches search in the
opposite direction from homology searches:
current matrix coordinate (Apos, )
            |
Sequence A: ATGCATGCATGCATGCATGCATGCATGC
                    ---->
                   /
                  /  (AT,GC pairing)
              <----
Sequence B: ATGCATGCATGCATGCATGCATGCATGC
            |
current matrix coordinate ( ,Bpos)
Apos+X is compared to Bpos-X in this test.
Since there are often gaps in basepairing,
try (3/2),(3/3),(4/3) parameters here.
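
The opposite-direction comparison can be
sketched as below. It is an illustration
only: the function name is hypothetical,
positions are counted from 0, and windows
that run off either sequence are assumed
to give no dot. AU pairs are included
because the F(ilter summary says the
S(econdary test handles AU.

```python
# Base pairs counted as a match (AT, GC, and AU for RNA).
PAIRS = {
    ("A", "T"), ("T", "A"),
    ("G", "C"), ("C", "G"),
    ("A", "U"), ("U", "A"),
}


def secondary_hit(seq_a, seq_b, apos, bpos, basepairs, matches):
    """Compare seq_a[apos + x] with seq_b[bpos - x] for
    x = 0 .. basepairs - 1 and count complementary pairs."""
    if apos < 0 or apos + basepairs > len(seq_a):
        return False
    if bpos >= len(seq_b) or bpos - (basepairs - 1) < 0:
        return False
    paired = sum(
        (seq_a[apos + x], seq_b[bpos - x]) in PAIRS
        for x in range(basepairs)
    )
    return paired >= matches


# GGG at the start of A can pair with CCC at the end of B.
print(secondary_hit("GGGAAA", "TTTCCC", 0, 5, 3, 3))
```

Setting [matches] below [basepairs], as in
(3/2) or (4/3), tolerates one unpaired
position inside the window.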
B(oth Homology and Secondary Structure:
A mixed mode test combining the H(igh
homology test and the S(econdary Structure
test. This produces an all-in-one matrix
which lets you see all the sequence
relationships at once. Because two
different tests are being done, the noise
level/sensitivity tradeoffs are a major
concern here. A good compromise value
for (basepairs/matches) is (3/3).
Use of Compression:
Compression is normally set to 1, however
printed matrixes with a width over 960 and
screen displayed matrixes with a width over
300 must be computed using compression > 1.
 ___________________
| . . . . . . . .   |       __________
|                   |      | ........ |
| . . . . . . . .   | ---> | ........ |
|                   |      | ........ |
| . . . . . . . .   |      +----------+
|                   |      compressed matrix
+-------------------+      (2X compression)
    Normal matrix
For compression X, the program tests in
increments of (Apos+X,Bpos+X). F(ilter
parameters may need to be readjusted.
H(elp option:
Displays the help file that you are
reading.
D(ir option:
Displays the Disk directory of the
currently logged Disk Drive. It also
displays the dates that the disk files
were created on and the number of
Kilobytes of space that they take up.
Q(uit option:
Exits Matrix 2.0 to DOS. All data not
saved on disk will be lost.
General Tips on Program Use: