This is long, please bomb out of it when ever it gets boring.
     Could someone enlighten me as to what is HASHING?
     It arises as an algorithm to search a table of numbers for the
existence of a particular entry.
     For example if you have a N by 2 table of random entries of the
     259 4
     34  34
     245 123
     259 4
  what is the absolutely fastest way to determine if any other random
pair of numbers is already in the table and its position.
     The problem arises in LZW compression where in you build a table of
strings of characters.
      For example
      DCBEA  ETC.
      This example is highly simplified.  The nature of these LZW tables
is that each string that is in the table consists of a PRIOR string that
is also in the table plus one more character.   The ordering however can
be very random and complex.
     Thus such an LZW table can be expressed as a two column table where
each entry consists of the index of the prior string and the one more
   1    A
   2    B
   3    C
   4    D
   5    E
   .    .
 259    DC           4  3
 260    DCB    --> 259  2
 261    DCBA       260  1
 262    DCBAE      261  5
     Thus if one has an arbitrary string such as DCB, and one has
ALREADY found it in the table at position 260, one then gets the next
character in the material to be coded, say F, and now one needs to know
if DCBF is already in the table.  If it is, one goes on to the next
character.  If not, one adds the new string DCBF to the table.
     DCBF is clearly 260 6, so the problem becomes searching through the
two column table for the entry 260 6.
     LZW tables can be relatively long, usually up to 4096 entries.  The
right hand column needs only be a byte wide, but the left hand column,
being an index into previous entries in the table needs to span 4096 or
at least two bytes.
     Thus the tables do not take up a terrible amount of space, but the
searching through them can take forever, especially if you proceed with
a straight forward linear search from the beginning to the end of the
     Obviously you only check the right column if you have found a match
in the left column.
     But remember this has to be done over and over again for each
character in the input stream to be compressed.
     For a simple 1200 by 800 fractal, the algorithm can take upwards of
8 minutes on a 486/33 pc, which is ridiculous.
     One solution to this does away with the table searching completely.
     This is done by recognizing that each ordered pair of numbers such
as (261,23) is a multiple-base number, the right column is in base 256
and the left hand column is in base 4096.
     Thus (261,23) can be expressed as a unique number of the form
     261*256 + 23 = 66839.
     The total span of ordered pairs from (0,0) to (4095,255) thus spans
single numbers from 0 to 4095*256 + 255 or 1048575.
     Thus rather than adding a new ordered pair (A,B) to the end of a
simple 2 column table, one places it directly into position A*256 + B in
the longer array.  Actually what one stores there is the ROW NUMBER of
the position that the ordered pair would have occupied in a normal
     The longer array should intially be set to to -1's.
     Thus when you wish to know if any ordered pair is in the table, you
merely compute A*256 + B and see if anything is in the longer array.  If
it is, then that entry is already 'in the table' at the implied row
position specified by the number found in the longer array.
     The brilliance of this is that LZW compression becomes very fast,
the 8 minute example mentioned above reduces down to 38 seconds, as
there is no table searching whatsoever.
     The catch is that it takes over 2 Meg of memory to store the longer
array, space that most people can ill afford.
     The original two columned table only takes 12,288 bytes, 2 bytes
for the left column and 1 for the right column, times 4096.
     I would appreciate any enlightening on this subject that you can
throw my way.