endless compression

John Chambers jc at trillian.mit.edu
Sun Aug 15 16:02:01 EDT 2004


Seth Gordon wrote:
| It is inarguable that if, by pure random chance, the bytes in a compressed
| stream are identical to the output bytes of a specific random number
| generator with a specific seed, an astonishing amount of compression could
| be had.
| ... (brute-force search for the generator and seed) ...

Some years back, there was a science fiction  story  (I've  forgotten
its  title)  involving  an  alien  transmission that used expressions
like:

  53^9327-37^2903+5^738

The idea was that such an expression, while requiring very few bytes,
would  expand  to  a  rather large numeric value, whose bits were the
actual message.  Also, you only need 4 bits per char to  encode  such
expressions.   Going from this "compressed" expression to the message
is computationally trivial and fast, at most a few seconds  for  even
slow Earthly computers. You'd have to use a "bignum" numeric package,
but we've had those for decades, and some are pretty fast.   But  the
compression  stage is (to us mere humans) computationally intractable
at present.  The plot invoved the fact that the  aliens  had  a  fast
algorithm  to  do this compression, so they could send large messages
over low-bitrate channels.

There have been a lot of published schemes that compress  huge  files
to tiny character strings like this.  This is one of many whose known
encoding scheme can be summarized as "brute force",  taking  millenia
or longer with current computing equipment. Someone who comes up with
a rapid way to encode and decode messages using such a  scheme  would
make  quite  a  name  for  themselves,  at  least in mathematical and
computing circles.

Note that generating such expressions is a LOT  more  difficult  than
calculating  prime  factors.  (Or maybe it isn't?  Maybe someone just
made a new mathematical discovery ...) The idea is that there  are  a
lot  of  possible  expressions that produce the message, but most are
larger than the message itself. Thus the message "A" could be encoded
as "2^6+1", but that uses five bytes to encode just one.  You want to
find an expression that's much smaller than the message.

If you're interested in solving unsolved mathematical problems,  this
is a subject area where there is a lot of work to be done.

An unrelated unsolved probblem: How would you build a  search  engine
that could locate things like the above story?  It's not obvious what
keywords one could use, so google et al aren't much help.   The  best
idea  would probably be to use human intelligence, by asking in a sci
fi newsgroup.




More information about the Discuss mailing list