A file compressor is great for shrinking stored files, but it depresses me whenever I see a file grow instead of shrink. So what I am looking for is a file compression algorithm that never inflates any file, although some files (not all, of course!) are allowed to have the same length after "compression". Ideally it should work on files of all sizes, but I would be satisfied with a compressor that operates only on files larger than 1 MB.
Can you provide such an algorithm? No programming knowledge is required for this problem.
(In reply to re: Ideas by Larry)
Of course you could encode any repeating pattern you want. We just want an algorithm that guarantees the file does not grow, not the best one. I tried to keep it simple; clearly it can be improved.
As for encoding the position, I think the compressor does need to tell the decompressor where to re-insert the long strings. Or at least it does as I envision it.
Here's a 1000-bit file with a string of 25 zeroes beginning at position 501:
[500 digits of random looking data][0000000000000000000000000][475 digits of random looking data]
Compressed:
[500 digits of random looking data][475 digits of random looking data]#[code indicating where and what to reinsert]
The code indicating what and where might be something like:
0 011001 0111110101 (0 25 501)
The # would be a code indicating the end of the file and the beginning of the compression info. The minimum string size that would be compressed could be modified as needed to account for the length of this code.
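To make the idea concrete, here is a minimal Python sketch of the scheme as I read it, not a definitive implementation. The function names, the 6-bit length field and the 10-bit position field are my assumptions, chosen to match the (0 25 501) example. Bits are held as a text string so that # can act as a separate symbol; a real binary file has no such spare symbol, so that part is purely illustrative.

```python
MARKER = "#"  # assumed out-of-band separator between the data and the reinsertion code


def compress(bits, min_run=25, len_bits=6, pos_bits=10):
    """Cut out the first run of at least min_run zeroes and append a code saying where it was."""
    start = bits.find("0" * min_run)
    if start == -1:
        return bits  # no long run found: leave the file unchanged
    # Extend the run as far as it goes, capped by what the length field can hold.
    max_len = (1 << len_bits) - 1
    end = start
    while end < len(bits) and bits[end] == "0" and end - start < max_len:
        end += 1
    position = start + 1  # positions are 1-based, as in the example
    if position >= (1 << pos_bits):
        return bits  # position does not fit in the field: leave the file unchanged
    code = "0" + format(end - start, f"0{len_bits}b") + format(position, f"0{pos_bits}b")
    return bits[:start] + bits[end:] + MARKER + code


def decompress(data, len_bits=6, pos_bits=10):
    """Reinsert the run described by the code after the marker, if there is one."""
    if MARKER not in data:
        return data
    body, _, code = data.rpartition(MARKER)
    symbol = code[0]
    run_len = int(code[1:1 + len_bits], 2)
    position = int(code[1 + len_bits:1 + len_bits + pos_bits], 2)
    start = position - 1
    return body[:start] + symbol * run_len + body[start:]


# The 1000-bit example: 500 bits, then 25 zeroes at position 501, then 475 bits.
# A fixed alternating pattern stands in for the "random looking data".
original = "01" * 250 + "0" * 25 + "1" + "01" * 237
packed = compress(original)
restored = decompress(packed)
```

For this example the appended code is 17 bits, so the 1000 bits shrink to 975 + 17 = 992 bits plus the marker. It also shows why the minimum run length matters: under these assumed field widths, removing a run of fewer than 18 zeroes would not pay for the code that replaces it.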
Posted by Jer on 2006-10-19 11:26:18