perplexus dot info

Shrink needed (Posted on 2006-10-18) Difficulty: 2 of 5
A file compressor is great for shrinking stored files, but it depresses me whenever I see a file grow instead of shrink. So what I am looking for is a file compression algorithm that never inflates any files, although it is allowed that some files (not all of course!) have the same length after "compression". Ideally it should work on files of all sizes, but I would be satisfied with a compressor that operates only on files larger than 1MB. Can you provide such an algorithm? No programming knowledge is required for this problem.

Submitted by JLo
Rating: 3.6667 (6 votes)

Comment 10 of 26

Data compression algorithms can be quite simple or quite sophisticated.

Put all too simply, the process is to map the data so that one program (the zipper) saves a smaller file and another (the unzipper) reconstitutes the original file from it.

All too simplistically, my "Lightly Squeeze Me" comment does just that, but only takes out spaces.

Jer (and also see) wants to address arbitrary sequences that he might find duplicated; I doubt that existing technologies look as deeply at the 'bit' level as he envisages. And yet this thought process has possibilities.

I was looking at my Oxford English Dictionary: its entry words could be copied into the zipper program as a reference source.

I could use 65536 (2^16) words as a reference base. Each reference would cost me 3 file bytes: one control character (the zipper says "Unzipper, I've got a signpost; go there and substitute its contents for these two bytes") and two bytes that nominate my word.
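A minimal sketch of that signpost scheme, not a real zipper. It assumes a control byte of 0x00 that never occurs in the plain text, ASCII input, and a tiny stand-in word list in place of the dictionary; `CONTROL`, `zip_words` and `unzip_words` are hypothetical names chosen for illustration.

```python
# Hypothetical control byte: assumed never to appear in the plain text.
CONTROL = 0x00

def zip_words(text, words):
    """Replace listed words with CONTROL + 2-byte index; copy everything else."""
    if len(words) > 2 ** 16:
        raise ValueError("a 2-byte index can address at most 65536 words")
    if chr(CONTROL) in text:
        raise ValueError("plain text must not contain the control byte")
    index = {w: i for i, w in enumerate(words)}
    parts = []
    for token in text.split(" "):
        # A reference costs 3 bytes, so only substitute words longer than that;
        # otherwise the "compressed" form would be no shorter than the word.
        if token in index and len(token) > 3:
            parts.append(bytes([CONTROL]) + index[token].to_bytes(2, "big"))
        else:
            parts.append(token.encode("ascii"))
    return b" ".join(parts)

def unzip_words(data, words):
    """Walk the bytes in order, expanding each 3-byte reference back to its word."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == CONTROL:
            word_number = int.from_bytes(data[i + 1:i + 3], "big")
            out += words[word_number].encode("ascii")
            i += 3
        else:
            out.append(data[i])
            i += 1
    return out.decode("ascii")

words = ["compression", "dictionary", "algorithm"]
text = "a dictionary algorithm for compression"
packed = zip_words(text, words)
```

The unzipper reads the bytes strictly in sequence rather than splitting on spaces, because the two index bytes can themselves take any value, including that of a space.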

Good! But NO! The 1989 2nd Edition of the OED had 615,000+ entries, far more than 65536, and ran to 20 volumes! And what if I have to contend with typos, misspellings, and foreign and technical words?

Even going to 2^32 (4-byte addresses) doesn't alleviate those problems; and who's going to update the word list anyway?
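The address sizes mentioned above can be checked with a few lines of arithmetic. This is only a sketch: `address_bytes` is a hypothetical helper, and 615,000 is the entry count quoted in this comment.

```python
def address_bytes(entries):
    """Whole bytes needed to give each of `entries` items a distinct address."""
    bits = max(1, (entries - 1).bit_length())
    return (bits + 7) // 8

two_byte_base = address_bytes(2 ** 16)   # the 65536-word reference base
oed_entries = address_bytes(615_000)     # the entry count quoted for the OED
four_byte_base = address_bytes(2 ** 32)  # the 4-byte address system
```

Adding the control character, a reference into a 615,000-entry list would cost at least 4 file bytes, so substituting it for any word of four letters or fewer saves nothing.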

And yet a 4-byte address system might help with image compression. My scanner will copy "millions of colours", which I think means 3 colour channels: Red [0-255], Blue [0-255] and Green [0-255]. What are the logistics of working through this?
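As a starting point for those logistics, here is a sketch that assumes "millions of colours" does mean three channels of 0-255 each, as guessed above. The function names and the channel order inside the packed number are arbitrary choices for illustration.

```python
def pack_rgb(red, green, blue):
    """Pack three 0-255 channel values into one 24-bit number."""
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must be in the range 0-255")
    return (red << 16) | (green << 8) | blue

def unpack_rgb(value):
    """Recover the three channel values from a packed 24-bit number."""
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF

# Number of distinct colours under this assumption: 256 * 256 * 256.
COLOURS = 256 ** 3
```

Under that assumption there are 16,777,216 possible colours, so every colour fits in 3 bytes and a 4-byte address leaves one byte to spare.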

I do have a succinct answer for the text part, but [I'm deliberately avoiding referential material at the moment].

There are some interesting technologies for those who wish to explore.

JLo? Is this too deep? Or should we open a debate within a forum?

Edited on October 20, 2006, 9:26 pm
  Posted by brianjn on 2006-10-20 04:17:38
