Frequently Asked Questions |
|
|
|
|
A note on the file data sets used here: |
|
|
|
|
|
* test sets shown in red on this page contain already compressed files |
Main Chart |
|
|
|
|
|
|
|
3D Game |
a free open-source 3D racing game for Windows called "TORCS"; downloadable from SourceForge |
|
Remark: None of the archivers
recognized the 75 MB .RGB images as compressible bitmap-like raster images. |
|
|
|
|
|
|
|
|
Bitmaps |
29 bitmaps from Sachin Garg's Public
Testset; 15 greyscale PGM (8 Bit); 14 true color PPM (8 Bit RGB) |
|
http://www.imagecompression.info/test_images/ |
|
[New 2007] |
|
|
|
|
Artificial.pgm |
6.291.473 |
bytes |
|
Big Building.pgm |
39.053.009 |
bytes |
|
Big Tree.pgm |
27.700.417 |
bytes |
|
Bridge.pgm |
11.130.718 |
bytes |
|
Cathedral.pgm |
6.016.017 |
bytes |
|
Deer.pgm |
10.677.580 |
bytes |
|
Fireworks.pgm |
7.375.889 |
bytes |
|
Flower_Foveon.pgm |
3.429.233 |
bytes |
|
HDR.pgm |
6.291.473 |
bytes |
|
Leaves_ISO_200.pgm |
6.016.017 |
bytes |
|
Leaves_ISO_1600.pgm |
6.016.017 |
bytes |
|
Nightshot_ISO_100.pgm |
7.375.889 |
bytes |
|
Nightshot_ISO_1600.pgm |
7.375.889 |
bytes |
|
Spider_Web.pgm |
12.121.105 |
bytes |
|
Zone_Plate.pgm |
6.000.017 |
bytes |
|
Artificial.ppm |
18.874.385 |
bytes |
|
Big Building.ppm |
117.158.993 |
bytes |
|
Big Tree.ppm |
83.101.217 |
bytes |
|
Bridge.ppm |
33.392.120 |
bytes |
|
Cathedral.ppm |
18.048.017 |
bytes |
|
Deer.ppm |
32.032.706 |
bytes |
|
Fireworks.ppm |
22.127.633 |
bytes |
|
Flower_Foveon.ppm |
10.287.665 |
bytes |
|
HDR.ppm |
18.874.385 |
bytes |
|
Leaves_ISO_200.ppm |
18.048.017 |
bytes |
|
Leaves_ISO_1600.ppm |
18.048.017 |
bytes |
|
Nightshot_ISO_100.ppm |
22.127.633 |
bytes |
|
Nightshot_ISO_1600.ppm |
22.127.633 |
bytes |
|
Spider_Web.ppm |
36.363.281 |
bytes |
|
TOTAL |
633.482.445 |
bytes |
|
PhotoJazz 2.0 |
263.249.866 |
|
|
JPEG-LS (R-=G B-=G) |
272.296.987 |
|
|
PackPNM 0.8a |
274.816.911 |
|
|
J2K Lossless |
278.812.478 |
|
|
BMF / BMFG 2.0 ** |
289.265.284 |
converted .PPM to .BMP (uncompressed) |
|
ERI ERI32 5.1 fre (2002) |
305.453.888 |
|
|
BMF / BMFG 1.10 -Q9 -S ** |
309.759.880 |
converted .PPM to .BMP (uncompressed) |
|
HDP HD Photo |
310.841.401 |
|
|
PNG level 9 |
328.328.256 |
|
|
JPEG Lossless |
345.105.024 |
|
|
CoBALP fast * |
crashes |
crashes after confirming save as |
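The label "R-=G B-=G" on the JPEG-LS entry suggests that the green channel was subtracted from the red and blue channels before coding, a common reversible way to decorrelate RGB data. The following is only a minimal sketch of such a transform, assuming 8-bit samples, interleaved RGB byte order and modulo-256 arithmetic; the function names are illustrative and not taken from any JPEG-LS tool.

```python
def subtract_green(pixels: bytes) -> bytes:
    """Forward transform on interleaved RGB bytes: R -= G, B -= G (mod 256)."""
    out = bytearray(pixels)
    for i in range(0, len(out) - 2, 3):
        g = out[i + 1]
        out[i] = (out[i] - g) % 256          # red minus green
        out[i + 2] = (out[i + 2] - g) % 256  # blue minus green
    return bytes(out)


def add_green(pixels: bytes) -> bytes:
    """Inverse transform: R += G, B += G (mod 256), restoring the original."""
    out = bytearray(pixels)
    for i in range(0, len(out) - 2, 3):
        g = out[i + 1]
        out[i] = (out[i] + g) % 256
        out[i + 2] = (out[i + 2] + g) % 256
    return bytes(out)


# Two pixels: a nearly grey one and a strongly green one.
original = bytes([120, 118, 121, 10, 200, 30])
transformed = subtract_green(original)
restored = add_green(transformed)
```

Because the arithmetic wraps around modulo 256, no information is lost and the inverse restores every byte exactly. For nearly grey pixels the red and blue residuals become small numbers, which a lossless coder can usually store more cheaply.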
|
|
|
|
|
|
|
|
CD image |
a CD image of a PC game from 1993 that is now free (some videos, vector-based graphics) and has no copy protection. |
|
* This image does not infringe copyright, as this CD is from March 1993 |
|
and contains only an old, free classic game. I am not, nor will I ever be, |
|
a person who wants to violate or ignore copyright law. This image |
|
was made solely for the purpose of compression benchmarking. |
|
|
|
|
|
|
CrossPlatform |
a well-chosen freeware collection of
compiled binaries (executables) for different platforms: |
|
Windows XP 32 Bit |
21% |
|
|
OpenSUSE 10.2 x86 64 |
21% |
|
|
Apple MacOS X |
16% |
|
|
BeOS System |
11% |
|
|
Vista x86 64 Bit |
10% |
|
|
SymbianOS 7.x, 9.x |
8% |
|
|
Solaris 10 |
6% |
|
|
PocketPC/Win CE |
4% |
|
|
ZETA OS |
3% |
|
|
PalmOS |
1% |
|
|
|
|
|
|
|
|
|
D.N.A. |
Human Genome Project in text format, chromosome 7 of 24 in the chromosome sequence |
|
just a sample of the 400 GB of data mapped by the National Center for Biotechnology Information |
|
|
|
|
|
|
|
|
Drivers XP |
the \Windows\Driver
Cache\i386\driver.cab extracted (taken from WinXP Home + SP2), 4653 files;
.exe, .dll, .icm, .ppd … |
|
|
|
|
|
|
|
|
Encyclopedia |
XML dumps of the free online encyclopedia www.wikipedia.org |
|
there are 10 dumps in the world's most spoken languages: |
|
|
|
|
|
|
arwiki-20090209-pages-articles.xml |
100.000.000 |
bytes in Arabic |
|
dewiki-20090311-pages-articles.xml |
100.000.000 |
bytes in German |
|
enwiki-20090306-pages-articles.xml |
100.000.000 |
bytes in English |
|
eswiki-20090124-pages-articles.xml |
100.000.000 |
bytes in Spanish |
|
frwiki-20090224-pages-articles.xml |
100.000.000 |
bytes in French |
|
hiwiki-20090201-pages-articles.xml |
100.000.000 |
bytes in Hindi |
|
ptwiki-20090128-pages-articles.xml |
100.000.000 |
bytes in Portuguese |
|
ruwiki-20081228-pages-articles.xml |
100.000.000 |
bytes in Russian |
|
trwiki-20090207-pages-articles.xml |
100.000.000 |
bytes in Turkish |
|
zhwiki-20090116-pages-articles.xml |
100.000.000 |
bytes in Chinese (Mandarin) |
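All ten samples are listed at exactly 100.000.000 bytes, which suggests that each dump was cut to a fixed length. The sketch below shows one way such a sample could be produced, assuming that the first 100.000.000 bytes of each dump are kept; the chart does not say which part of a dump was used, and the function name and paths are illustrative.

```python
def take_prefix(src_path, dst_path, limit=100_000_000, chunk_size=1 << 20):
    """Copy at most `limit` bytes from src_path to dst_path; return bytes written."""
    remaining = limit
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while remaining > 0:
            block = src.read(min(chunk_size, remaining))
            if not block:  # source is shorter than the limit
                break
            dst.write(block)
            remaining -= len(block)
    return limit - remaining
```

A call such as `take_prefix("enwiki-20090306-pages-articles.xml", "enwiki.sample")` would then yield a 100.000.000 byte sample. A byte-exact cut like this can split a multi-byte UTF-8 character or an XML element at the end of the file, which is harmless for a compression test but means the sample is not well-formed XML.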
|
|
|
|
|
|
|
|
FreeDB 2002 |
a CDDB-like, text-oriented database of nearly all released audio CDs (as of 23 October 2002), 2296 files |
|
blues folder: |
29.898 |
entries |
|
classical folder: |
65.413 |
entries |
|
country folder: |
19.051 |
entries |
|
data folder: |
4.128 |
entries |
|
folk folder: |
42.301 |
entries |
|
jazz folder: |
49.139 |
entries |
|
misc folder: |
227.094 |
entries |
|
newage folder: |
26.369 |
entries |
|
reggae folder: |
8.995 |
entries |
|
rock folder: |
245.832 |
entries |
|
soundtrack folder: |
23.165 |
entries |
|
all folders: |
741.385 |
entries |
|
|
|
|
|
|
|
|
Gutenberg |
a random selection of 409 gutenberg.org ebooks; these documents are plain text and differ in both language and |
|
character set encoding, which makes preprocessing difficult and goes beyond what most compression word dictionaries (.dic) cover |
|
|
|
|
|
|
|
|
Installer
Package |
a selection of about 25% InnoSetup/Nullsoft, 25% InstallShield, 25% Windows Installer/MSI and 25% WISE Installer/GZIP/ZIP SFX setups |
|
this selection simulates a typical collection of software installers as originally downloaded from the authors' websites. |
|
such data is often backed up in archives, and we want to see which archiver reduces it the most (while remaining lossless) |
|
|
|
|
INNO |
everestultimate500.exe |
9.752.064 |
bytes |
INNO |
petst_x64.exe |
12.627.856 |
bytes |
INNO |
stellarium-0.10.2.exe |
42.911.720 |
bytes |
INNO |
XnView-win-full-de.exe |
10.442.205 |
bytes |
IS |
182.08_geforce_winvista_64bit_international_whql.exe |
136.040.624 |
bytes |
IS |
ICQ 6.5.exe |
14.208.072 |
bytes |
NSIS |
gmx_multimessenger.exe |
12.590.864 |
bytes |
NSIS |
kis8.0.0.506de.exe |
43.120.232 |
bytes |
NSIS |
vlc-0.9.8a-win32.exe |
16.320.472 |
bytes |
WIN |
Ad-Aware 2008.exe |
19.153.264 |
bytes |
WIN |
TU2009TrialDE.exe |
17.361.664 |
bytes |
MSI |
POV-Ray for Windows v3.7 beta 31.exe |
12.398.080 |
bytes |
MSI |
QuickTime.msi |
27.953.664 |
bytes |
MSI |
thebat_pro_4-1-11.msi |
16.440.832 |
bytes |
MSI |
UpdateStar_GER.msi |
4.683.264 |
bytes |
MSI |
Virtual_PC_2007_Install.msi |
28.158.976 |
bytes |
MSI |
Windows Live Messenger 2009.msi |
24.961.024 |
bytes |
GZIP |
LINUX ati-driver-installer-9.2-x86.x86_64.run |
83.286.827 |
bytes |
WISE |
copernicagentbasic.exe |
3.546.360 |
bytes |
WISE |
WindowBlinds6_public.exe |
21.975.664 |
bytes |
WISE |
funpix_maker_24mb_d_en.exe |
24.985.888 |
bytes |
ZIP |
Vista Gadgets (8 files) |
2.593.515 |
bytes |
ZIP |
XnView-win-full.zip |
16.613.177 |
bytes |
ZIP SFX |
pdfmachine1218de.exe |
7.012.744 |
bytes |
|
TOTAL |
609.139.052 |
bytes |
|
|
|
|
|
|
|
|
Mobile |
877 files; designed to represent common user data on digital cameras, smartphones (those running Windows Mobile or Symbian OS), |
MP3 players and USB flash drives. |
[New 2008] |
|
|
|
It contains these file formats: AAC, AC3, AMR, GIF, JAR, JPG [EXIF], MP3, MP4, MOV [MotionJPEG], MPG, PDF, PNG, SIS, SWF |
|
All these files are already compressed and won't compress well unless special lossless recompression is applied. |
|
So this test set indicates how well an archiver can serve as a lossless backup solution for data from mobile devices. |
|
Backing up data from mobile devices has become more important than backing up personal computer data, because such devices |
|
are not always safe: they can be stolen or damaged. |
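The point about already compressed files can be illustrated with a small experiment. This is only a sketch using Python's general-purpose `zlib` module as a stand-in for an archiver; the word list and sizes are made up for the demonstration and are not part of the test set.

```python
import random
import zlib

# Build a compressible, text-like sample from a small made-up vocabulary.
rng = random.Random(0)
words = [b"logo", b"ringtone", b"photo", b"video",
         b"clip", b"backup", b"phone", b"camera"]
raw = b" ".join(rng.choice(words) for _ in range(50_000))

once = zlib.compress(raw, 9)    # first pass removes most redundancy
twice = zlib.compress(once, 9)  # second pass has almost nothing left to remove

print(f"raw:              {len(raw):>8} bytes")
print(f"compressed once:  {len(once):>8} bytes")
print(f"compressed twice: {len(twice):>8} bytes")
```

The second pass stays at roughly the size of the first, because the output of a compressor looks nearly random to another general-purpose compressor. This is why formats such as MP3 or JPG only shrink further when an archiver understands the format and recodes its contents losslessly.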
|
|
|
|
|
|
MOV / MPG |
21% |
|
|
MP3 |
19% |
|
|
AAC |
14% |
|
|
JPG [EXIF] |
13% |
|
|
3GP & MP4 |
13% |
|
|
GIF |
6% |
|
|
JAR & SIS |
6% |
|
|
PDF & SWF |
3% |
|
|
AC3 |
2% |
|
|
AMR |
2% |
|
|
|
|
|
|
|
|
|
Modules |
94 Amiga sound files from the 80s; file types: .mod, .s3m, .xm; these formats were the music standard decades ago; |
|
and 24 sound modules from present-day trackers (Renoise, MED Soundstudio, Skale Tracker, MO3, Unreal) |
|
|
|
|
|
|
|
|
Nokia |
12.632 monochrome operator logos and
2.926 monotone ringtones for Nokia LogoManager; some 2 color bitmaps included |
|
|
|
|
|
|
|
|
Office files |
a collection of 1.081 files in .doc, .xls, .htm, .pdf, .mp3, .eml/.msg, .log, .mht, .txt and .jpg formats; |
|
since 2007 it also includes precompressed files like .wmv, .wma (DRM protected), .ogg, .mp4, .avi (MPEG4), .j2k, .pspimage, |
|
.chm, .exe (UPX-compressed), .flv, .jar (Java applets), .mpeg, .docx, .xlsx, .swf, .sxf, .pps/.ppt, .psy, .pcx, .tif, .tga, .gif (and |
|
.gif animations), .c4d (3D files from CineBench 9.5), .dic (PAQ8 dictionaries).. Some hints: some raster images included here |
|
contain identical data: the same bitmap was stored in compressed .tif, .pcx, .tga.. formats. Microsoft's .mht (Web Archive) format contains |
|
dozens of JPEGs, but they are MIME encoded in every .mht file. This test set also contains some visual style files (such as |
|
the WindowBlinds format), but those images are inside renamed .zip archives. This test set has the highest difficulty level! |
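The remark about MIME-encoded JPEGs can be made concrete. The sketch below assumes the JPEGs inside an .mht file are stored as Base64 text, which is the usual MIME transfer encoding for binary parts; the random payload is only a stand-in for real JPEG data.

```python
import base64
import random
import zlib

rng = random.Random(1)
payload = bytes(rng.getrandbits(8) for _ in range(30_000))  # stand-in for JPEG bytes

encoded = base64.encodebytes(payload)  # MIME-style Base64 with line breaks
decoded = base64.decodebytes(encoded)  # what a format-aware archiver would recover

packed_text = zlib.compress(encoded, 9)

print(f"payload:            {len(payload)} bytes")
print(f"Base64 text:        {len(encoded)} bytes")
print(f"zlib on Base64:     {len(packed_text)} bytes")
print(f"decoding restores payload: {decoded == payload}")
```

Base64 inflates the data by about one third. A general-purpose compressor can win back most of that inflation but never sees the JPEG structure underneath, so an archiver has to decode the MIME part first before any JPEG-specific recompression can be applied.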
|
|
|
|
|
|
|
|
Savegames |
1.255 savegame files of 90's games
(XCOM 1+2+3, Keen, Nightmare 3D, Command & Conquer 2, Crystal Caves,
Raptor..) |
|
including savegames of recent games (Unreal, AOE2, DN3D, Grim Fandango, Half-Life, Heretic 2, Hexen 2, Splinter Cell..) |
|
|
|
|
|
|
|
|
Sourcecodes |
a mix of C++, C, Pascal, Java and Basic source code from 36 programs, including open-source (sourceforge.net) projects |
|
like Lazarus, GIMP, 7-Zip, Stellarium; also contains 9% PNG images. 11.650 files in 681 directories |
|
|
|
|
|
|
|
|
Wavesounds |
contains 16 files in uncompressed
PCM .WAV format (16 Bit @ 44.1 kHz stereo) |
[New 2007] |
all files were created using Poikosoft's Easy CD-DA Extractor 11.5.3.1 and do not contain any tags |
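For uncompressed PCM audio the file size follows directly from the format stated above. The sketch below works through the arithmetic, assuming a plain 44-byte RIFF/WAVE header with no extra chunks (an assumption, although it fits the statement that the files carry no tags); the constant and function names are illustrative.

```python
SAMPLE_RATE = 44_100   # samples per second and channel
CHANNELS = 2           # stereo
BYTES_PER_SAMPLE = 2   # 16 bit
HEADER_BYTES = 44      # assumption: canonical WAV header without extra chunks

BYTES_PER_SECOND = SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE  # 176400


def estimated_size(seconds: float) -> int:
    """Approximate .WAV size in bytes for a given playing time."""
    return HEADER_BYTES + round(seconds * BYTES_PER_SECOND)


def duration_seconds(file_size: int) -> float:
    """Playing time in seconds implied by a .WAV file size."""
    return (file_size - HEADER_BYTES) / BYTES_PER_SECOND


# First file of this set: listed as 4:52 (292sec) and 51.614.684 bytes.
listed_size = 51_614_684
print(f"{duration_seconds(listed_size):.1f} s")
```

The listed size corresponds to about 292.6 seconds, which agrees with the 4:52 shown in the table once the fraction of a second is dropped.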
|
|
|
|
|
4:52 (292sec) |
51.614.684 |
ABBA • The Winner Takes It All (1980) |
|
2:59 (179sec) |
31.620.332 |
Ben E. King • Stand by Me (1961) |
|
2:40 (160sec) |
28.306.988 |
Desmond Dekker • You Can Get It If You Really
Want (1970) |
|
3:55 (235sec) |
41.477.564 |
Enya • Marble Halls (1997) |
|
3:41 (221sec) |
39.034.412 |
Hans Zimmer • Mumm Theme (Commercial) (1989) |
|
4:10 (250sec) |
44.165.900 |
Harajuku • Phantom Of The Opera (Techno Remix)
(1994) |
|
3:33 (213sec) |
37.686.140 |
J.S. Bach • Air, Suite No. 3 in D (1974) |
|
3:32 (212sec) |
37.486.124 |
Jan Hammer • Crockett's Theme (1991) |
|
7:52 (472sec) |
83.385.500 |
Kenny G • Auld Lang Syne (Millennium Mix) |
|
4:20 (260sec) |
45.899.324 |
Laut Sprecher • Herzschlag (2000) |
|
5:56 (356sec) |
62.944.268 |
Leonard Bernstein • One Hand, One Heart |
|
4:18 (258sec) |
45.640.604 |
Queen • I Want to Break Free (1984) |
|
3:48 (228sec) |
40.071.068 |
Roy Orbison • I Drove All Night (1992) |
|
3:45 (225sec) |
39.819.404 |
Sash! • Adelante (Original 7'') (1999) |
|
3:36 (216sec) |
38.246.444 |
Shakira • Pure Intuition |
|
3:19 (199sec) |
35.154.476 |
Traveling Wilburys • Handle With Care (1988) |
|
|
702.553.232 |
TOTAL |
|
OFR 4.600 -max -exp. |
362.098.239 |
|
|
LA 0.4b high |
380.885.483 |
|
|
TAK 1.0.3 -p5m |
388.164.364 |
|
|
Monkey's Audio 4 (insane) |
389.592.620 |
|
|
TAK 1.1.1 -p4m |
392.984.270 |
|
|
WavPack 4.5 -hx6 |
405.005.524 |
|
|
FLAC 1.2.0 option 8 |
418.609.157 |
|
|
TTA 3.4.0 |
419.011.428 |
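To make the result list easier to compare, the compressed sizes can be expressed as a share of the 702.553.232 byte total. The snippet below is a small sketch that parses the dot-separated figures exactly as printed in the table; the helper name `parse_size` is illustrative.

```python
def parse_size(text: str) -> int:
    """Convert a size written with dots as thousands separators to an int."""
    return int(text.replace(".", ""))


total = parse_size("702.553.232")

results = {
    "OFR 4.600 -max -exp.": "362.098.239",
    "LA 0.4b high": "380.885.483",
    "TAK 1.0.3 -p5m": "388.164.364",
    "Monkey's Audio 4 (insane)": "389.592.620",
    "TAK 1.1.1 -p4m": "392.984.270",
    "WavPack 4.5 -hx6": "405.005.524",
    "FLAC 1.2.0 option 8": "418.609.157",
    "TTA 3.4.0": "419.011.428",
}

# Ratio = compressed size / original size; smaller is better.
ratios = {name: parse_size(size) / total for name, size in results.items()}

for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name:28s} {ratio:6.1%}")

best = min(ratios, key=ratios.get)
```

On these figures the whole field lies between roughly 52% and 60% of the original size, so the gap between the strongest and the weakest lossless audio codec in this list is about eight percentage points.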
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Audio
Testsets |
|
|
|
|
|
|
|
WAV [CD
Quality] |
a 16 Bit stereo 44.1 kHz techno song |
|
|
WAV [Mono] |
a 16 Bit mono 44.1 kHz record of:
Laurel & Hardy - Way Down South |
WAV [Surround
5.1] |
a song with 5 channels in 48 kHz @ 24 Bit |
|
|
WAV [IEEE
Float] |
a 32 bit stereo 44.1 kHz song |
|
|
WAV [SACD] |
a 24 bit stereo 192 kHz song |
|
|
WAV [AudioCD] |
an image of the Best Love Classics
Vol. 4 from 1995 |
|
|
|
|
|
|
|
|
|
Executable
Testsets |
|
|
|
|
|
|
EXE [MS-DOS] |
a file scanner from 1996 |
|
|
EXE [PE32] |
a file scanner from 2002 |
|
|
EXE [.NET] |
a file scanner from 2005 |
|
|
EXE [PE64] |
a benchmark program from 2005 |
|
|
EXE [LINUX
ELF] |
the linux variant of UPX |
|
|
EXE [ARM] |
a pocket PC music player |
|
|
|
|
|
|
|
|
|
|
Image
(BMP) Testsets |
|
|
|
|
|
|
BMP [OCR] |
a true color scanned page of a
children's newspaper |
|
BMP
[Greyscale] |
a greyscale picture of my ancestors |
|
|
BMP [BiLevel
FAX] |
a 2 color profit & loss analysis |
|
|
BMP [DiCOM] |
a greyscale medical image |
|
|
BMP
[Panorama] |
a true color image of Rome |
|
|
BMP
[Landscape] |
a true color image of Sydney |
|
|
|
|
|
|
|
|
|
|
Special
Treatment Testsets |
|
|
|
|
|
|
JPEG |
90% Quality, Scanned Newspaper |
|
|
MPEG |
Video clip of a German female musician |
|
|
MP3 |
Techno styled song |
|
|
PDF |
a medical encyclopedia |
|
|
CHM |
Paintshop Pro 8 Trial Manual |
|
|
ZIP |
sound & video codec pack |
|
|
GIF |
image of a tiger-colored cat |
|
|
PNG |
image of a tiger-colored cat |
|
|
TIFF |
image of my parents' house |
|
|
SIS |
a game for a SymbianOS cellphone |
|
|
SWF |
a collection of 8 flash 7.x games |
|
|
InnoSetup |
a software installation package |
|
|
InstallShield
CAB |
a software installation package |
|
|
InstallShield
MSI |
a software installation package |
|
|
NSIS |
a software installation package |
|
|
MS Compress |
a software installation package |
|
|
WISE |
a software installation package |
|
|
AVI |
an uncompressed movie |
|
|
WMV |
a private webcam video |
|
|
RM |
a private webcam video |
|
|
DIVX |
a private webcam video |
|
|
MOV |
a private webcam video |
|
|
MP4 |
a private webcam video |
|
|
|
|
|
|
|
|
|
|
Language
Testsets |
|
|
|
|
|
|
AFRIKAANS |
www.unboundbible.org |
|
New Testament from 1953 |
ALBANIAN |
www.unboundbible.org |
|
New Testament |
ARABIC |
www.unboundbible.org |
|
New Testament by Smith & Van Dyke |
CHINESE |
www.unboundbible.org |
|
New Testament NCV (Traditional) |
CROATIAN |
www.unboundbible.org |
|
New Testament |
CZECH |
www.unboundbible.org |
|
New Testament BKR |
DANISH |
www.unboundbible.org |
|
New Testament |
DUTCH |
www.unboundbible.org |
|
New Testament Staten Vertaling |
ENGLISH |
www.unboundbible.org |
|
New Testament World English Bible |
ESPERANTO |
www.unboundbible.org |
|
New Testament |
FINNISH |
www.unboundbible.org |
|
New Testament from 1776 |
FRENCH |
www.unboundbible.org |
|
New Testament Darby 1991 |
GERMAN |
www.unboundbible.org |
|
New Testament Luther 1912 |
GREEK |
www.unboundbible.org |
|
New Testament Byzantine/Majority Text 2000
Parsed |
HEBREW |
www.unboundbible.org |
|
Old Testament Westminster Leningrad Codex |
HUNGARIAN |
www.unboundbible.org |
|
New Testament Karoli |
ITALIAN |
www.unboundbible.org |
|
Giovanni Diodati Bible 1649 |
KOREAN |
www.unboundbible.org |
|
New Testament |
LATIN |
www.unboundbible.org |
|
New Testament Vulgata Clementina |
LITHUANIAN |
www.unboundbible.org |
|
New Testament |
MAORI |
www.unboundbible.org |
|
New Testament |
NORWEGIAN |
www.unboundbible.org |
|
Det Norsk Bibelselskap 1930 |
PORTUGUESE |
www.unboundbible.org |
|
Almeida Atualizada |
ROMANIAN |
www.unboundbible.org |
|
Cornilescu |
RUSSIAN |
www.unboundbible.org |
|
Synodal Translation 1876 |
SPANISH |
www.unboundbible.org |
|
Sagradas Escrituras 1569 |
SWEDISH |
www.unboundbible.org |
|
New Testament 1917 |
TAGALOG |
www.unboundbible.org |
|
Ang Dating Biblia 1905 |
THAI |
www.unboundbible.org |
|
King James Version Translated |
TURKISH |
www.unboundbible.org |
|
New Testament |
VIETNAMESE |
www.unboundbible.org |
|
New Testament 1934 |
XHOSA |
www.unboundbible.org |
|
New Testament |
|
|
|
|
|
|
|
|
Earlier
Testsets that are not used anymore |
|
|
|
|
|
|
CALGARY |
well-known test set of 18 files;
source code, white papers, library inventory… |
CANTERBURY |
well-known test set of 11 files;
source code, epics |
|
DIRECTX GAME |
a critical directx game with DelphiX
sounds and graphics (.dxg), many .jpg & .lbm files; large exe file, 149
files |
MODULES |
94 Amiga sound files from the 80s; file types: .mod, .s3m, .xm; these formats were the music standard decades ago |
SAVEGAMES |
784 savegame files of 90's games
(XCOM 1+2+3, Keen, Nightmare 3D, Command & Conquer 2, Crystal Caves,
Raptor..) |
TEXT DATABASE |
an .ini file for MediaPlayer that
includes a text-based database of 244.569 audio cd's (Title, tracklist, play
time …) |
CoDEC's |
a collection of dozens of audio & video codecs in .exe, .dll, .ax, .ocx, .acm, .drv, .cpl, and .gif, .htm, .bmp files; total of 265 files |
NOKIA |
12.632 monochrome operator logos and
2.926 monotone ringtones for Nokia LogoManager; some 2 color bitmaps included |
Bitmap 2007 |
were replaced because they were created from a lossy source and contained artifacts |
Waveforms 2007 |
were replaced because they were created from a lossy source and contained artifacts |
Encyclopedia
2001 |
an encyclopedia of computer history in German; some .bmp, .wav, .dll and .al7 (dictionary with an exe header) files; 78 files |
|
the .al7 file (287 MB) contains 1.390 bitmaps (111 MB) and thousands of text files (176 MB) with full headers, TAR-like |
Fonts 2001 |
a total of 114 TrueType fonts and
system fonts included in WinXP; .ttf, .fon; |
|
those binary files are needed to
display text in Windows based environments |
|
|
|
|
|
|
|
|
What the hell is the point of compressor benchmarks? |
|
|
|
|
|
|
Testing compression programs has a long history, and some very well-known people have pursued it with a single idea in mind: to find the best compression program currently available. |
Back in 1985 the first compressor was born: the SQPC file squeezer. A whole population of compressors has evolved from it since. |
Every programmer claims that his or her program is the best around, but how do end users tell the good compressors from the bad? The answer should be: |
by reading charts like this one. But most of the world's end users do not. They believe the advertising and buy the first product they notice… |
Yet when buying a new car, everyone compares the available models… |
|
In my opinion it is very interesting that some compressors released this year do not compress better than those released decades ago, |
and, on the other hand, that a few compressors released some years ago compress nearly as well as those released today… |
By running a compression challenge we can perhaps stimulate further improvements, or at least let some authors of compression programs know |
that the old compression codecs are outdated and should be retired. |
|
It is rumored that every couple of years, on some beautiful sunny morning, a genius creates a new compression algorithm or improves an existing one… |
|
|
|
|
|
|
|
|
Why this
test? |
|
|
|
|
|
|
|
There are a few benchmarks out there, and their testers have done and still do brilliant work in testing archivers and publishing results. |
But those results show compression capabilities on few and small files only. This Squeeze Chart was designed to show how well compressors |
handle many large files, thus reflecting the strength of solid archiving. |
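The effect of solid archiving can be shown with a small experiment. This is only a sketch: Python's `zlib` stands in for an archiver, and the "files" are made-up byte strings that share most of their content, small enough that several of them fit into zlib's 32 KB window.

```python
import random
import zlib

rng = random.Random(42)
# Content that every file has in common (random, so it is incompressible alone).
shared = bytes(rng.getrandbits(8) for _ in range(4_000))
files = [shared + f"unique tail {i}".encode() for i in range(20)]

# Non-solid: every file is compressed on its own.
separate = sum(len(zlib.compress(f, 9)) for f in files)

# Solid: all files are joined into one stream before compression.
solid = len(zlib.compress(b"".join(files), 9))

print(f"separate: {separate} bytes")
print(f"solid:    {solid} bytes")
```

Compressed one by one, the shared content is stored twenty times. In the solid stream it is stored once and every later copy becomes a short back-reference, so the solid result is a small fraction of the separate total. Archivers with larger windows or dictionaries can exploit such redundancy across files that lie much further apart.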
|
|
|
|
|
Why
is one testset not enough? |
|
|
|
|
|
|
Since the beginning of personal computing in the 80s, mankind has discovered many fields in which computers can aid and serve. |
So nowadays we use computers for text processing, presentations, profit calculations, messaging, gaming, seeking information, reading books, |
listening to music, image editing and archiving… |
|
|
All these different fields of use have led to different file formats for storing data. |
So a file compressor has to understand these different file formats. In the main, most files consist of four elements (in plain or blended form): |
executables, texts, bitmaps and wavetables. Each element needs to be recognized by the archiver and requires a specific algorithm to compress it. |
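How an archiver might recognize these four elements can be sketched with a check of the first bytes of a file. The signatures used here are the well-known ones for DOS/Windows executables (`MZ`), ELF binaries, BMP and binary PGM/PPM images, and RIFF/WAVE audio; the function name and the UTF-8 fallback for text are illustrative choices, not a description of any particular archiver.

```python
def classify(head: bytes) -> str:
    """Guess the element type from the first bytes of a file."""
    if head.startswith(b"MZ") or head.startswith(b"\x7fELF"):
        return "executable"
    if head.startswith(b"BM") or head[:2] in (b"P5", b"P6"):
        return "bitmap"
    if head.startswith(b"RIFF") and head[8:12] == b"WAVE":
        return "wave"
    try:
        head.decode("utf-8")  # crude fallback: valid UTF-8 counts as text
    except UnicodeDecodeError:
        return "unknown"
    return "text"


print(classify(b"MZ\x90\x00"))                    # executable
print(classify(b"P6\n640 480\n255\n"))            # bitmap
print(classify(b"RIFF\x24\x00\x00\x00WAVEfmt "))  # wave
print(classify(b"Hello, world\n"))                # text
```

A check this short is easily fooled, for example by a text file that happens to begin with "BM". Real archivers validate more of the header, and they also have to find such elements embedded inside container files, which a look at the first bytes cannot do.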
|
|
|
|
Can
compression still be improved? |
|
|
|
|
|
|
Improvement means working out different algorithms for the four elements, or modifying (preprocessing) those elements before compression takes place. |
The LZ-based algorithms are fast, the PPM-based algorithms are slow but do well on text, and the arithmetic algorithms are the slowest but superior on binary data. |
The best approach would be a hybrid algorithm that has the speed of LZ, the text-sorting power of PPM and the context-comparing precision of arithmetic algorithms. |
In fact, the first step was taken with the LZMA algorithm of Igor Pavlov's 7-Zip, which combines LZ-based speed with the strength of arithmetic coding. |
And only a few compressors can hold their own against it: WinRK, SLIM and the PAQ6 family. |
Maybe one day we will use algorithms that combine PPM's text power with arithmetic precision (PPMAri), and, for complex bitmap |
and wave compression, some pixel / sample weavers or replacers that will conquer the market |
and convince users that lossy algorithms such as MP3 or JPG are bad: human memories fade, so our digital memories should not lose information as well. |
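The trade-off between speed and strength described above can be tried out with the codecs that ship with Python. This is only a rough sketch: `zlib` is LZ77 with Huffman coding, `bz2` is a block-sorting (BWT) codec, and `lzma` is the LZMA algorithm mentioned above. No PPM or PAQ implementation is part of the standard library, so those families are not covered, and the made-up sample text is no substitute for the test sets of this chart.

```python
import bz2
import lzma
import random
import time
import zlib

# Made-up, text-like sample data; real results depend heavily on the input.
rng = random.Random(7)
vocab = ["archive", "codec", "bitmap", "sample", "squeeze",
         "chart", "solid", "context", "model", "entropy"]
sample = " ".join(rng.choice(vocab) for _ in range(100_000)).encode()

codecs = {
    "zlib (LZ77 + Huffman)": (zlib.compress, zlib.decompress),
    "bz2 (block sorting)": (bz2.compress, bz2.decompress),
    "lzma (LZ + range coder)": (lzma.compress, lzma.decompress),
}

results = {}
for name, (compress, decompress) in codecs.items():
    start = time.perf_counter()
    packed = compress(sample)
    elapsed = time.perf_counter() - start
    assert decompress(packed) == sample  # every codec must be lossless
    results[name] = (len(packed), elapsed)
    print(f"{name:26s} {len(packed):>8} bytes  {elapsed:.3f} s")
```

The sizes and timings printed here vary with the data and the machine, which is exactly why a chart with many different test sets is more informative than a single measurement.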
|
|
|
|