All the secrets of compression in GNU / Linux

Compression pipes

We return to a familiar situation that many advanced GNU / Linux users see as an advantage: the large number of alternatives available. For less experienced users this can be a problem, because it is hard to know which one to choose, but having more options and flexibility is never a bad thing, quite the opposite. In this article we will look at the compression and decompression algorithms and tools available on our favorite platform, so that you can see them clearly instead of as a confusing mess, and work out which option is best in your case.

Beyond widely used tools such as tar, which creates packages that can optionally be compressed (the famous tarballs we have covered in LxA on many occasions), there are also variants of everyday tools that work on compressed files. For example, bzfgrep is a variant of grep that searches inside compressed files, and bzless and bzmore are the equivalents of less and more. To see them all, just take a look at the output of the following command:

apropos compress
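
As a quick illustration of these variants, here is a minimal sketch that searches inside a compressed file without unpacking it first. It uses zgrep, the gzip counterpart of the bz* tools mentioned above, and assumes it is installed (it ships with gzip on most distributions). The file name notes.txt is hypothetical and is created by the script itself.

```shell
#!/bin/sh
# Search inside a compressed file without unpacking it first.
# Assumption: zgrep is installed; notes.txt is a hypothetical sample file.

workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'alpha\nbeta\ngamma\n' > notes.txt
gzip notes.txt                 # replaces notes.txt with notes.txt.gz

if command -v zgrep >/dev/null 2>&1; then
    match=$(zgrep 'beta' notes.txt.gz)
else
    # Equivalent fallback: decompress to stdout and pipe into grep.
    match=$(gzip -dc notes.txt.gz | grep 'beta')
fi
echo "$match"
```

The same pattern should apply to bzgrep with .bz2 files and xzgrep with .xz files, where those tools are available.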

Algorithms and tests:

Among the lossless compression algorithms available in Linux to compress and decompress data, we have a lot of options. To find out how long it takes to compress or decompress with one algorithm or another, I suggest you run some tests yourself. You can use the time command for that, which reports the time taken by the compression or decompression process. For example, if you are going to use the zip tool to compress a file called prueba:

time zip prueba.zip prueba

That will print the time used. If you also want to compare the size of the generated files, you can compress the same file with different algorithms and tools and, once all the compressed files are in one directory, check the size of each one with a simple listing command:

ls -l
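
The test described above can be sketched as a small script. This is only a rough comparison under stated assumptions: it relies on GNU date for nanosecond timestamps (%N), it generates its own hypothetical sample.txt, and it skips any tool that is not installed. Real results depend heavily on the kind of data you compress.

```shell
#!/bin/sh
# Compare compressed sizes and rough timings for several compressors.
# Assumptions: GNU date (for %N); sample.txt is a hypothetical test file.

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a repetitive text file of about 1 MB so every tool has work to do.
i=0
while [ "$i" -lt 20000 ]; do
    echo "line $i: the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > sample.txt

original_size=$(wc -c < sample.txt)
echo "original: $original_size bytes"

for tool in gzip bzip2 xz; do
    # Skip tools that are not installed on this system.
    command -v "$tool" >/dev/null 2>&1 || { echo "$tool: not installed"; continue; }

    start=$(date +%s%N)
    # -c writes to stdout, so sample.txt is kept for the next tool.
    # The output is named after the tool (sample.gzip, ...) for simplicity.
    "$tool" -c sample.txt > "sample.$tool"
    end=$(date +%s%N)

    size=$(wc -c < "sample.$tool")
    echo "$tool: $size bytes, $(( (end - start) / 1000000 )) ms"
done
```

Running it on a file that resembles your real data (logs, source code, images) gives a far better idea than any generic benchmark.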

If you prefer, you can also make use of other tools to compare compressed files, for example with some variants of the diff tool:

xzdiff [options] file1 file2

lzdiff [options] file1 file2
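
A minimal sketch of how xzdiff can be used, assuming the xz package (which provides xzdiff) is installed. The files a.txt and b.txt are hypothetical and are created by the script.

```shell
#!/bin/sh
# Compare the contents of two compressed files with xzdiff.
# Assumption: xz and xzdiff are installed; a.txt and b.txt are hypothetical.

workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'one\ntwo\nthree\n' > a.txt
printf 'one\n2\nthree\n'   > b.txt

if command -v xz >/dev/null 2>&1 && command -v xzdiff >/dev/null 2>&1; then
    xz a.txt b.txt             # produces a.txt.xz and b.txt.xz
    # xzdiff decompresses both files on the fly and runs diff on the result.
    # Like diff, it exits with status 1 when the contents differ.
    status=0
    xzdiff a.txt.xz b.txt.xz || status=$?
    echo "xzdiff exit status: $status"
else
    status=skipped
    echo "xz/xzdiff not installed, skipping"
fi
```

Note that this compares the uncompressed contents, not the compressed bytes, so two archives of the same data made with different settings still count as equal.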

If you want to see graphs on the size and speed of the algorithms, you can visit this other link.

Compression tools:

As for the tools available, there are many. Some have a graphical interface, so newcomers only have to deal with a simple and intuitive GUI to compress and decompress, such as PeaZip or 7zip. PeaZip in particular can work with more than 180 formats. But if you are one of those who still like to work in the terminal, you have a large number of tools that you surely know:

  • zip and unzip: a good option if you want files that are portable to other operating systems, since tools for these files exist on Microsoft Windows, macOS, and others. For example, to compress a file named prueba and then decompress it (for a directory, add the -r option so that zip recurses into it):
zip prueba.zip prueba

unzip prueba.zip

  • gzip: the best choice if all you need is portability between Unix / Linux operating systems. The compression ratio is almost identical to zip, perhaps slightly better, so you won't see much difference in file size between the two. To decompress there are two options: the -d option, or the gunzip alias directly:
gzip prueba

gzip -d prueba.gz

gunzip prueba.gz

  • bzip2: like the previous one, this algorithm is very common on Unix / Linux operating systems, although it is slower than gzip at both compression and decompression. In return its files are usually somewhat smaller than gzip's, but the gain does not match what xz achieves, and xz typically also decompresses faster. That is why it is often recommended to skip bzip2 and choose gzip when speed matters or xz when size matters. Still, everything depends a bit on the type of file you are trying to compress. For example:
bzip2 prueba

bzip2 -d prueba.bz2

  • xz: the preferred format for large files, as it generally offers the best compression ratios of the tools listed here, although it takes longer to compress and decompression is also slower than with gzip. It is considerably newer than the previous ones, so you may come across older distros or old Unix systems that lack a tool for it. Examples:
xz prueba

xz -d prueba.xz

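  • unrar and rar: we can also work with the RAR format on Linux thanks to these tools, although it is not as popular on *nix systems as the previous ones. Note that unrar e extracts everything into the current directory, while unrar x keeps the stored directory structure. In this case we can use: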
rar a prueba.rar prueba

unrar e prueba.rar

  • compress and uncompress: although compress is falling out of use and is not as popular as the previous ones, I would not like to overlook it. It produces files with a .Z extension using a modified Lempel-Ziv algorithm (LZW). For instance:
compress -v prueba

uncompress prueba.Z
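
One detail worth knowing about the commands above: gzip, bzip2 and xz replace the original file with the compressed one by default. A minimal sketch of a round trip that keeps the original, checks the compressed file, and restores it, using gzip and a hypothetical report.txt created by the script:

```shell
#!/bin/sh
# Round trip: compress while keeping the original, verify, then restore.
# Assumption: report.txt is a hypothetical sample file created here.

workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'some data worth keeping\n' > report.txt

# By default gzip replaces report.txt with report.txt.gz.
# Writing to stdout with -c leaves the original untouched.
gzip -c report.txt > report.txt.gz

# -t checks the integrity of the compressed file without extracting it.
gzip -t report.txt.gz && echo "archive OK"

# Decompress to a new name and confirm the bytes match the original.
gzip -dc report.txt.gz > restored.txt
cmp report.txt restored.txt && echo "round trip OK"
```

bzip2 and xz accept the same -c, -d and -t options, so the script should work with either of them by swapping the command name and the extension.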

If you want to work directly with the tar tool, you can also pack and compress files in a single step, as well as unpack and decompress them. In this case we pass the option for the algorithm directly to tar. First of all, you should know that the c option creates a package and the x option extracts it. For example:

tar czvf prueba.tar.gz prueba

tar xzvf prueba.tar.gz

As you can see, we have used the options zvf: z selects the compression algorithm (in this case gzip), v enables verbose mode, which reports what tar is doing, and f indicates the archive file to work with. If we replace z with the letter or option for another algorithm, we change the type of compression applied to the tarball:

Option   Algorithm   Extension
z        gzip        .tar.gz
j        bzip2       .tar.bz2
J        xz          .tar.xz
--lzip   lzip        .tar.lz
--lzma   lzma        .tar.lzma
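
To tie the tar options together, here is a minimal sketch that packs a directory, lists the archive, and extracts it into another location. It assumes GNU tar (or bsdtar), which recognizes the compression format automatically when reading an archive. The directory name project is hypothetical and is created by the script.

```shell
#!/bin/sh
# Pack a directory with tar, list the archive, and extract it elsewhere.
# Assumptions: GNU tar or bsdtar; "project" is a hypothetical directory.

workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir -p project/docs
echo "hello"  > project/readme.txt
echo "manual" > project/docs/manual.txt

# c = create, z = gzip, f = archive name.
tar czf project.tar.gz project

# Swapping z for J selects xz; only try it when xz is installed.
if command -v xz >/dev/null 2>&1; then
    tar cJf project.tar.xz project
fi

# t = list contents without extracting.
tar tzf project.tar.gz

# x = extract; -C picks the target directory. No z/j/J is needed here
# because tar detects the compression format on its own when reading.
mkdir restored
tar xf project.tar.gz -C restored

# Verify that the extracted tree matches the original.
diff -r project restored/project && echo "trees match"
```

Listing an archive with t before extracting it is a cheap way to see where the files will land.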

* Of course, all the previous commands have interesting options that I invite you to discover using man, some of them essential, such as recursion into directories.

Do not forget to leave your comments...



  1.   Javier Martinez Echenique said

    I in particular use 7zip

  2.   Marcelo said

    You missed the 7zip. A very good option and FREE SOFTWARE.

  3.   Umberto said

    Excellent information, although I would have started by saying that it can also be compressed and decompressed graphically without any problem so that you do not see the "hornet" on duty that says that GNU / Linux is very difficult and everything has to be done on the console. NO, IT'S ANOTHER OPTION.