During compression, this program reformats raster files using a run-length-encoding (RLE) algorithm. Raster map layers which contain very little information (such as boundary, geology, soils and land use maps) can be greatly reduced in size. Some raster map layers are shrunk to roughly 1% of their original sizes. Raster map layers containing complex images such as elevation and photo or satellite images may increase slightly in size. GRASS uses a new compressed format, and all new raster files are now automatically stored in compressed form (see FORMATS below). GRASS programs can read both compressed and regular (uncompressed) file formats. This allows the use of whichever raster data format consumes less space.
As an example, the Spearfish data base raster map layer owner was originally a size of 26600 bytes. After it was compressed, the raster file became only 1249 bytes (25351 bytes smaller).
Raster files may be decompressed to return them to their original format, using the -u option of r.compress. If r.compress is asked to compress a raster file which is already compressed (or to decompress an already decompressed file), it simply informs the user of this and asks the user if he wishes to perform the reverse operation.
If the user simply types r.compress without specifying any map layer name(s) on the command line, r.compress will prompt the user for the names of the map layers to be compressed/decompressed, and ask whether these maps are to be compressed or decompressed. This program interface is the standard GRASS parser interface.
The decompressed raster file format matches the conceptual format. For example, a raster file with 1 byte cells that is 100 rows with 200 cells per row, consists of 20,000 bytes. Running the UNIX command ls -l on this file will show a size of 20,000. If the cells were 2 byte cells, the file would require 40,000 bytes. The map layer category values start with the upper left corner cell followed by the other cells along the northern boundary. The byte following the last byte of that first row is the first cell of the second row of category values (moving from left to right). There are no end-of-row markers or other syncing codes in the raster file. A cell header file (cellhd) is used to define how this string of bytes is broken up into rows of category values.
The compressed format is not so simple, but is quite elegant in its design. It not only requires less disk space to store the raster data, but often can result in faster execution of graphic and analysis programs since there is less disk I/O. There are two compressed formats: the pre-version 3.0 format (which GRASS programs can read but no longer produce), and the version 3.0 format (which is automatically used when new raster map layers are created).
Address array (long) - array (size of the number of rows + 1) of addresses pointing to the internal start of each row. Because each row may be a different size, this array is necessary to provide a mapping of the data.
Row by row, beginning at the northern edge of the data, a series of byte groups describes the data. The number of bytes in each group is the number of bytes per cell plus one. The first byte of each group gives a count (up to 255) of the number of cells that contain the category values given by the remaining bytes of the group.
The address array is the same.
The RLE format is the same as the pre-3.0 RLE, except that each row of data is preceded by a single byte containing the number of bytes per cell for the row, and if run-length-encoding the row would not require less space than non-run-length-encoding, then the row is not encoded.
These improvements give better compression than the pre-3.0 format in 99% of the raster data layers. The kinds of raster data layers which get bigger are those in which each row would be larger if compressed (e.g., imagery band files). But even in this case the raster data layer would only be larger by the size of the address array and the single byte preceding each row.