GRASS 4.2 Site Record Format Specification

Site files contain records describing punctual (point) information. Site files may contain only characters from the US-ASCII character set. Records are separated by a newline character (ASCII 0x0a). There are three types of records: comment records, header records, and data records. The format of each of these record types is described in the following sections.

Site Data Record Description

A site record in the GRASS Sites Format is divided into two parts, each with a different field separator. Part 1 contains location in 2 or more dimensions and part 2 optionally contains attribute information for this location. Both types of fields (and thus site records) are variable length.

Part 1 of a Site Record: Location

Part 1 of a site record gives information about location. The field separator in part 1 of the site record is a "pipe" (ASCII 0x7c) character. An escaped character is defined as one prefixed by a "backslash" (ASCII 0x5c). The last non-escaped pipe signifies the end of part 1. Any fields after it are considered attribute information.
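
As a rough illustration of the rule above, the following Python sketch splits one data record at its non-escaped pipes. The function name split_site_record is hypothetical and is not part of any GRASS library. The sketch assumes that a backslash escapes exactly the one character that follows it, and it leaves escape sequences untouched in its output.

```python
def split_site_record(record):
    """Split a data record into (location_fields, attribute_text).

    Hypothetical helper: every field terminated by a non-escaped pipe
    belongs to part 1; whatever follows the last such pipe is part 2.
    """
    fields = []
    current = []
    i = 0
    while i < len(record):
        ch = record[i]
        if ch == "\\" and i + 1 < len(record):
            # Keep the backslash and the escaped character together,
            # so an escaped pipe is never treated as a separator.
            current.append(record[i:i + 2])
            i += 2
            continue
        if ch == "|":
            # A non-escaped pipe closes the current part 1 field.
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    # Text after the last non-escaped pipe is the attribute part.
    return fields, "".join(current)
```

For the record 10.5|20.25|100|#3 %1.5 this returns the three location fields 10.5, 20.25, and 100, together with the attribute text "#3 %1.5".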

Each field in part 1 indicates a coordinate in some space. There must be at least two fields in part 1: the first describing a geographic easting and the second describing a geographic northing. These may be in either decimal or degrees-minutes-seconds format.

Additional fields in part 1 are optional but must be stored in decimal format. They should only be used to represent coordinate information about some space (e.g., elevation, time; depending upon how a space is defined).

Part 2 of a Site Record: Attributes

Part 2 contains attribute information for the location given in part 1. The field separator in part 2 of the site record is a "space" character (ASCII 0x20), except when the space character is contained in double quotes (ASCII 0x22). The three types of attributes are: category, decimal, and string. These attributes may be in any order. Each of these attributes has an associated identifier tag defining the type of attribute in a field: # (ASCII 0x23), % (ASCII 0x25), and @ (ASCII 0x40), for category, decimal, and string, respectively. No space character may immediately follow an identifier tag.

Category Attributes

Categories are a special kind of attribute. They are used to represent vector or raster categories when sites are transformed into these different data formats. There may be only one category field per record and it must be prefixed with a "pound" or "number" symbol (#). Categories must be integers.

Decimal Attributes

Decimal attributes include both integers and floating-point numbers. They are prefixed with a "percent" symbol (%). There may be zero, one, or more decimal attributes in a site record.

String Attributes

String attributes are fields that contain possibly non-numeric information and are prefixed with the "at" or "each" symbol (@). There may be zero, one, or more string attributes in a site record. String attributes may contain space (ASCII 0x20) characters if the entire attribute, not including the attribute tag (@), is enclosed in a pair of "double quotes" ("). String attributes may also contain double quotes if each is escaped by prefixing it with a "backslash" (\).

Default

If no identifier tag is prefixed (i.e., none of #, %, or @), the type of attribute defaults to string.
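
The attribute rules above (space-separated fields, double-quoted strings, backslash escapes, and the #, %, and @ tags with a string default) can be sketched in Python as follows. The names tokenize_attributes and parse_attributes are hypothetical, not GRASS functions. The sketch assumes that quotes and backslashes are dropped from the parsed values, and it does not handle edge cases such as an untagged quoted string that begins with a tag character.

```python
def tokenize_attributes(text):
    """Split part 2 on spaces, honoring double quotes and backslash escapes."""
    tokens, current = [], []
    in_quotes = False
    has_token = False
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            # An escaped character is taken literally (e.g. \" inside a string).
            current.append(text[i + 1])
            has_token = True
            i += 2
            continue
        if ch == '"':
            # Quotes delimit a string but are not part of its value.
            in_quotes = not in_quotes
            has_token = True
        elif ch == " " and not in_quotes:
            # A space outside quotes separates two attribute fields.
            if has_token:
                tokens.append("".join(current))
                current, has_token = [], False
        else:
            current.append(ch)
            has_token = True
        i += 1
    if has_token:
        tokens.append("".join(current))
    return tokens


def parse_attributes(text):
    """Return a list of (type, value) pairs for the attribute part."""
    attributes = []
    for token in tokenize_attributes(text):
        tag, value = token[:1], token[1:]
        if tag == "#":
            if any(kind == "category" for kind, _ in attributes):
                raise ValueError("only one category field is allowed per record")
            attributes.append(("category", int(value)))  # categories are integers
        elif tag == "%":
            attributes.append(("decimal", float(value)))
        elif tag == "@":
            attributes.append(("string", value))
        else:
            # No identifier tag: the attribute defaults to string.
            attributes.append(("string", token))
    return attributes
```

Under these assumptions, the attribute text #3 %1.5 @"oak tree" plain parses to one category (3), one decimal (1.5), and two strings ("oak tree" and "plain").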

Header and Comment Record Format

In addition to the data record format, the site file may contain comment lines (records containing a pound symbol, 0x23, in the first column) and header lines, both of which are optional. Header records must precede all data records while comment records may occur anywhere within a sites data file.

There are five types of header records: (1) name, (2) description, (3) timestamp, (4) label, and (5) format.

name
A name record contains the string "name|" beginning in column 1 and optionally specifies the name of the database file.
description
A description record contains the string "desc|" beginning in column 1 and optionally describes the database file (metadata).
timestamp
A timestamp record is a special type of metadata that contains the string "time|" beginning in column 1 and optionally gives a time and date associated with the entire sites file. GRASS timestamps may be a single date/time or a range (begin/end). Valid timestamp strings should be formatted using the routine G_format_timestamp, after creating a valid TimeStamp structure using G_set_timestamp or G_set_timestamp_range. Similar routines exist for reading (see: TimeStamp GISlib functions). The GRASS DateTime utility library may be used to easily and accurately perform DateTime arithmetic. A possible future upgrade would be to specify a particular format identifier tag to indicate a DateTime. Currently, to store a DateTime for each site record, you must specify it as a string and your application must know to expect a DateTime.
label
A label record describes what each dimension and attribute field in site data records represents. It contains the string "labels|" beginning in column 1 and optionally contains field descriptions. No special formatting is required since this record is for user convenience only.
format
A format record describes the format of site data records. It contains the string "form|" beginning in column 1 and a special sample data record beginning in column 6. The special sample data record is a site data record (as described above) containing only field separators and identifier tags (i.e., all data removed).
All header records are optional. If present, header records must occur before any data records in a site file.
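
To make the record types concrete, here is a minimal Python sketch that classifies a single record as a comment, header, or data record and checks that headers come before data. The names classify_record and headers_precede_data are hypothetical. The sketch assumes that any record which is neither a comment nor a recognized header is a data record, and it does not validate the record contents.

```python
# Header prefixes as given in the list above; each begins in column 1.
HEADER_PREFIXES = {
    "name|": "name",
    "desc|": "description",
    "time|": "timestamp",
    "labels|": "label",
    "form|": "format",
}


def classify_record(record):
    """Return (record_type, header_kind, body) for one record."""
    if record.startswith("#"):
        # A pound symbol in the first column marks a comment record.
        return ("comment", None, record[1:])
    for prefix, kind in HEADER_PREFIXES.items():
        if record.startswith(prefix):
            return ("header", kind, record[len(prefix):])
    # Assumption: everything else is treated as a data record.
    return ("data", None, record)


def headers_precede_data(records):
    """Check that no header record appears after the first data record."""
    seen_data = False
    for record in records:
        record_type = classify_record(record)[0]
        if record_type == "data":
            seen_data = True
        elif record_type == "header" and seen_data:
            return False
    # Comment records may occur anywhere, so they never fail the check.
    return True
```

A caller might run headers_precede_data over the lines of a file (split on the newline character) before parsing the data records.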

Darrell McCauley and Bill Brown