[name_field=name] [zone=value] [spheroid=name]
DESCRIPTION
This program reads a previously selected subset of STF1 or
PL94-171 U.S. Census Bureau demographic records (see m.in.stf1.tape),
and writes one of the following:
- GRASS site list (coordinates and a value).
- GRASS vector polygon attribute labels file (coordinates and a label value).
- Text report to stdout in one of two formats.
Any arithmetic expression combining constants with any of the more than
900 demographic (numeric) fields can be defined and
will be evaluated (by the Unix bc -l utility) to compute the
value for each input record.
The location coordinates for the polygon label point or the site location
are obtained from certain columns (269-287) in the input records,
making this program quite specific for the specified types of STF1 and
PL94-171 input file records. Use with other types of input data is not
advised.
OPTIONS
Flags:
- -d
- Output IDENTIFICATION SECTION to stdout (20 pages)
- -p
- Output STF1 MATRIX TABLE to stdout (30 pages)
- -l
- Output PL94-171 MATRIX TABLE to stdout (1 page)
- -u
- Output STF1 SUMLEV TABLE to stdout (4 pages)
- Note:
- Only the first flag given will be executed; the program exits after
sending one table to stdout. Other parameters are ignored.
Parameters:
- input_stf1=name
- Path/name of STF1 or PL94 input file
- out=name
- Type of output:
site | atts | table:Lxxx | - (stdout)
default: -
- att
- for results to vector map attribute file
- site
- for results to site list
- -
- for results to stdout; this is the default
- table
- for results in table form to stdout with
the column value(s) represented by the ':Lxxx:Lxxx' string
preceding the expression value instead of easting and northing.
Lxxx is in same choice of representation as column designation
in formula (see below).
- ef=name
- Path/name of text file with formula expression(s)
- formula=map=expression
- mapname=Computation with STF1A columns, constants and bc operators
(see below). mapname of vector or site map to make is required
in all cases (but ignored for '-' or 'table:' output modes).
- name_field=name
- field name for parsing stdin lines (-a to ignore)
default: -a
name is the keyword in the stdin stream preceding the number of an STF1 record
to read from the input_stf1 file. The number is a physical sequence
number, not a record identification number. This parameter is
appropriate only in special uses of this program. If '-a' is
given, or this parameter is omitted, all records in the input_stf1 file will
be processed.
- zone=value
- UTM zone number; default is location zone
options: 1-60
default: 10
- spheroid=name
- Spheroid for LL to UTM conversion; see m.gc.ll
default: clark66
NOTE: Only one of the "ef" or "formula" input fields may be used.
THE "formula" PARAMETER
The string after the '=' in this parameter almost always must
be enclosed in quotes to protect it from Shell interpretation of characters
such as * / ( ) and spaces (which may be used to increase readability,
but are not necessary). This string is always of the form
mapname=expression.
The mapname string can be any legal GRASS map name (up to 14 characters).
The second '=' is required and may be preceded and followed by a space.
In the expression, great flexibility is provided for the
computation of interesting combinations of the data fields contained
in the demographic records.
The usual operators used in the expression include: +, -, *, /, and %.
The user should review the Unix manual entry for the bc calculator
for the complete list of available functions and other operators.
Parentheses are used in the normal algebraic manner.
The operands in the expression may consist of any mixture of:
integer constants
real constants
functions allowed by bc
numeric fields from the demographic records
Numeric fields from both the IDENTIFICATION and MATRIX SECTIONS of the
demographic records may be used. Review the User and Technical Documentation
for these demographic files; or use the -p option of this program
to print the MATRIX SECTION document (the demographic data fields).
When substituting values from the numeric fields into
the expression, <SPACE> characters are replaced by zero. (Spaces, which
are rare in the demographic data, are usually the result of missing
values or restricted information).
The numeric fields from the demographic records may be designated in
one of three ways in the expression.
- 'Lcccc', where 'L' is an upper case alphabetic letter which indicates the
length of the numeric field in the data records: A=1, B=2, ... , I=9,
J=10, O=15, etc; and 'cccc' is the starting column number for the data field
of interest: 301 for 100% population count, 7221 for total number
of four-room housing units, 58 for the Congressional District (101st Congress), etc.
The proper specification of the Congressional District number would
be 'B58' because it is a two-column field. 'I301' would indicate that
the 100% population count should be used in the calculation; it's a nine-column
field.
- 'Pnna0nnn' or 'Hnna0nnn', where 'n' is a digit and 'a' is a digit or
(rarely) upper case letter. These forms represent the MATRIX field naming
schemes used in the CD-ROM "dBase 3" files. They can be used in
processing STF1 records extracted from the CD-ROM or TAPE distribution
media. All eight characters of the field name must be used. (Note:
this form cannot be used in processing the PL94-171 records.)
- 'ID_NAME'. The STF1 and PL94-171 files use the same set of
IDENTIFICATION SECTION field names (67 fields) for the "locational"
information contained in the first 300 characters of each record. The
field name, in all upper case letters, may be used if the numeric value
of the field is needed in the expression. Two of the most useful
fields are AREALAND and AREAWAT which allow the direct computation of
population density values.
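The first form can be decoded mechanically. The following Python sketch is illustrative only and is not taken from the program's source: the function names decode_field and field_value are hypothetical, and the sketch assumes that column numbers are 1-based and that the length letters run alphabetically (A=1, B=2, and so on), as the examples above suggest. It also applies the space-to-zero substitution described earlier.

```python
def decode_field(designation):
    """Split an 'Lcccc' designation into (start_column, length).

    Assumes 'L' is an upper-case letter with A=1, B=2, ... and that
    'cccc' is a 1-based starting column.
    """
    letter, column = designation[0], designation[1:]
    if not ("A" <= letter <= "Z") or not column.isdigit():
        raise ValueError("not an Lcccc designation: %r" % designation)
    return int(column), ord(letter) - ord("A") + 1


def field_value(record, designation):
    """Return the field text with spaces replaced by zeros."""
    start, length = decode_field(designation)
    text = record[start - 1:start - 1 + length]
    return text.replace(" ", "0")


# Hypothetical record: 300 filler characters, then a nine-column count.
record = "x" * 300 + "     4321"
print(decode_field("I301"))         # (301, 9)
print(decode_field("B58"))          # (58, 2)
print(field_value(record, "I301"))  # 000004321
```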
The bc calculator usually returns a value with a very large number
of decimal places. Vector attributes are automatically truncated(!) to
integers by the
v.support
program when the topology file is built. Site
list descriptions are likewise truncated to integers by GRASS programs
which use the descriptions as numeric values (e.g., sites to cell in
s.menu
).
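Because the fractional part is dropped rather than rounded, a ratio between 0 and 1 becomes 0 as an attribute. The Python sketch below uses made-up counts and Python's own int() truncation as a stand-in for the GRASS programs (whose conversion code is not shown on this page). It suggests why scaling inside the formula, as the pct.female entry under EXAMPLES does, keeps more of the result.

```python
# Hypothetical counts for one record (not real census data).
females = 487
total = 1000

ratio = females / total          # 0.487
percent = 100 * females / total  # 48.7

print(int(ratio))    # 0   (fraction lost entirely)
print(int(percent))  # 48  (truncated, not rounded to 49)
```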
THE 'out=table:' PARAMETER
The default operation of v.apply.census when tabular reports are
produced to stdout (when not making a sites list or vector attribute
file) is to print the easting and northing coordinates and then the
value resulting from the expression. Often the user wishes to
have an identifier different from the coordinates. The construction
out=table:field will replace the coordinates with the character string
indicated by field, which may have any of the three
forms used for numeric fields in the formula expression (see
above). Note: a special case exists for the 66-character description
field which begins in column 192; the entire field will be printed if
designated by either of these two forms: out=table:ANPSADPI or
out=table:Z192.
Complex tables may be produced by making multiple runs of
v.apply.census, redirecting the tabular output to files, and using the
Unix tools cut, paste, and colrm to combine the resulting files.
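As an alternative to cut and paste, the per-run files can be merged with a short script. This Python sketch rests on assumptions the page does not confirm: that each output line ends with the computed value as its last whitespace-separated token, that everything before it is the identifier, and that every run lists the records in the same order. The function name merge_tables and the sample lines are hypothetical.

```python
def merge_tables(*tables):
    """Merge line lists from several runs into one table.

    Each table is a list of text lines. The identifier is taken from
    the first table; the last token of each line is treated as the value.
    """
    merged = []
    for rows in zip(*tables):
        parts = [row.rsplit(None, 1) for row in rows]
        identifier = parts[0][0]
        values = [p[-1] for p in parts]
        merged.append("\t".join([identifier] + values))
    return merged


# Hypothetical output of two runs made with out=table:TRACTBNA.
run1 = ["100100 1523.4", "100200 872.25"]
run2 = ["100100 48.7", "100200 51.2"]
for line in merge_tables(run1, run2):
    print(line)
```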
EXAMPLES
[These examples assume that STF1 records for the census tracts
(SUMLEV=140) in a particular county (CNTY=037) have been extracted from
the distribution source (with m.in.stf1.tape)
and saved in a file named 037.140 in the current directory.]
- Create a site list where the label values are the percentage of
females in each input record:
- v.apply.census in=037.140 out=site formula='pct.female = 100 * (I373 / I301)'
- Label an existing vector map (named tract.pop) of tract boundaries
with the total population of each tract (run
v.support
and build
topology after creating the attributes file):
- v.apply.census in=037.140 out=atts formula=tract.pop=P0010001 (or)
- v.apply.census in=037.140 out=atts formula=tract.pop=I301
- Produce a tabular report of the census tract numbers and the number
of Hispanics per square kilometer:
- v.apply.census in=037.140 out=table:TRACTBNA formula="m=(P0080001/AREALAND)*1000"
- Produce a tabular report of the number of people per housing unit
for each tract and the coordinates of the internal (label) point:
- v.apply.census in=037.140 formula="map=P0010001/H0010001"
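The factor of 1000 in the third example implies that AREALAND is recorded in thousandths of a square kilometer. That unit is inferred from the formulas on this page rather than stated, so it should be checked against the Census documentation. Under that assumption, the arithmetic for one record with made-up values works out as in this Python sketch:

```python
# Made-up values for a single tract (not real census data).
persons = 500      # count substituted for the population field
arealand = 2000    # assumed unit: thousandths of a square kilometer

area_sqkm = arealand / 1000          # 2.0 sq km
density = persons / arealand * 1000  # persons per sq km

print(area_sqkm)  # 2.0
print(density)    # 250.0
```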
FORMAT OF THE FORMULA TEXT FILE
Running this program with the ef=file command line parameter causes
the named file to be read and each formula contained therein to be processed
as if it were entered on the command line. The out= parameter
may optionally be respecified in this file; each out= selection
remains in effect until explicitly changed. The program exits after
the last formula is processed.
The following is a sample formula file. Note the use of lines beginning
with '#' as mandatory formula separators and comment lines. Also note
that expressions may be continued on successive lines; the lines are
concatenated to a maximum of 500 characters for a single formula. Blank
lines are ignored.
popden.sqkm = 1000 * P0010001 / (AREALAND+AREAWAT)
# that computes population density in people per sq. km.
#
# next do people per sq. mile as a vector attribute file
out=att
popden.sqmi = 2.59*1000 *
P0010001 / (AREALAND+
AREAWAT)
#
# next do total population as a vector attribute file
total.pop = I301
#
# output the county identification numbers as the descriptions
# in a site list
out=sites
county = CNTY
#
# output the 66 char. description and FIPS Place Code as a table
out=table:ANPSADPI
map=PLACEFP
# optional ending comment line
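The rules for reading a formula file can be sketched in Python as follows. This is a reading of the description above, not the program's actual parser. It assumes that an out= selection occupies a line of its own, that continuation lines are joined with a single space, and that the starting output mode is the documented default '-'. The function name parse_formula_file is hypothetical.

```python
def parse_formula_file(text, default_out="-", max_len=500):
    """Return a list of (out_mode, formula) pairs from formula-file text."""
    results = []
    out_mode = default_out
    pending = []          # pieces of the formula being accumulated

    def flush():
        if pending:
            formula = " ".join(pending)
            if len(formula) > max_len:
                raise ValueError("formula exceeds %d characters" % max_len)
            results.append((out_mode, formula))
            pending.clear()

    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue                  # blank lines are ignored
        if line.startswith("#"):
            flush()                   # '#' lines separate formulas
        elif line.startswith("out=") and not pending:
            out_mode = line[len("out="):]   # stays in effect until changed
        else:
            pending.append(line)      # start or continuation of a formula
    flush()
    return results


sample = """\
popden.sqkm = 1000 * P0010001 / (AREALAND+AREAWAT)
# people per sq. km, written to the default output
#
out=att
popden.sqmi = 2.59*1000 *
P0010001 / (AREALAND+
AREAWAT)
#
total.pop = I301
"""
for mode, formula in parse_formula_file(sample):
    print(mode, "|", formula)
```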
BUGS/FEATURES
Computational errors in bc are not handled too gracefully: a warning is
output and a zero result is used.
bc tends to output lots of decimal places. The user must clean this
up for output sent to stdout.
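One way to tidy the output is to pass each line through a small filter. This Python sketch assumes the computed value is the last space-separated token on each line, which the page does not guarantee. The function name round_last_column and the choice of two decimal places are illustrative.

```python
def round_last_column(line, places=2):
    """Round the final token of a line if it parses as a number."""
    line = line.rstrip("\n")
    head, _, last = line.rpartition(" ")
    try:
        value = float(last)
    except ValueError:
        return line                   # leave non-numeric lines untouched
    rounded = "%.*f" % (places, value)
    return (head + " " + rounded) if head else rounded


print(round_last_column("100100 48.70000000000000000000"))  # 100100 48.70
```

The same function could be applied to every line read from standard input when the program's output is piped into the script.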
The GRASS site list output format used includes the '#' before the
label value to facilitate the production of raster maps with cell
values representing the results of the formula execution.
If using the "name_field" and "ef" parameters, the formula file may
contain only one formula.
SEE ALSO
m.in.stf1.tape is used as a preprocessor to select subsets of
STF1 or PL94-171 tape format records for input to this program.
Unix utility programs
such as grep or awk can also be used to select subsets of
lines from the PL94-171 files, but not from the STF1 tape files due to
their very long record lengths.
AUTHOR
Dr. James Hinthorne, GIS Laboratory, Central Washington University.
July, 1992.