Help for FASTCLAS
PURPOSE
FASTCLAS is a multispectral classifier using an algorithm which
combines the parallelepiped and Bayesian techniques. Inputs are registered
multispectral data and training statistics from VICAR program STATS. The
input multispectral data can be in either separate Vicar Data Sets or
in MSS format (see help on MSS). FASTCLAS differs from an earlier version
in that the program allows prior probabilities to be input in the parameter
field and used in the Bayesian decision rule. The input probabilities may
consist of simple prior probabilities, derived from the expected magnitude
of representation of classes in the final image. Multiple sets of
probabilities may also be input, with one or two of the data channels serving
not as multispectral values, but as indices to the appropriate set of prior
probabilities. Another feature of FASTCLAS allows the user to reset any
class mean for any band, thus overriding the mean value provided by STATS.
OPERATION
FASTCLAS uses a combination of the parallelepiped algorithm and the
Bayesian maximum likelihood algorithm for classifying multispectral data.
Assume that N spectral bands are available and training statistics from the
Vicar program STATS have been computed. FASTCLAS reads the statistics data
set and generates a look-up table to hold the boundaries in the N-dimensional
decision space for each class. For each dimension (band) the decision
boundary is MU +/- (R * SIGMA), where MU is the mean for the class, SIGMA
the standard deviation and R the number of standard deviations to be used.
This is the parallelepiped algorithm.
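The boundary rule above can be sketched in a few lines of Python. This is only an illustration, not the FASTCLAS source: the function name, the nested-list layout of the statistics, and the `overrides` argument (standing in for per-class, per-band CSIGMA values) are assumptions made for the example.

```python
def parallelepiped_bounds(means, sigmas, r, overrides=None):
    """Return bounds[class][band] = (MU - R*SIGMA, MU + R*SIGMA).

    means, sigmas: one list per class, one value per band (hypothetical layout).
    overrides: optional {(class, band): R} with 1-based indices, standing in
    for CSIGMA-style per-class, per-band multipliers.
    """
    overrides = overrides or {}
    bounds = []
    for c, (class_means, class_sigmas) in enumerate(zip(means, sigmas), start=1):
        row = []
        for b, (mu, sd) in enumerate(zip(class_means, class_sigmas), start=1):
            k = overrides.get((c, b), r)  # fall back to the global multiplier
            row.append((mu - k * sd, mu + k * sd))
        bounds.append(row)
    return bounds


# Hypothetical statistics: 2 classes, 3 bands.
means = [[50.0, 60.0, 70.0], [80.0, 90.0, 100.0]]
sigmas = [[4.0, 5.0, 6.0], [2.0, 2.0, 2.0]]
bounds = parallelepiped_bounds(means, sigmas, r=2.5, overrides={(2, 3): 1.5})
```

Here band 3 of class 2 uses a 1.5 standard deviation interval while every other band uses 2.5.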
To be assigned to class J (DNout = J), a pixel's spectral signature
must fall within the N-dimensional decision boundary for class J. If a
pixel's spectral signature falls outside the decision boundary for all
classes, the pixel is assigned to the unknown class (DNout = 0). A pixel
whose spectral signature falls within the decision boundary for more than
one class is considered ambiguous. The user has the option of resolving
the ambiguity by the Bayesian maximum likelihood algorithm, or leaving the
pixel ambiguous (DNout = 255).
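The three possible outcomes of the table look-up (a single class, unknown, or ambiguous) can be illustrated as follows. This is a minimal sketch: the function name and data layout are hypothetical, and treating the boundaries as inclusive is an assumption, since the text does not say how a value exactly on a boundary is handled.

```python
UNKNOWN, AMBIGUOUS = 0, 255  # output DN values named in the text


def parallelepiped_class(pixel, bounds):
    """Return (DNout, candidates) for one pixel.

    bounds[j-1][b] is the (low, high) interval of class j in band b.
    Assumption: the interval limits are inclusive.
    """
    candidates = [
        j
        for j, limits in enumerate(bounds, start=1)
        if all(lo <= dn <= hi for dn, (lo, hi) in zip(pixel, limits))
    ]
    if not candidates:
        return UNKNOWN, candidates
    if len(candidates) == 1:
        return candidates[0], candidates
    # More than one class matched: leave ambiguous, or pass the
    # candidates to a Bayesian tie-breaker.
    return AMBIGUOUS, candidates


# Hypothetical two-band boundaries for two overlapping classes.
bounds = [
    [(40, 60), (50, 70)],  # class 1
    [(55, 75), (65, 85)],  # class 2
]
```

A pixel at (45, 55) falls only in class 1, one at (58, 68) falls in both boxes, and one at (10, 10) falls in neither.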
The Bayesian algorithm works as follows. Assume that there are N
spectral bands, and consider a pixel as an N-dimensional sample vector X.
Let K_i be the covariance matrix computed for training class i and MU_i
the mean vector of class i. The multivariate probability density P_i that
X is a member of class i is given by:

$$P_i = \frac{1}{(2\pi)^{N/2}\,|K_i|^{1/2}} \exp\left[-\frac{1}{2}\,(X - \mathrm{MU}_i)^T K_i^{-1} (X - \mathrm{MU}_i)\right]$$

where |K_i| = det(K_i).
But since we are only interested in the maximum over all classes,
it is convenient to compute:

$$Q_i = \ln(P_i) + \ln(\mathrm{PROB}_i) = C_i - \frac{1}{2}\,(X - \mathrm{MU}_i)^T K_i^{-1} (X - \mathrm{MU}_i) + \ln(\mathrm{PROB}_i)$$

where PROB_i is the prior probability of class i and:

$$C_i = -\frac{1}{2}\left[N \ln(2\pi) + \ln|K_i|\right]$$

X is then assigned to the class i for which Q_i is a maximum.
Thus for each pixel in the scene we assign a class number
corresponding to the class to which the pixel most likely belongs.
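As a worked illustration of the decision rule, the sketch below evaluates the log-discriminant Q for each candidate class and picks the maximum. It is not the FASTCLAS implementation: the function names, the tuple layout of the class statistics, and the use of Gaussian elimination in place of an explicit matrix inverse are all assumptions chosen to keep the example self-contained.

```python
import math


def solve_and_logdet(k, d):
    """Solve K y = d by Gaussian elimination with partial pivoting.

    Returns (y, ln|det K|). K is assumed to be a non-singular covariance matrix.
    """
    n = len(d)
    a = [list(row) + [d[i]] for i, row in enumerate(k)]  # augmented copy
    logdet = 0.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        logdet += math.log(abs(a[col][col]))
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    y = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = a[i][n] - sum(a[i][j] * y[j] for j in range(i + 1, n))
        y[i] = s / a[i][i]
    return y, logdet


def discriminant(x, mu, k, prior):
    """Q = C - 1/2 (X-MU)^T K^-1 (X-MU) + ln(PROB)."""
    n = len(x)
    d = [xi - mi for xi, mi in zip(x, mu)]
    y, logdet = solve_and_logdet(k, d)
    maha = sum(di * yi for di, yi in zip(d, y))
    c = -0.5 * (n * math.log(2 * math.pi) + logdet)
    log_prior = math.log(prior) if prior > 0 else -math.inf
    return c - 0.5 * maha + log_prior


def bayes_class(x, classes):
    """classes: list of (mean, covariance, prior); returns a 1-based class number."""
    scores = [discriminant(x, mu, k, p) for mu, k, p in classes]
    return 1 + max(range(len(scores)), key=scores.__getitem__)
```

With two hypothetical classes that share a covariance matrix, a pixel midway between the two means is assigned to whichever class has the larger prior probability.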
If the keyword CHECK is given, the Bayesian confidence value is
computed for each pixel after it is classified. If the pixel's spectral
signature is outside the multivariate confidence interval, the pixel is
reclassified as unknown.
The order in which the spectral bands are input to FASTCLAS will
influence the running time. Since the table look-up portion uses a process
of elimination, bands which give the best spectral separation between classes
should be given first. If the spectral data is in MSS format, the order is
controlled by the USE parameter.
If the parameters PROB and/or PRIOR have been coded, the Bayesian
decision rule uses prior probabilities in the final classification. In the
simplest case, one set of prior probabilities is specified by the PROB
parameter. One probability must be input for each class described in the
STATS file, and the probabilities must sum to 1. The program checks to make
sure the appropriate number of probabilities are input, but does not check
to be sure that they sum to 1.
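Because the program does not verify that the probabilities sum to 1, it can be worth checking a PROB list before running. The following is a small user-side sketch; the function name and tolerance are hypothetical and are not part of FASTCLAS.

```python
import math


def check_prob(probs, n_classes, tol=1e-6):
    """Return True if probs has one non-negative entry per class and sums to 1."""
    if len(probs) != n_classes:
        raise ValueError(f"expected {n_classes} probabilities, got {len(probs)}")
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    return math.isclose(math.fsum(probs), 1.0, abs_tol=tol)


# The five PROB values used in Example 2 of this help text.
example2 = [0.125, 0.137, 0.029, 0.414, 0.295]
ok = check_prob(example2, n_classes=5)
```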
If the PRIOR parameter has been coded, then the user inputs several
sets of probabilities, and the set used in the decision rule is determined
by the value in the band identified as a prior probability channel. For
example if "PRIOR=(5,3)" is coded, band 5 will be taken as a prior probability
index channel, and will contain only DN values of 0 through 3. Before each
pixel is classified, the 'DN sub 5' value will be checked. If the 'DN sub 5'
equals 1, then the first set of prior probabilities input in the PROB
parameter will be used; if 'DN sub 5' equals 2, the second set of prior
probabilities will be used in the Bayesian decision rule, and so forth. The
program also systematically samples the prior channel image and uses the input
probabilities to calculate, using Bayes' rule, a set of unconditional prior
probabilities which are applied when 'DN sub 5' equals 0.
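The selection of a prior set by index value, and the fallback set used for index 0, might look like the sketch below. The names are hypothetical, and two details are assumptions rather than documented behavior: the weight of each index level is taken as its relative frequency among the sampled non-zero DN values, and the unconditional prior of a class is the weighted sum of its probabilities over the levels.

```python
from collections import Counter


def unconditional_priors(prior_sets, sampled_dns):
    """Combine the per-level prior sets into one fallback set.

    prior_sets: {index DN: [P(class 1), P(class 2), ...]}
    sampled_dns: index-channel DN values from a systematic sample.
    Assumption: zeros carry no information and are left out of the weights.
    """
    counts = Counter(dn for dn in sampled_dns if dn != 0)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no non-zero index values")
    n_classes = len(next(iter(prior_sets.values())))
    return [
        sum(counts[level] / total * probs[c] for level, probs in prior_sets.items())
        for c in range(n_classes)
    ]


def priors_for_pixel(index_dn, prior_sets, fallback):
    """Pick the prior set for one pixel from its index-channel DN."""
    return fallback if index_dn == 0 else prior_sets[index_dn]


# Hypothetical two-class, two-level setup.
prior_sets = {1: [0.8, 0.2], 2: [0.4, 0.6]}
sampled = [1, 1, 1, 2, 0, 0]
fallback = unconditional_priors(prior_sets, sampled)
```

In this hypothetical sample, level 1 occurs three times as often as level 2, so the fallback set lies closer to the level 1 probabilities.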
The PRIOR parameter can also specify two such channels. In this case,
the program expects a set of probabilities for each possible DN value in each
prior probability channel. With those probabilities as input, the program
calculates a full set of prior probabilities (under assumptions of
independence) which are doubly contingent on the indexes present in the two channels
for each pixel. Zeros may be used freely throughout as index values. When
a zero is encountered, the program assumes no information concerning that
channel is present, and reverts to a separately calculated set of singly
contingent or uncontingent probabilities as appropriate.
EXAMPLES
1) FASTCLAS INP=(A,B,C,ST) OUT=OUT SIZE=(1,1,500,500) SIGMA=2.5 CSIGMA=(2,3,1.5) 'DONT
This example classifies the multispectral imagery on data sets
A, B, and C according to the training statistics on data set ST. A 2.5
standard deviation confidence interval is used for each input band in
each class with the exception that band 3 of class 2 uses a 1.5 standard
deviation interval. The Bayesian routine is suppressed for resolving
ambiguity.
2) FASTCLAS (MS,ST) OUT MSS=6 USE=(2,3,4,5) SIGMA=3.0 PROB=(0.125,0.137,0.029,0.414,0.295) MEAN=(3,4,144.0,3,5,168.0)
Input data set MS contains 6 spectral bands of imagery (MSS format)
with classification to be performed using only bands 2, 3, 4 & 5. A 3.0
standard deviation confidence interval is used for each band in each class.
Prior probabilities are supplied for each of the classes identified in the
STATS file and will be used in the Bayesian decision rule. (i.e. Class 1
probability is replaced with .125 in all bands, Class 2 is replaced with
0.137 in all bands, etc.) For class 3, the STATS means for bands 4 and 5
are to be reset to 144.0 and 168.0 respectively.
NOTE: a 0 probability for a class doesn't zero it out unless there is
also a 0 CSIGMA for it.
3) FASTCLAS (MSPR2,ST) OUT MSS=5,USE=(4,2,3,1) 'CHECK PRIOR=(5,4) PROB=(1,0.071,0.302,0.207,0.319,0.101, 2,0.271,0.313,0.092,0.107,0.271, 3,0.112,0.419,0.393,0.076,0.000, 4,0.2,0.2,0.2,0.4,0.0)
The input data set consists of five bands in MSS format. Bands
4, 2, 3, and 1 will be used in classification, with the parallelepiped
classifier using them in that order. The multivariate confidence interval
will be checked. Band 5 is a prior probability index channel assuming
DN values 0-4, and four sets of prior probabilities are specified, each
set summing to 1. The fact that five values are given for each set (1)
implies that the STATS file describes exactly five classes and (2)
associates the first probability values with the first class in the STATS
file, the second value with the second class, etc. (i.e. Prob of class 1
occurring in level 1 of band 5 is 0.071, prob of class 2 occurring in level 1
of band 5 is 0.302, etc.)
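The grouping of the PROB values in this example can be made explicit with a short sketch. The layout it assumes (an index level followed by one probability per class, repeated for each set) is inferred from the description above, and the function name is hypothetical.

```python
def split_prob_sets(flat, n_classes):
    """Split a flat single-channel PROB list into {level: [class probabilities]}.

    Assumed layout: level, p1, ..., pN, level, p1, ..., pN, ...
    """
    step = n_classes + 1
    if len(flat) % step:
        raise ValueError("PROB length is not a multiple of (classes + 1)")
    return {
        int(flat[i]): list(flat[i + 1 : i + step])
        for i in range(0, len(flat), step)
    }


# The PROB values of Example 3 (five classes, index levels 1 through 4).
prob = [
    1, 0.071, 0.302, 0.207, 0.319, 0.101,
    2, 0.271, 0.313, 0.092, 0.107, 0.271,
    3, 0.112, 0.419, 0.393, 0.076, 0.000,
    4, 0.2, 0.2, 0.2, 0.4, 0.0,
]
sets = split_prob_sets(prob, n_classes=5)
```

Indexing `sets[1][0]` then gives the probability of class 1 at level 1 of band 5, and `sets[1][1]` the probability of class 2 at the same level.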
4) FASTCLAS (MSPR2,ST) OUT MSS=6 USE=(1,2,3,4) SIGMA=3.0 PRIOR=(5,3,6,2) PROB=(5,1,0.017,0.249,0.301,0.433, 5,2,0.321,0.230,0.409,0.040, 5,3,0.519,0.107,0.218,0.156, 6,1,0.213,0.414,0.021,0.352, 6,2,0.107,0.318,0.052,0.477)
The input dataset is six bands in MSS format. Bands 1-4 are
multispectral, band 5 is a prior probability index band with DN values
ranging from 0-3, and band 6 is also a prior probability index band with
DN values ranging from 0-2. Prior probabilities in each set sum to 1, and
the fact that there are four values in each set implies that the STATS
file describes exactly four classes.
TIMING
The running time of FASTCLAS is a function of the picture size, number
of spectral bands, number of possible classes and size of confidence
intervals desired (SIGMAs). In addition, running time is data dependent;
that is, it varies depending on the number of times ambiguity must be
resolved. Therefore it is difficult to estimate the running time accurately.
WRITTEN BY: J. D. Addington & A.H. Strahler Oct. 23, 1984
CONVERTED TO VAX BY: Helen De Rueda March 8, 1984
COGNIZANT PROGRAMMER: R. E. Alley
Revisions:
Made Portable for UNIX ... J. Turner (CRI) Jan 02, 1995
PARAMETERS:
INP
input data sets
first, image dataset(s)
last, statistics dataset
OUT
output data set
SIZE
Vicar size field
SL
Starting line of image
SS
Starting sample of image
NL
Number of lines in image
NS
Number of samples in image
MSS
Specifies # of bands
BAND
Which bands are stored
USE
which bands are used
SIGMA
Standard deviation
multiplier for boundary
CSIGMA
Standard Deviations multiplier
for Classes
DONT
No Bayesian if ambiguous
CHECK
Check multivariate confidence
PRIOR
Band contains index values
PROB
Denotes probabilities
MEAN
Replaces STATS mean.
See Examples: