Help for USTATS
Purpose: USTATS is a VICAR applications program that performs an
unsupervised clustering algorithm upon multispectral data. The output
is a statistics data set compatible with the program FASTCLAS.
Operation: A sampling of pixels is chosen, determined by the INC, LINC
or SINC keyword parameters. The first sampled pixel is set as the first
cluster. For each of the remaining pixels to be sampled, the following
operations are performed:
1) The Euclidean distance from the mean of each cluster is
computed. The Euclidean distance is defined as:
(E.D.)**2 = SUM OVER ALL BANDS of [DN{mean} - DN{pixel}]**2
2) If the Euclidean distance to each of the existing clusters
is greater than the value specified by the INITIAL parameter,
a new cluster is formed by this pixel. Otherwise, the pixel
is added to the nearest cluster, and that cluster's mean for
each band is recomputed.
3) The neighboring pixel to the left is then checked to see
whether it can be grouped into the same cluster. If its
Euclidean distance is not greater than the INITIAL parameter,
it too is added to the cluster, and the means recomputed.
This process is repeated until a pixel is found that cannot
be added to the cluster.
4) The pixel(s) to the right is (are) checked in the same manner
as in Step 3.
If the NONN parameter has been specified, Steps 3 & 4 are omitted. If,
at some point, this process generates more clusters than have been
specified in the CLUSTERS parameter, the message 'SAMPLING INCOMPLETE AT
LINE n' will be printed. No more pixels will be sampled, but processing
will continue.
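The sampling steps above can be sketched in Python as follows. This is an
illustrative sketch only, not the USTATS source: the function names, the
dictionary layout of a cluster and the handling of one image line at a time
are assumptions made for the example. The CLUSTERS limit and the EXCLUDE
value are not modeled, and the sketch does not guard against a neighboring
pixel being counted more than once.

```python
import math

def distance(mean, pixel):
    """Euclidean distance between a cluster mean and a pixel, over all bands."""
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(mean, pixel)))

def add_pixel(cluster, pixel):
    """Add a pixel to a cluster and recompute the mean of each band."""
    n = cluster["count"] + 1
    cluster["mean"] = [m + (p - m) / n for m, p in zip(cluster["mean"], pixel)]
    cluster["count"] = n

def assign(clusters, pixel, initial):
    """Steps 1 and 2: return the index of the cluster that receives the pixel."""
    if clusters:
        dists = [distance(c["mean"], pixel) for c in clusters]
        nearest = min(range(len(dists)), key=dists.__getitem__)
        if dists[nearest] <= initial:
            add_pixel(clusters[nearest], pixel)
            return nearest
    # No cluster yet, or every cluster is farther away than INITIAL.
    clusters.append({"mean": list(pixel), "count": 1})
    return len(clusters) - 1

def grow(clusters, idx, line, start, step, initial):
    """Steps 3 and 4: walk left (step=-1) or right (step=+1) from a sampled pixel."""
    s = start + step
    while 0 <= s < len(line) and distance(clusters[idx]["mean"], line[s]) <= initial:
        add_pixel(clusters[idx], line[s])
        s += step

def sample_line(clusters, line, sinc, initial, nonn=False):
    """Sample every sinc-th pixel of one line; nonn=True omits Steps 3 and 4."""
    for s in range(0, len(line), sinc):
        idx = assign(clusters, line[s], initial)
        if not nonn:
            grow(clusters, idx, line, s, -1, initial)
            grow(clusters, idx, line, s, +1, initial)
```

Here a line is a list of pixels and each pixel is a tuple holding one DN
value per band, so a call such as sample_line(clusters, line, sinc=5,
initial=8.0) extends the list of clusters in place.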
When the sampling process is complete, the clusters that have been
formed are examined. Clusters containing only one pixel are removed.
Standard deviations for each band in each cluster are calculated and,
if the one-standard-deviation regions of two clusters overlap, they
are merged into one cluster. The remaining clusters are sorted by
population.
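The consolidation pass described above might look like the following Python
sketch. It rests on several assumptions that the text does not confirm: two
clusters are taken to overlap when their mean plus or minus one standard
deviation intervals intersect in every band, merging is repeated until no
overlapping pair remains, and each cluster is held as a plain list of its
pixels. The function names are hypothetical.

```python
import math

def band_stats(pixels):
    """Per-band mean and population standard deviation of a list of pixels."""
    n = len(pixels)
    means = [sum(band) / n for band in zip(*pixels)]
    sds = [math.sqrt(sum((v - m) ** 2 for v in band) / n)
           for band, m in zip(zip(*pixels), means)]
    return means, sds

def overlaps(a, b):
    """Assumed test: the mean +/- 1 sd intervals intersect in every band."""
    (ma, sa), (mb, sb) = band_stats(a), band_stats(b)
    return all(abs(x - y) <= p + q for x, y, p, q in zip(ma, mb, sa, sb))

def consolidate(clusters):
    """Drop one-pixel clusters, merge overlapping ones, sort by population."""
    kept = [list(c) for c in clusters if len(c) > 1]
    merged = True
    while merged:  # assumption: repeat until no pair overlaps
        merged = False
        for i in range(len(kept)):
            for j in range(i + 1, len(kept)):
                if overlaps(kept[i], kept[j]):
                    kept[i].extend(kept.pop(j))
                    merged = True
                    break
            if merged:
                break
    return sorted(kept, key=len, reverse=True)
```

Keeping the raw pixels makes the merge a simple concatenation at the cost of
memory; a program working on full images would more plausibly carry running
sums per band instead.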
The number of clusters to be retained as classes for output is determined
by the CLASSES and PERCENT parameters. If either of these parameters
is specified, the default of 10 classes is overridden. If both parameters
are specified, a cluster must satisfy both conditions to be included as an
output class.
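The selection rule can be illustrated with a short Python sketch. The
function name and arguments are hypothetical, and the sketch assumes that
the percentage is taken against a total pixel count supplied by the caller.

```python
def select_classes(populations, total_sampled, classes=None, percent=None):
    """Return the indices of the clusters kept as output classes.

    populations must already be sorted from most to least populous.
    """
    if classes is None and percent is None:
        classes = 10  # documented default when neither parameter is given
    kept = []
    for rank, pop in enumerate(populations):
        if classes is not None and rank >= classes:
            break  # CLASSES: keep only the N most populous clusters
        if percent is not None and 100.0 * pop / total_sampled < percent:
            break  # PERCENT: every later cluster is smaller still
        kept.append(rank)
    return kept
```

For example, with populations of 500, 300, 150, 40 and 10 out of 1000
sampled pixels, classes=4 together with percent=20.0 keeps only the first
two clusters, because the third holds just 15% of the pixels.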
The output statistics data set is of the same format as the output data
set from STATS, and is suitable for input into FASTCLAS. The
only difference is that USTATS does not compute the off-diagonal
elements of the correlation matrix, but sets them to zero.
Restrictions:
1) Image size is internally restricted to 32000 samples.
2) A maximum of 12 input data sets.
Examples:
1) USTATS (A,B,C,D) ST INC=10 INITIAL=8.0 CLUSTERS=300 EXCLUDE=0 CLASSES=15
In this example every tenth sample of every tenth line is sampled.
The initial clusters have an 8.0 DN radius and up to 300 clusters
may be formed. Pixels of 0 DN are ignored. The 15 most populous
clusters are output.
2) USTATS MS ST (1,1,500,1000) MSS=6 USE=(1,2,4,5,6) SINC=5 PERCENT=1.0
In this example the input is in MSS format and contains 6 bands,
but the third band is not to be used. Every fifth sample of every
twentieth line (default) is sampled. Those clusters that are at
least 1% of all pixels sampled are retained for output.
3) USTATS MS ST MSS=4 'NONN
In this example, there are 4 MSS bands, all of which are to be used,
and nearest neighbors are not to be sampled.
HISTORY
Written by: Ron Alley, March 31, 1978
Cognizant Programmer: Ray Bambery
19 OCT 79 ...REA... INITIAL RELEASE
29 AUG 85 ...JHR... CONVERT TO VICAR2
5 SEP 94 ...CRS (CRI) REVISE FOR PORTING
10 JUL 95 ...VRU (CRI) CHANGED FIRST OUTPUT FILE FORMAT TO ISTATFILE
15 APR 98 ...RRP (AR-9900) UPDATED USTATS.PDF TO RESTRICT CERTAIN
PARAMETERS TO BE LESS THAN OR EQUAL TO ZERO.
16 JUL 2011 ...RJB... Clean up code to prevent warning messages
with gfortran 4.4.4 compiler under Linux
Remove MSS actions, MSS and USE parms.
Convert to HALF and BYTE data set operations
Fix a wide range of logic and coding errors.
The biggest problem was incorrect variance
computation which was not dividing by number of pts.
21 Jul 2011 ...RJB... Add complete covariance matrix, not just diagonal.
Debugging code and comments need to be removed.
This version needed in quick turn around.
PARAMETERS:
INP
STRING - Input data sets.
OUT
STRING - Output data set.
SIZE
INTEGER - Standard VICAR size field.
INC
INTEGER - Initial cluster increment.
LINC
INTEGER - Initial cluster line increment.
SINC
INTEGER - Initial cluster sample increment.
INITIAL
REAL - Radius of initial clusters.
CLUSTERS
INTEGER - Maximum number of clusters.
EXCLUDE
INTEGER - Exclude DN value from sampling.
NONN
STRING - No nearest neighbors.
CLASSES
INTEGER - Keep N most populous classes.
PERCENT
REAL - Keep classes with X% or greater of all pixels sampled.
NOPRINT
STRING - Do not print populations & means.
NOTIFY
STRING - Displays progress of program.
ALL
STRING - Skips code which combines & eliminates clusters.