Help for IBISREGR
PURPOSE
"ibisregr" performs a series of multiple linear regression analyses on IBIS
tabular files, searching for a best fit. This can be used, for example,
to find the best combination of spectral channels for determination of a
physical parameter.
METHOD
The program loops through all possible combinations of independent variables
(columns) that are allowed by the user parameters, performing a linear
least-squares fit ("regression analysis") of the dependent variable for
each case. The NBEST best solutions are retained and are printed to the
terminal and optionally written to the output file.
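The search loop described above can be sketched in Python as follows. This
is an illustration only, not the program's actual source: the names "solve",
"fit" and "search", the dictionary-of-lists table layout, and the use of the
normal equations are all assumptions made for the example. Ranking here uses
the RMS of the residuals.

```python
import heapq
from itertools import combinations

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting.

    Raises ZeroDivisionError if the system is singular (collinear columns).
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(m[r][i]))  # pivot row
        m[i], m[p] = m[p], m[i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n + 1):
                m[r][c] -= f * m[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit(table, cols, depcol):
    """Least-squares fit of table[depcol] on the columns in cols plus a constant."""
    y = table[depcol]
    rows = [[1.0] + [table[c][k] for c in cols] for k in range(len(y))]
    p = len(cols) + 1
    # Normal equations: (X^T X) b = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yk for r, yk in zip(rows, y)) for i in range(p)]
    coef = solve(xtx, xty)
    resid = [yk - sum(b * v for b, v in zip(coef, r)) for r, yk in zip(rows, y)]
    return coef, resid

def search(table, cols, colrange, depcol, nbest):
    """Try every allowed combination of columns and keep the nbest fits."""
    results = []
    for m in range(colrange[0], colrange[1] + 1):
        for combo in combinations(cols, m):
            coef, resid = fit(table, combo, depcol)
            rms = (sum(e * e for e in resid) / len(resid)) ** 0.5
            results.append((rms, combo, coef))
    return heapq.nsmallest(nbest, results)  # smallest RMS residual first
```

Each entry returned by "search" is a tuple (RMS residual, columns used,
coefficients), with the constant term first in the coefficient list.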
The criterion for the best fit is that the R-squared statistic be maximized,
or the standard error be minimized, or both. The R-squared statistic is the
fraction of the total variance in the dependent data that is explained by
the regression. The standard error of the estimate is the RMS average of
the residuals (the misfit between the predicted and actual dependent data).
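As a sketch of the two statistics, the function below computes both from the
actual and predicted values of the dependent variable. The name
"fit_statistics" is hypothetical. The standard error follows the wording
above literally (a plain RMS of the residuals); some texts instead divide by
the degrees of freedom, N-M-1, and this help text does not say which form
the program uses.

```python
import math

def fit_statistics(actual, predicted):
    """Return (R-squared, standard error) for one regression solution."""
    n = len(actual)
    mean = sum(actual) / n
    # Total variation of the dependent data about its mean
    ss_total = sum((y - mean) ** 2 for y in actual)
    # Variation left unexplained by the regression (squared residuals)
    ss_resid = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    r_squared = 1.0 - ss_resid / ss_total
    # Assumption: plain RMS of the residuals, with no degrees-of-freedom correction
    std_error = math.sqrt(ss_resid / n)
    return r_squared, std_error
```

A perfect fit gives R-squared = 1 and standard error = 0. Predicting the
mean of the data for every row gives R-squared = 0.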
The optional output file contains three columns for each of the NBEST best
fits. The first column lists the M input columns used for the solution:
row 1 always contains a zero, denoting the constant term, and the column
numbers begin in row 2. The second column contains the M+1 regression
coefficients (the first being the constant), and the third contains the
residuals.
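The layout of one solution's three output columns can be illustrated as
below. The function name "solution_columns" is hypothetical, and the sketch
leaves the columns at their natural lengths, because this help text does not
say how the shorter columns are padded in the IBIS file.

```python
def solution_columns(cols, coefficients, residuals):
    """Arrange one solution as the three output columns described above."""
    # Column 1: a zero for the constant term, then the M input column numbers
    index_col = [0] + list(cols)
    # Column 2: the M+1 regression coefficients, constant first
    coef_col = list(coefficients)
    # Column 3: one residual per data row
    resid_col = list(residuals)
    if len(coef_col) != len(index_col):
        raise ValueError("expected M+1 coefficients for M input columns")
    return index_col, coef_col, resid_col
```

For a solution using input columns 1, 5 and 7, the first output column would
read 0, 1, 5, 7, and the second would hold the constant followed by the
three matching coefficients.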
NOTE: The multiple regression technique assumes that the residuals are
uncorrelated and come from a normal distribution.
EXAMPLE
ibisregr DATA.TAB REGR.TAB COLS=(1,2,5,7,8) COLRANGE=(3,4) DEPCOL=10 NBEST=4 'STDERR 'PRINT
This case will search through all possible combinations of three and four of
the five specified columns (columns 1, 2, 5, 7, and 8) in the input file
DATA.TAB. The dependent variable is in column 10. The four best solutions,
ranked by standard error, will be retained and written to the file
REGR.TAB. The solution for each combination will be printed to the
terminal.
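The number of regressions this example performs can be checked with a short
calculation. The helper name "count_fits" is hypothetical; the count assumes
one fit for every combination allowed by COLS and COLRANGE.

```python
from math import comb

def count_fits(ncols, colrange):
    """Number of column combinations searched for a given COLRANGE."""
    lo, hi = colrange
    return sum(comb(ncols, m) for m in range(lo, hi + 1))

# Five candidate columns, taken three or four at a time:
# C(5,3) + C(5,4) = 10 + 5
print(count_fits(5, (3, 4)))  # prints 15
```

The count grows quickly with the number of candidate columns, so a narrow
COLRANGE keeps the search manageable.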
RESTRICTIONS
The maximum number of input columns is 40.
The maximum column length is 500.
The maximum amount of data (number of columns times column length) is 10000.
The maximum number of solutions that can be retained is 20.
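A request can be checked against these limits before running the program,
as in the sketch below. The function and constant names are hypothetical,
and "number of columns" is assumed here to mean the number of input columns
supplied to the search.

```python
MAX_COLUMNS = 40      # maximum number of input columns
MAX_LENGTH = 500      # maximum column length
MAX_DATA = 10000      # maximum columns times column length
MAX_SOLUTIONS = 20    # maximum value of NBEST

def check_limits(ncols, length, nbest):
    """Return a list of the restrictions that a request would violate."""
    problems = []
    if ncols > MAX_COLUMNS:
        problems.append("too many input columns")
    if length > MAX_LENGTH:
        problems.append("columns too long")
    if ncols * length > MAX_DATA:
        problems.append("too much data")
    if nbest > MAX_SOLUTIONS:
        problems.append("NBEST too large")
    return problems
```

Note that the data limit can apply even when the first two limits are met:
40 columns of length 500 amount to 20000 values, twice the allowed total.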
WRITTEN BY: L.W.Kamp, July 1987
(based on program "ibisstat")
REVISIONS:
JAN 2 1995 AS (CRI) Made portable for UNIX
PARAMETERS:
INP
Input IBIS tabular file
OUT
Optional output IBIS tabular file
DEPCOL
Dependent variable column
COLS
Columns to search through
COLRANGE
(Minimum,maximum) number of
columns in each combination
NBEST
Number of solutions to retain
DEPNAME
Heading for the dependent
variable.
COLNAMES
Headings for the columns
CRITERIA
Criteria for best solution
PRINT
Print output for each
iteration?