Help for IBISLSQ9

PURPOSE

   "ibislsq9" performs least squares fits on data in an IBIS interface
(tabular) file.  The solutions and/or residuals can be placed in 
specified columns of the file.  The solutions can also be output
to the terminal.  Multiple fits can be done on different parts of
one file.  A threshold can also be specified so that the program
will repeatedly perform least squares fitting while throwing out
outliers until all the residuals are under the defined threshold.
It first performs a "global" least squares fit using all of the
points in a grouped dataset.  (Datasets are grouped by the values in
CONCOL.)  For each datapoint whose residual is greater than the
threshold, it takes the closest 10% of the datapoints and performs a
"local" least squares fit.  If the local residual is still greater
than the threshold, the datapoint is thrown away.  If not, it is
kept.
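
The global-then-local screening described above can be sketched in
Python with NumPy.  This is only an illustration, not the program's
code: the function name reject_outliers, the use of Euclidean distance
in independent-variable space to define "closest", the inclusion of
the flagged point in its own neighborhood, and the lower bound on the
neighborhood size are all assumptions.  Only a single unweighted pass
is shown.

```python
import numpy as np

def reject_outliers(X, y, thresh, frac=0.10):
    """One pass of global-then-local outlier screening (illustrative only).

    X: (n, m) array of independent columns, y: (n,) dependent column.
    Returns a boolean mask that is True for the points that are kept.
    """
    n, m = X.shape
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)      # "global" fit
    resid = np.abs(y - X @ coef)
    keep = np.ones(n, dtype=bool)
    # Assumption: use at least m + 1 neighbors so the local fit is determined.
    k = max(m + 1, int(np.ceil(frac * n)))
    for i in np.flatnonzero(resid > thresh):
        # Assumption: "closest" means Euclidean distance between rows of X.
        dist = np.linalg.norm(X - X[i], axis=1)
        near = np.argsort(dist)[:k]                   # includes point i itself
        local, *_ = np.linalg.lstsq(X[near], y[near], rcond=None)  # "local" fit
        if abs(y[i] - X[i] @ local) > thresh:
            keep[i] = False                           # still an outlier: discard
    return keep

# Synthetic line y = 2x + 1 with one corrupted point at row 20.
x = np.arange(40.0)
X = np.column_stack([x, np.ones_like(x)])
y = 2.0 * x + 1.0
y[20] += 50.0
keep = reject_outliers(X, y, thresh=2.0)
print(np.flatnonzero(~keep))
```

To mimic the repeated fitting described above, the same function could
be called again on the kept rows until no residual exceeds the
threshold.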

If more than one dependent column is specified, outliers are
determined by the distance function

    "sqrt(residual(x1)^2 + residual(x2)^2 + ... + residual(xn)^2)"

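As a small worked example of the distance function, the following
Python sketch combines the residuals of one data point across several
dependent columns.  The function name combined_residual is
hypothetical and is used here only for illustration.

```python
import math

def combined_residual(residuals):
    """Combine the per-dependent-column residuals of one data point."""
    return math.sqrt(sum(r * r for r in residuals))

# Residuals of 3.0 and 4.0 in two dependent columns combine to 5.0,
# which would exceed a threshold of 2.0.
d = combined_residual([3.0, 4.0])
print(d, d > 2.0)
```
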
EXECUTION

     ibislsq9 INP=DATA.INT OUT=DATAOUT INDCOL=(1,2,3) DEPCOL=4
             CONCOL=7 RESCOL=8  COEFFCOL=(21,22,23) DEPCOL2=5
             RESCOL2=8 COEFFCOL2=(24,25,26) 'NOPRINT THRESH=2.0

     ibislsq9 INP=DATA.INT OUT=DATAOUT INDCOL=(1,2,3)
              DEPCOL=(4, 5) CONCOL=7 RESCOL=(8, 9)
              COEFFCOL=(21, 22, 23, 24, 25, 26) 'NOPRINT
              THRESH=2.0


    This example shows the use of all of the parameters.  The input
file, DATA.INT, is an IBIS interface file and the output file
is also an IBIS interface file.  The data for the independent
variables are in columns (1,2,3), and the data for the dependent
variables are in columns 4 and 5.  The control column, 7, allows
multiple fits to be done in one run.  The data points are grouped by
their control numbers and a least squares fit is performed on each
group of points.  The control numbers DO NOT need to be
contiguous in the file, meaning you can have control number 1.0 then
2.0 then 1.0 and all the 1.0s will still be grouped together.  If no
control column is specified then one fit is done on the whole file.
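
The grouping behavior can be illustrated with a short Python sketch.
The function name group_rows_by_control is hypothetical, and the
sketch assumes control values are compared for exact equality, which
the help text does not state.

```python
from collections import defaultdict

def group_rows_by_control(control_values):
    """Map each distinct control value to the row indices that carry it."""
    groups = defaultdict(list)
    for row, value in enumerate(control_values):
        groups[value].append(row)
    return dict(groups)

# Control numbers do not need to be contiguous in the file.
concol = [1.0, 2.0, 1.0, 2.0, 1.0]
groups = group_rows_by_control(concol)
print(groups)
```

Each list of row indices would then be fitted independently of the
others.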

    The COEFFCOL and RESCOL parameters specify the columns of the
file in which the results will be put.  If either or both are not
specified then the corresponding results will not be output.  For each
dependent column there must be as many coefficient columns as there
are independent columns, and these match the sequence of the
independent variable columns.  In the example above, coefficient
columns 21, 22, and 23 correspond to the dependent dataset in column 4
and coefficient columns 24, 25, and 26 correspond to the dependent
dataset in column 5.  The residual column can be used to easily
calculate the deviation of the data points from the fitted line.
Normally the solution for each set is printed to the terminal, but
this can be turned off with the 'NOPRINT keyword.
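
The correspondence between COEFFCOL entries and dependent columns in
the example above can be written out as a short Python sketch.  The
function name split_coeff_columns is hypothetical; the layout (one
consecutive run of coefficient columns per dependent column) follows
the example, and the length check is an assumption about how a
mismatch might be handled.

```python
def split_coeff_columns(coeffcol, n_indep, n_dep):
    """Split a flat COEFFCOL list into one run of columns per dependent column."""
    if len(coeffcol) != n_indep * n_dep:
        raise ValueError("need n_indep coefficient columns per dependent column")
    return [coeffcol[i * n_indep:(i + 1) * n_indep] for i in range(n_dep)]

# INDCOL=(1,2,3), DEPCOL=(4,5), COEFFCOL=(21,22,23,24,25,26)
layout = split_coeff_columns([21, 22, 23, 24, 25, 26], n_indep=3, n_dep=2)
print(layout)
```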

    The length of each set should, of course, be longer than the
number of independent variables (columns).  If it is not then the
least squares fit will not be called and values of -999.0 will be
put out for the solution.  If some columns of the independent data
are linearly dependent then the error MATRIX RANK TOO SMALL will be
printed, and -999.0's will be put out for the solution.  If there is
no solution then zeros will be put out for the residuals.
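
The two no-solution cases can be sketched in Python with NumPy.  This
is an illustration under stated assumptions: the function name
fit_or_flag is hypothetical, the rank test uses
numpy.linalg.matrix_rank as a stand-in for whatever check the program
performs, and residuals are taken as observed minus fitted.

```python
import numpy as np

NO_SOLUTION = -999.0   # fill value named in the help text

def fit_or_flag(X, y):
    """Return (coefficients, residuals), flagging sets that cannot be solved."""
    n, m = X.shape
    # Too few rows, or linearly dependent independent columns: no solution.
    if n <= m or np.linalg.matrix_rank(X) < m:
        return np.full(m, NO_SOLUTION), np.zeros(n)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef

# The second column is twice the first, so the rank is too small.
X_bad = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
bad_coef, bad_resid = fit_or_flag(X_bad, np.array([1.0, 2.0, 3.0]))

# A well-posed set: y = 2x + 1 with columns (x, 1).
X_ok = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
ok_coef, ok_resid = fit_or_flag(X_ok, np.array([1.0, 3.0, 5.0]))
print(bad_coef, bad_resid, ok_coef)
```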

EXAMPLES

   Suppose that columns 1 and 2 contain points (x,y) in a plane 
   and  column  7 contains a function  f(x,y).   The  following 
   sequence  will perform a quadratic least squares fit  h(x,y) 
   and place the residuals in column 8.

   mf INP=A FUNCTION=("C3=C1*C1","C4=C2*C2","C5=C1*C2","C6=1")
   ibislsq9 INP=A OUT=B INDCOL=(1,2,3,4,5,6) DEPCOL=7 RESCOL=8
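
   The same quadratic fit can be reproduced outside the program as a
   cross-check.  The Python sketch below builds the six columns that
   the mf step creates and solves the least squares problem with
   NumPy.  The function name quadratic_design and the synthetic f(x,y)
   are made up for illustration, and residuals are assumed to be
   observed minus fitted.

```python
import numpy as np

def quadratic_design(x, y):
    """Columns mirror C1..C6 after the mf step: x, y, x*x, y*y, x*y, 1."""
    return np.column_stack([x, y, x * x, y * y, x * y, np.ones_like(x)])

# Sample points on a 4 x 4 grid and a synthetic quadratic f(x,y).
gx, gy = np.meshgrid(np.arange(4.0), np.arange(4.0))
x, y = gx.ravel(), gy.ravel()
f = 1.0 + 2.0 * x - y + 0.5 * x * x + 0.25 * x * y

A = quadratic_design(x, y)
coef, *_ = np.linalg.lstsq(A, f, rcond=None)   # one coefficient per column
resid = f - A @ coef                           # what RESCOL would hold
print(np.round(coef, 6))
```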

   Suppose that you want the program to throw out outliers
   whose global and local residuals are both above 2.0:

   ibislsq9 INP=A OUT=B INDCOL=(1,2,3,4,5,6) DEPCOL=7 RESCOL=8
      THRESH=2.0

   Suppose now that columns 1, 2, and 3 represent the independent
   variables, columns 4 and 5 represent x' and y' respectively,
   column 7 represents the control column, columns 8 and 9 represent
   the residual columns for x' and y' respectively, columns 10,
   11, and 12 represent the coefficients of the solution for x',
   and columns 16, 17, and 18 represent the coefficients of the
   solution for y'.  Suppose also that we would like the maximum
   residual for any data point in the solution to be below 2.0 and
   we do not want the solution printed to the screen.  The
   following command performs this task:

   ibislsq9 inp=a out=b indcol=(1,2,3) depcol=(4,5)
      concol=7 rescol=(8,9) coeffcol=(10,11,12,16,17,18)
      thresh=2.0 'NOPRINT

RESTRICTIONS

The maximum number of independent columns (variables) is 20.

Original Programmers: P. Kim, A. L. Zobrist 24 Nov 2008
Current Cognizant Programmer: P. Kim, 24 Nov 2008

REVISIONS
2022-08-10 B. Crocco afids to opensource (untested)


PARAMETERS:


INP

Input IBIS interface file

OUT

Output IBIS interface file

INDCOL

Independent variable columns

DEPCOL

Dependent variable column(s)

COEFFCOL

Optional columns to place coefficients of the solution

RESCOL

Residuals column(s)

CONCOL

Control column

WGHTCOL

Weight column

NOPRINT

Keyword to suppress printout

THRESH

Threshold value for maximum residual value.

See Examples:


Cognizant Programmer: