Level 2 Help for MF4

INP

Specifies IBIS interface file. There is no output file. Results of MF3 are written in INP.

FUNCTION

     FUNCTION            
	 
MF3   allows   the  user  to  create  FORTRAN or C -like 
expressions to perform general mathematical operations on 
one  or more IBIS/graphics file columns.   The  expressions 
are  written as a parameter string.   The parameter  is 
interpreted  to determine the input and output  columns 
and   operations  to  be  performed.    Applies a user 
specified arithmetic expression to columns of a cagis table.
All results are computed in double precision (15 decimal
places) even if the input columns are single precision or
integer.

The functions available are: @sqrt, @alog, @alog10,
@aint, @sin, @cos, @tan, @asin, @acos, @atan,
@atan2, @abs, @min, @max, @mod, along with
standard binary operations +,- *, / and **, and
logic operations <, >, <=, >=, ==, !=, &&, ||,
^, and !.

String functions are also available.  The arguments
can be column names (must contain strings) or string
constants enclosed in single quotes, except for some arguments
which are numeric (e.g., see fstr below).  Examples:
	      @cat(a,b)  or @cat(a,'xxx')

The string functions are:

@cat(a,b)                concatenates a to b

@break(a,b)              outputs a up to first occurrence of a
			 character in b   (e.g., @break(a,'.,;:?'))

@fstr(a,m)               outputs the first m characters of a

@bstr(a,m)               outputs from the m'th character to the end of a

@adelete(a,b)            deletes any of b's characters from a
			 (e.g., @adelete(a,'.,;:?'))

@sdelete(a,b)            deletes occurrences of the whole string b from a
			 (e.g., @sdelete(a,'dog')

@trim(a,b)               trims from the low order end of a, all characters
			 in b, but stops trimming at the first non-b char

@ucase(a)                outputs a in upper case

@lcase(a)                outputs a in lower case

@ljust(a,n)              left justifies a in an n-character field.  if too
			 long, keeps high order part of a

@rjust(a,n)              right justifies a in an n-character field.  if too
			 long, keeps high order part of a

@replace(a,'dog=cat')    replaces all occurrences of the string before the
			 = with the string after the =

@strlen(a)               outputs the length of the string a

@pos(a,b)                finds the pattern b in a and returns its starting
			 position.  ^ is left anchor % is right anchor
			 ? matches any single character * matches a run
			 (e.g., @pos(a,'^a??.*z*%')).  There is a limit of 
			 three * in a pattern.  Also, patterns with a *
			 will find the shortest match, not the first match.

@streq(a,b)              returns TRUE or 1 if a equals b else FALSE or 0

@strsub(a,b)             returns TRUE or 1 if a contains b else FALSE or 0

@strpat(a,b)             returns TRUE or 1 if a contains the pattern b
			 else FALSE or 0.  see the syntax for @pos(a,b)

@num(a)                  returns the numeric value of string a, which must
			 contain an integer or floating number, can have exponent
			 such as 2.73e-06 (use e, E, d, or D).

@i2str(n)                converts the integer n to a string; zero goes to 0

@f2str(f,n)              converts the float or integer f to a floating
			 string with n digits of precision to the right of
			 the decimal; n=0 omits the decimal; rounding is
			 performed

@dmsstr(a)               converts the string degree-min-second into a
                         degree number.   Acceptable formats include
                         1332727.666, 1332727.666W, 1.332727E+06W,
                         133d27m27.666 where the - can be any non-numeric
                         separator other than .+-Ee.  The EWNS can be
                         lower case and can be at the front e1332727.666.
                         A minor point, exponent e or E can be followed
                         by a number or a sign, but d or D must be
                         followed by a sign.

@dmsnum(f)               converts the number degree-min-second into a
                         degree number.   Acceptable formats include
                         1332727.666, -1332727 (real or integer).

See the test pdf for more examples of the use of strings,
excerpted here:

!ibis-gen xx1 nr=1 nc=3 format=("A10","A12","DOUB","DOUB")  !      data=(0.00001) datacols=(3)  !      string=("aabbccddee") strcols=(1)

! use any one line of the next bunch

!mf4 xx1 f="c2='b'"
!mf4 xx1 f="c2=@trim(c1,'e')"
!mf4 xx1 f="c2=@break(c1,'db')"
!mf4 xx1 f="c2=@fstr(c1,7)"
!mf4 xx1 f="c2=@bstr(c1,7)"
!mf4 xx1 f="c2=@adelete(c1,'bd')"
!mf4 xx1 f="c2=@sdelete(c1,'bb')"
!mf4 xx1 f="c2=@sdelete(c1,'bc')"   
!mf4 xx1 f="c2=@replace(c1,'bb=qqq')"
!mf4 xx1 f="c2=@replace(c1,'bc=qqq')"
!mf4 xx1 f="c3=@strlen(c1)"
!mf4 xx1 f="c3=@strlen('abc')"
!mf4 xx1 f="c3=@streq(c1,'aabbccddee')"
!mf4 xx1 f="c3=@streq(c1,'aabbccddeef')"
!mf4 xx1 f="c3=@strsub(c1,'bb')"
!mf4 xx1 f="c3=@strsub(c1,'bc')"
!mf4 xx1 f="c2=@ljust(c1,12)"
!mf4 xx1 f="c2=@rjust(c1,12)"
!mf4 xx1 f="c3=@num('23456')"
!mf4 xx1 f="c3=@num('23456.7890123')"
!mf4 xx1 f="c2=@i2str(1234567890)"
!mf4 xx1 f="c2=@i2str(-75)"
!mf4 xx1 f="c2=@f2str(47.55555,2)"
!mf4 xx1 f="c3=@pos(c1,'bb')"
!mf4 xx1 f="c3=@pos(c1,'bc')"
!mf4 xx1 f="c3=@pos(c1,'^a')"
!mf4 xx1 f="c3=@pos(c1,'b*d')"
!mf4 xx1 f="c3=@pos(c1,'e%')"
!mf4 xx1 f="c3=@pos(c1,'b?c')"
!mf4 xx1 f="c2=@strpat(c1,'bbc')"
!mf4 xx1 f="c2=@strpat(c1,'^a')"
!mf4 xx1 f="c2=@strpat(c1,'b*d')"
!mf4 xx1 f="c2=@strpat(c1,'e%')"
!mf4 xx1 f="c2=@strpat(c1,'b?c')"

!ibis-list xx1 csiz=(16,16,16,16) cfor="%16s %16s %16.12f %16.12f"


All operations work as in the c language with
except for the column operations described below
and the ^ operator noted above.
The special variable @index may be used to insert
the record number into an expression.  The special
variable @rand may be used to put a random number
between 0 and 1 in the column.  If @rand is used, the
parameter seed can be used to vary the random sequence
Multiple formulas may be given by separating them with the
$ character.

Column operations are added features that perform
specialized functions to the table.  Two restrictions
must be observed:

1. Column operations cannot be used in a formula.
2. The arguments must be column names, not constants
   or expressions.

They perform an operation on columns placing
results in a column.  The operations @fill and
@interp require a column of values separated by
zeros.

The column operations are:

@cavg(col1,col2)        Replace the values in col1 with
                        the average using col2 as control.
                                                                                                                                                       
@csigma(col1,col2)      Replacd the values in col1 with
                        the standard deviation of the
                        values using col2 as control.
                                                                                                                                                       
@average(col)           calculates the average in
            			the column

@sigma(col)             calculates the standard deviation in
		            	the column

@sum(col)               sum the values in the column

@rsum(col)              running sum of values in the
			            column

@csum(col1,col2)        controlled sum; sum the values
			            in col1 using col2 as a control
			            column, restarts the sum for a
			            change in the value in col2

@crsum(col1,col2)       controlled running sum; running
			sum of values in col1 but restarts
			the sum for a change in the value
			in col2

@vmax(col)              calculates the maximum in the column

@vmin(col)              calculates the minimum in the column

@cvmax(col1,col2)       controlled maximum; calculates the
			maximum in col1 using col2 as a control
			column, restarts the max for a change
			in the value in col2

@cvmin(col1,col2)       controlled minimum; calculates the
			minimum in col1 using col2 as a control
			column, restarts the min for a change
			in the value in col2

@diff(col)              subtracts the value in the previous
			record from the value in the current
			record

@cdiff(col1,col2)       subtracts the value in the previous
			record from the value in the current
			record; restarts the operation for a
			change in the value in col2

@shift(col,n)           shifts downward n records,
			negative n for upward shift;
			downward shift replicates first
			value in column while upward
			shift replicates last value

@rotate(col,n)          same as shift except values that
			are rotated off the end of the
			column are wrapped around to the
			other end

@interp(col1,col2)      replace zero values between non-zero
			values in col1 by interpolating
			between the non-zero values in col1
			to corresponding values in col2;
			col2 may contain @index in which case
			interpolation is linear or it may
			contain some other function
			(i.e. logarithmic or exponential)

@fill(col)              fill the zeros in the column with the
			previous non-zero value in the column

@dist(lon1,lat1,lon2,lat2,dist)     calculate the distance in meters
				    between the two geographic points
				    on the Earth.  A spherical formula
				    is used above 1.05 degrees and a
				    plane formula is used below .95
				    degrees of central arc.  Between
				    these values, both formulas are used
				    and the result is a linear
				    interpolation of both formulas.
				    This is done to give a continuous
				    result.  Results near the poles
				    are not guaranteed accurate.

@head(lon1,lat1,lon2,lat2,head)     calculate the heading of the line
				    from the first to the second point
				    in degrees clockwise from north.
				    The interpolation technique used
				    in @dist is applied here.

@bear(lon1,lat1,lon2,lat2,lon3,lat3,bear) calculate the bearing of the
					  line from the first to the
				    second point clockwise in degrees
				    from the line from the first to the
				    third point.  The interpolation
				    technique used in @dist is
				    applied here.


A full example of an fstring to calculate a
time increment dt from a column t is
fstring="dt=t$shift(dt,-1)$dt=t-dt"

RESTRICTIONS

1.     Maximum number of columns in one execution is 100.
2.     The number of columns in the IBIS file is not limited here.
3.     Maximum input string length is 10,000 (40 x 250).
4.     Maximum number of operations is 3000.
5.     Maximum number of temp locations is 938.
6.     Maximum number of constants from the expression is 960.

notes:

1.  Column numbers greater than 100 are mapped sequentially 1,2,3...
3.  The input parameter is a string array (40) each with 250 char
    The array is concatenated by the program, be careful to 
    distinguish the TAE-TCL continuation + from function + by using
    quotes around each member of the array.
4.  These can be counted by setting debug to one and counting the
    lines that begin with "xknuth:op,opnd".  The count is not
    easily determined by looking at a long input.
5.  These can be counted by setting debug to one and counting the
    lines that begin with "xknuth:op,opnd" and having an opnd
    value above 1061.  The count is not easily determined by
    looking at a long input.
6.  These can be counted in the input, or by setting debug to one
    and counting the lines that begin with "xknuth:op,opnd" and
    having an opnd value between 103 and 1061, inclusive.

SEED


Suppose that mf4 is used to set two columns to random values, or
a function involving a random values, in two executions of mf4.
Since the same random sequence would be used, the columns
would correlate.  To avoid this, use different values of seed
in the two executions of mf4 to get different random sequences
(actually two subsequences of a very long equirandom sequence).
For more on the math, see the SUN documentation on srand48.

DEBUG

The symbols are read in a left to right parse. Look at the code in sp_knuth and its subroutines to interpret. The operations are listed in sp_xknuth. Operands are: 0-100 columns (mapped 0,1,2,3... regardless of actual columns) 101 random number 102 row index 103-1061 constants from the expression, or a string ref 1062+ temp locations, or temp string refs The old optimizer from 1975 has been turned off because of algorithm changes (see code) so there will be some inefficiencies in the load and store from temp locations.

CODE

Set 1 to see pseudo code