Using a Procedural Interface for SORT

The SORT/MERGE procedural interface provides a means for those languages—such as Pascal—that do not support a SORT or MERGE statement to invoke the SORT utility in the MCP. This is done through library calls to procedures made available in the system library GENERALSUPPORT. For a program to invoke these library entry points, the program must be in a language that supports the type of parameters described for these entry points.

SORTFILES

This procedure sorts an input file into a specified output file. The declaration of this procedure in GENERALSUPPORT is as follows:

PROCEDURE SORTFILES (INFO_STRING,
                     TRANS_TABLE,
                     COMPARE_PROC,
                     OUTPUT_FILE,
                     INPUT_FILE);
EBCDIC ARRAY INFO_STRING [*];
ARRAY TRANS_TABLE [*];
BOOLEAN PROCEDURE COMPARE_PROC (A, B);
   ARRAY A, B[*]; FORMAL;
FILE   INPUT_FILE, OUTPUT_FILE;

Table 73 describes the input attributes for the procedure.

Table 73.  Input Attributes for the SORTFILES Procedure

Attribute

Explanation

INFO_STRING

A string in which the user can describe some additional parameters for the sort. The format of this string is described following this table.

TRANS_TABLE

An optional array containing a translate table that identifies the alternate collating sequence to use for the sort. Translation of keys using the collating sequence is performed only on alphanumeric key types. This parameter is used only if the COLLSEQ clause is present in the INFO_STRING parameter and the user compare procedure is not to be used. The TTABLE routine should have been used to build the translate table prior to passing this array for use by the sort. The TTABLE routine is described under “TTABLE” later in this section. If TRANS_TABLE is used, this array should be at least 64 words long.

COMPARE_PROC

A reference to the user's compare procedure. The compare procedure must be coded as a Boolean function with two array parameters. The function is called every time two records are to be compared. This procedure is called by the sort only if an indication to use it is present in the info string (COMP). The alternative to using a compare procedure is to include the key description in the KEY clause of the INFO_STRING parameter.

OUTPUT_FILE

The file to which the output from the sort is written. If the file is a disk file for which the file attribute SAVEFACTOR has a nonzero value, it is closed and locked after the sort. The output file must not be open when it is passed to SORTFILES by the program.

INPUT_FILE

The input file to be used during the sort. The input file must not be open when it is passed to SORTFILES by the program.


The INFO_STRING parameter lets the user specify additional parameters for the sort or merge. The parameter is written as a series of clauses. The clauses must be separated by blanks or commas, and the entire string must be terminated by a period. The only required clauses are RSZ and either KEY or COMP.

Table 74 shows the clauses that can be specified in the info string.

Table 74. Clauses That Can Be Specified in the SORTFILES Info String

Clause

Explanation

RSZ=record length

Record length in bytes.

KEY=key-desc

Describes a single key.

KEY=(d,...,d)

Describes several keys. The first key listed is the primary key. Subsequent keys are used to further sort or merge any records that were identical in terms of the previous key. Each key description takes the form: position/length/sequence/type.

  • Position – Position beginning at 1, in bytes, unless hex type. If type is hex, position must be specified in hex digits.

  • Length – In bytes, unless hex type. If type is hex, length must be specified in hex digits. Default is 1.

  • Sequence – A (ascending) or D (descending). Default is A.

  • Type – Character specifying format of the key. Each format is byte- or hex-oriented. This orientation affects position and length.

    The types are:

    • N - display numeric (default, byte-oriented)

    • S - alphabetic/alphanumeric (byte-oriented)

    • B - integer binary (byte-oriented)

    • E - real, binary with exponent (byte-oriented)

    • X - double, extended (byte-oriented)

    • V - left overpunched sign (byte-oriented)

    • Q - right overpunched sign (byte-oriented)

    • R - left separate sign (byte-oriented)

    • T - right separate sign (byte-oriented)

    • U - unsigned (hex-oriented)

    • L - left signed (hex-oriented)

    • P - right signed (hex-oriented)

COMP

Appears only if a comparison routine is used. Either COMP or KEY must appear, but not both, meaning that if there is no comparison routine, then KEY must be used.

COLLSEQ

Indicates an alternate collating sequence is to be used.

DUPL

Indicates that duplicate keys should be ordered based on order of appearance (for SORT only).

CORE=size

Core size, in words (default 12000).

WORKFILE=type

Describes type of a single workfile. WORKFILE can be abbreviated to WORK.

WORKFILE=(t,t)

Describes disk and tape workfiles. The workfile type can be:

  • DISK/size. Indicates disk workfile. Size is number of words. Default size is 600,000 words.

  • TAPE/#tapes. Indicates tape workfiles. #tapes indicates number of tapes between 3 and 8. Default number is 3.

If no workfiles are specified, a memory only sort is performed.

CCSVERSION=name

Specifies the coded character set version.

CCSVERSION

Uses default version if no name is specified.

RESTART=r-val

Describes restart functions. The default is 0 (zero).

The allowable values for r-val are:

  • 0 – No restart capability.

  • 1 – Restart previous sort. The prior uncompleted sort must have been capable of restart.

  • 2 – Allow restartable sort.

  • 4, 6, or 12 – Allow restartable sort and enable extensive error recovery from I/O errors.

  • 9 – Restart previous sort if all input has been received. The prior uncompleted sort must have been capable of restart.

  • 10 and 14 – Allow restartable sort after all input received.
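
As an illustration of the clause syntax in Table 74, the following standard Pascal sketch holds three info strings as constants and prints them. The record length, key positions, core size, and workfile size are made-up values chosen only for illustration, and the program does not call GENERALSUPPORT. In a real program, a string of this form would be assigned to the variable passed as the INFO_STRING parameter.

```pascal
program InfoStringSketch;
{ Illustrative info strings only; every numeric value is an assumption. }
const
  { One ascending alphanumeric key in bytes 1 through 10 of an 80-byte record. }
  { No WORKFILE clause, so a memory-only sort would be performed. }
  SingleKey = 'RSZ = 80, KEY = 1/10/A/S.';
  { Same primary key; ties are broken by a descending display-numeric key }
  { of 6 bytes starting at byte 11. DUPL keeps duplicate keys in order of }
  { appearance, and one disk workfile of 200000 words is requested. }
  TwoKeys = 'RSZ = 80, KEY = (1/10/A/S, 11/6/D/N), DUPL, WORKFILE = DISK/200000.';
  { COMP replaces KEY when the user compare procedure decides the order. }
  UserCompare = 'RSZ = 80, COMP, CORE = 20000.';
begin
  writeln(SingleKey);
  writeln(TwoKeys);
  writeln(UserCompare);
end.
```

Each string contains RSZ plus exactly one of KEY or COMP, separates its clauses with commas, and ends with a period, as the rules above require.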


SORTPROCS

This procedure performs the sort on a set of records where input is provided from a procedure in the user's program and output is sent to a procedure in the user's program. SORTPROCS should be used when there are files to be sorted which are variable in length. The declaration of this procedure in GENERALSUPPORT is as follows:

PROCEDURE SORTPROCS (INFO_STRING,
                     TRANS_TABLE,
                     COMPARE_PROC,
                     OUTPUT_PROC,
                     INPUT_PROC);
EBCDIC ARRAY INFO_STRING [*];
ARRAY TRANS_TABLE [*];
BOOLEAN PROCEDURE COMPARE_PROC (A, B);
   ARRAY A, B[*]; FORMAL;
PROCEDURE OUTPUT_PROC (B, A);
   VALUE B;
   BOOLEAN B;
   ARRAY A[*];    FORMAL;
BOOLEAN PROCEDURE INPUT_PROC (A);
   ARRAY A[*];    FORMAL;

Table 75 describes the input attributes for the procedure.

Table 75. Input Attributes for the SORTPROCS Procedure

Attribute

Explanation

INFO_STRING

A string in which the user can describe some additional parameters for the sort. (Refer to the description of SORTFILES for a complete description of this string.)

TRANS_TABLE

An optional array containing a translate table that identifies the alternate collating sequence to use for the sort. Translation of keys using the collating sequence is performed only on alphanumeric key types. This parameter is used only if the COLLSEQ clause is present in the INFO_STRING parameter and the user compare procedure is not to be used. The TTABLE routine should have been used to build the translate table prior to passing this array for use by the sort. The TTABLE routine is described under “TTABLE” later in this section. If TRANS_TABLE is used, the size of this array should be at least 64 words in length.

COMPARE_PROC

A reference to the user's compare procedure. The compare procedure must be coded as a Boolean function with two array parameters. The function is called every time two records are to be compared. This procedure is called by the sort only if an indication to use it is present in the info string. The alternative to using a compare procedure is to include the key description in the KEY clause of the INFO_STRING parameter.

OUTPUT_PROC

The procedure in the user's program that receives each output record from the sort. This procedure takes two parameters. The first parameter is a Boolean parameter, and the second parameter is an array parameter which contains the output record. The array parameter can be any type which is compatible with an ALGOL Real array. The Boolean parameter contains FALSE as long as the second parameter contains a valid sorted record. When all the records have been returned, the first parameter is TRUE and the second parameter must not be accessed.

INPUT_PROC

The function which provides the input records to be used during the sort. This is a Boolean function with one array parameter in which the record to be used during the sort is provided. The function is to return a TRUE result to indicate the end of input data.
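
The calling sequence described for INPUT_PROC and OUTPUT_PROC can be modeled in a few lines of standard Pascal. In this sketch, ModelSort is a hypothetical stand-in for the library procedure, each record is reduced to a single integer, and the ordering step is a simple exchange sort; none of these details come from GENERALSUPPORT itself.

```pascal
program SortProcsProtocolSketch;
{ Models only the INPUT_PROC / OUTPUT_PROC calling sequence. }
{ ModelSort is a hypothetical stand-in, not the GENERALSUPPORT procedure. }
const
  MaxRecs = 5;
type
  RecArray = array [1..MaxRecs] of integer;
var
  Source: RecArray;   { records that the user program supplies }
  NextIn: integer;    { index of the next record to hand over }

{ Returns TRUE at end of input; otherwise fills Rec and returns FALSE. }
function InputProc(var Rec: integer): boolean;
begin
  if NextIn > MaxRecs then
    InputProc := true
  else
  begin
    Rec := Source[NextIn];
    NextIn := NextIn + 1;
    InputProc := false;
  end;
end;

{ Receives one record per call while Done is FALSE. }
{ When Done is TRUE, Rec is not accessed. }
procedure OutputProc(Done: boolean; var Rec: integer);
begin
  if Done then
    writeln('end of output')
  else
    writeln(Rec);
end;

procedure ModelSort;
var
  Work: RecArray;
  Count, I, J, Tmp: integer;
begin
  Count := 0;
  { Phase 1: call the input function until it reports end of input. }
  while not InputProc(Tmp) do
  begin
    Count := Count + 1;
    Work[Count] := Tmp;
  end;
  { Phase 2: put the records in ascending order (exchange sort). }
  for I := 1 to Count - 1 do
    for J := I + 1 to Count do
      if Work[J] < Work[I] then
      begin
        Tmp := Work[I];
        Work[I] := Work[J];
        Work[J] := Tmp;
      end;
  { Phase 3: hand back each record with FALSE, then signal the end with TRUE. }
  for I := 1 to Count do
    OutputProc(false, Work[I]);
  Tmp := 0;
  OutputProc(true, Tmp);
end;

begin
  Source[1] := 42; Source[2] := 7; Source[3] := 19;
  Source[4] := 3;  Source[5] := 25;
  NextIn := 1;
  ModelSort;
end.
```

Run as written, the model hands back 3, 7, 19, 25, and 42 in that order and then makes one final call with the Boolean parameter set to TRUE.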


MERGEFILES

This procedure merges from two to eight files into one file. The declaration of this procedure in GENERALSUPPORT is as follows:

PROCEDURE MERGEFILES (INFO_STRING,
                      NUM_FILES,
                      TRANS_TABLE,
                      COMPARE_PROC,
                      OUTPUT_FILE,
                      INPUT_FILE1,
                      INPUT_FILE2,
                      INPUT_FILE3,
                      INPUT_FILE4,
                      INPUT_FILE5,
                      INPUT_FILE6,
                      INPUT_FILE7,
                      INPUT_FILE8);
EBCDIC ARRAY INFO_STRING [*];
ARRAY TRANS_TABLE [*];
BOOLEAN PROCEDURE COMPARE_PROC (A, B);
   ARRAY A, B[*]; FORMAL;
FILE OUTPUT_FILE,
     INPUT_FILE1, INPUT_FILE2, INPUT_FILE3, INPUT_FILE4,
     INPUT_FILE5, INPUT_FILE6, INPUT_FILE7, INPUT_FILE8;

Table 76 describes the input attributes for the procedure.

Table 76.  Input Attributes for the MERGEFILES Procedure

Attribute

Explanation

INFO_STRING

A string in which the user can describe some additional parameters for the merge. Refer to “SORTFILES” earlier in this section for a complete description of this string.

NUM_FILES

Indicates how many of the input files are valid and are to be used during the merge.

TRANS_TABLE

An optional array containing a translate table that identifies the alternate collating sequence to use for the merge. Translation of keys using the collating sequence is performed only on alphanumeric key types. This parameter is used only if the COLLSEQ clause is present in the INFO_STRING parameter and the user compare procedure is not to be used. The TTABLE routine should have been used to build the translate table prior to passing this array for use by the merge. The TTABLE routine is described under “TTABLE” later in this section. If TRANS_TABLE is used, this array should be at least 64 words long.

COMPARE_PROC

A reference to the user's compare procedure. The compare procedure must be coded as a Boolean function with two array parameters. The function is called every time two records are to be compared. This procedure is only called by the merge if an indication to use it is present in the info string. The alternative to using a compare procedure is to include the key description in the info string.

OUTPUT_FILE

The file to which the output from the merge is written. If the file is a disk file for which the file attribute SAVEFACTOR has a nonzero value, it is closed and locked after the MERGE. The output file must not be open when it is passed to MERGEFILES by the program.

INPUT_FILE1

An input file to be used during the merge. The input files (INPUT_FILE1 through INPUT_FILE8) must not be open when they are passed to MERGEFILES by the program.

INPUT_FILE2

An input file to be used during the merge.

INPUT_FILE3

An input file to be used during the merge. This may be a dummy file if only 2 files are to be merged.

INPUT_FILE4

An input file to be used during the merge. This may be a dummy file if only 3 or fewer files are to be merged.

INPUT_FILE5

An input file to be used during the merge. This may be a dummy file if only 4 or fewer files are to be merged.

INPUT_FILE6

An input file to be used during the merge. This may be a dummy file if only 5 or fewer files are to be merged.

INPUT_FILE7

An input file to be used during the merge. This may be a dummy file if only 6 or fewer files are to be merged.

INPUT_FILE8

An input file to be used during the merge. This may be a dummy file if only 7 or fewer files are to be merged.


Note: To merge only two files, specify 2 for the NUM_FILES parameter, pass the two files to be merged as the first two input file parameters, and pass dummy files for the remaining input file parameters. The library procedure ignores the dummy files.

MERGEPROCS

This procedure uses from two to eight input procedures to perform the merge, giving the results to the output procedure. MERGEPROCS should be used when there are files to be merged which are variable in length. The declaration of this procedure in GENERALSUPPORT is as follows:

PROCEDURE MERGEPROCS (INFO_STRING,
                      NUM_PROCS,
                      TRANS_TABLE,
                      COMPARE_PROC,
                      OUTPUT_PROC,
                      INPUT_PROC1,
                      INPUT_PROC2,
                      INPUT_PROC3,
                      INPUT_PROC4,
                      INPUT_PROC5,
                      INPUT_PROC6,
                      INPUT_PROC7,
                      INPUT_PROC8);
EBCDIC ARRAY INFO_STRING [*];
ARRAY TRANS_TABLE [*];
BOOLEAN PROCEDURE COMPARE_PROC (A, B);
   ARRAY A, B[*]; FORMAL;
PROCEDURE OUTPUT_PROC (B, A);
   VALUE B;
   BOOLEAN B;
   ARRAY A[*];    FORMAL;
BOOLEAN PROCEDURE INPUT_PROC1 (A);
   ARRAY A[*];    FORMAL;
BOOLEAN PROCEDURE INPUT_PROC2 (A);
   ARRAY A[*];    FORMAL;
BOOLEAN PROCEDURE INPUT_PROC3 (A);
   ARRAY A[*];    FORMAL;
BOOLEAN PROCEDURE INPUT_PROC4 (A);
   ARRAY A[*];    FORMAL;
BOOLEAN PROCEDURE INPUT_PROC5 (A);
   ARRAY A[*];    FORMAL;
BOOLEAN PROCEDURE INPUT_PROC6 (A);
   ARRAY A[*];    FORMAL;
BOOLEAN PROCEDURE INPUT_PROC7 (A);
   ARRAY A[*];    FORMAL;
BOOLEAN PROCEDURE INPUT_PROC8 (A);
   ARRAY A[*];    FORMAL;

Table 77 describes the input attributes for the procedure.

Table 77.  Input Attributes for the MERGEPROCS Procedure

Attribute

Explanation

INFO_STRING

A string in which the user can describe some additional parameters for the merge. Refer to “SORTFILES” earlier in this section for a complete description of this string.

NUM_PROCS

Indicates how many of the input procedures are valid and are to be used during the merge.

TRANS_TABLE

An optional array containing a translate table that identifies the alternate collating sequence to use for the merge. Translation of keys using the collating sequence is performed only on alphanumeric key types. This parameter is used only if the COLLSEQ clause is present in the INFO_STRING parameter and the user compare procedure is not to be used. The TTABLE routine should have been used to build the translate table prior to passing this array for use by the merge. The TTABLE routine is described under “TTABLE” later in this section. If TRANS_TABLE is used, this array should be at least 64 words long.

COMPARE_PROC

A reference to the user's compare procedure. The compare procedure must be coded as a Boolean function with two array parameters. The function is called every time two records are to be compared. This procedure is called by the merge only if an indication to use it is present in the info string. The alternative to using a compare procedure is to include the key description in the KEY clause of the INFO_STRING parameter.

OUTPUT_PROC

The procedure in the user's program that receives each output record from the merge. This procedure takes two parameters. The first parameter is a Boolean parameter, and the second parameter is an array parameter which contains the output record. The Boolean parameter contains FALSE as long as the second parameter contains a valid sorted record. When all the records have been returned, the first parameter is TRUE and the second parameter must not be accessed.

INPUT_PROC1 through INPUT_PROC8

The functions that provide the input records to be used during the merge. These are Boolean functions, each with one array parameter in which the record to be used during the merge is provided. Each function returns TRUE to indicate the end of its input data. Only INPUT_PROC1 and INPUT_PROC2 are required; the rest can be dummy functions that are never accessed during the merge.


Note: To use only two procedures for the merge, specify 2 for the NUM_PROCS parameter, pass the two procedures to be used as the first two input procedure parameters, and pass dummy procedures for the remaining input procedure parameters. The library procedure ignores the dummy procedures.
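
To make the two-input case concrete, the following standard Pascal sketch models a merge of two inputs. It rests on assumptions that are not stated in this section: that each input is already in order, and that the compare function returns TRUE when its first record may precede its second (mirroring the S1 <= S2 test in the MergeFiles sample program). Records are reduced to integers and the output is simply printed; the real MERGEPROCS is not called.

```pascal
program MergeSketch;
{ Two-way merge model; the real MERGEPROCS accepts two to eight inputs. }
const
  LenA = 3;
  LenB = 4;
var
  A: array [1..LenA] of integer;   { first ordered input }
  B: array [1..LenB] of integer;   { second ordered input }
  IA, IB: integer;

{ Assumed convention: TRUE when X may precede Y. }
function Precedes(X, Y: integer): boolean;
begin
  Precedes := X <= Y;
end;

begin
  A[1] := 2; A[2] := 9; A[3] := 14;
  B[1] := 1; B[2] := 9; B[3] := 10; B[4] := 20;
  IA := 1;
  IB := 1;
  { While both inputs still have records, emit whichever record precedes. }
  while (IA <= LenA) and (IB <= LenB) do
    if Precedes(A[IA], B[IB]) then
    begin
      writeln(A[IA]);
      IA := IA + 1;
    end
    else
    begin
      writeln(B[IB]);
      IB := IB + 1;
    end;
  { One input is exhausted; copy the remainder of the other. }
  while IA <= LenA do
  begin
    writeln(A[IA]);
    IA := IA + 1;
  end;
  while IB <= LenB do
  begin
    writeln(B[IB]);
    IB := IB + 1;
  end;
end.
```

The model emits 1, 2, 9, 9, 10, 14, and 20, taking the record from the first input when the two compare as equal.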

TTABLE

This routine provides the means for the user to specify an alternate collating sequence to be used for the sort or merge procedural interface. This routine creates a translate table based on the user's specification of a collating sequence. The format for this library procedure declaration is as follows:

PROCEDURE TTABLE (IN_ARRAY,
                  TRANS_TABLE);
  EBCDIC ARRAY IN_ARRAY [*];
  ARRAY TRANS_TABLE [*];

Table 78 describes the input attributes for the procedure.

Table 78.  Input Attributes for the TTABLE Procedure

Attribute

Explanation

IN_ARRAY

The array that contains the user-specified collating sequence. The contents must be in the following form and must contain a terminating period to denote the end of the string:

  • To specify a user-defined collating sequence, use the following form:

                  [ - literal-2                    ]
       literal-1  [ & literal-3 [ & literal-4 ]... ]

      [            [ - literal-6                    ] ]
      [ literal-5  [ & literal-7 [ & literal-8 ]... ] ]...

    • Literals can be either character strings delimited by quotation marks (") or numeric literals in the range 0 through 255. Numeric literals denote the ordinal number of a character within the native character set. The literals preceding and following the ampersand (&) and hyphen (-) characters must be one character in length.

      Example

      "WGF" "R" & 232 90 - 100

    • The order in which the literals appear specifies, in ascending sequence, the ordinal number of the character within the collating sequence being specified.

    • Any characters within the native collating sequence that are not explicitly specified in the literal phrase assume a position greater than any of the explicitly specified characters in the collating sequence being specified. The relative order within the set of these unspecified characters is unchanged from the native collating sequence.

    • If the hyphen (-) character is used, the set of contiguous characters in the native character set beginning with the character specified by the value of literal-1 and ending with the character specified by the value of literal-2 is assigned a successive ascending position in the collating sequence being specified. A given hyphen (-) specification can specify characters of the native character set in either ascending or descending sequence.

    • If the ampersand (&) character is specified, the characters of the native character set specified by the value of literal-1, literal-3, literal-4, and so on are assigned to the same position in the collating sequence being specified.

  • To specify the ASCII collating sequence, the IN_ARRAY parameter must contain only the word ASCII.

    ASCII is not delimited by double quotes, must be in uppercase, and must end in a period (.).

TRANS_TABLE

The array that receives the output from the routine and contains the resulting translate table. This array must be at least 64 words in length.


The translate table resulting from this routine can be passed to the SORTPROCS, SORTFILES, MERGEPROCS, or MERGEFILES routine, with a COLLSEQ clause in the INFO_STRING parameter specifying that this table is to be used.
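
The ordering rules for IN_ARRAY can be checked against the example "WGF" "R" & 232 90 - 100 with a small standard Pascal model. The sketch below makes several assumptions: the native character set is EBCDIC (so W, G, F, and R are ordinals 230, 199, 198, and 217), each character of a multi-character literal takes a successive position, and positions are numbered from 1. It models the resulting order only and does not reproduce the layout of the 64-word table that TTABLE builds.

```pascal
program CollatingSketch;
{ Models the ordering rules stated for IN_ARRAY, not the TTABLE output format. }
var
  Position: array [0..255] of integer;  { collating position per native ordinal }
  NextPos, C: integer;

{ Gives one character the next free position. }
procedure PlaceOne(Code: integer);
begin
  Position[Code] := NextPos;
  NextPos := NextPos + 1;
end;

{ Ampersand form: both characters share one position. }
procedure PlaceSame(CodeA, CodeB: integer);
begin
  Position[CodeA] := NextPos;
  Position[CodeB] := NextPos;
  NextPos := NextPos + 1;
end;

{ Hyphen form: contiguous native characters, ascending or descending. }
procedure PlaceRange(First, Last: integer);
var
  Code: integer;
begin
  if First <= Last then
    for Code := First to Last do PlaceOne(Code)
  else
    for Code := First downto Last do PlaceOne(Code);
end;

begin
  for C := 0 to 255 do Position[C] := 0;   { 0 means not yet specified }
  NextPos := 1;
  { "WGF" "R" & 232 90 - 100, letters written as assumed EBCDIC ordinals }
  PlaceOne(230); PlaceOne(199); PlaceOne(198);   { W, G, F }
  PlaceSame(217, 232);                           { R and ordinal 232 }
  PlaceRange(90, 100);
  { Unspecified characters follow, keeping their native relative order. }
  for C := 0 to 255 do
    if Position[C] = 0 then PlaceOne(C);
  writeln('W=', Position[230], ' G=', Position[199], ' F=', Position[198]);
  writeln('R=', Position[217], ' 232=', Position[232]);
  writeln('90=', Position[90], ' 100=', Position[100], ' 0=', Position[0]);
end.
```

Under these assumptions W, G, and F take positions 1, 2, and 3; R and ordinal 232 share position 4; ordinals 90 through 100 take positions 5 through 15; and ordinal 0, the first unspecified character, takes position 16.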

Sample Programs

SortProcs Example Program

PROGRAM P;
  TYPE
    Info_string_type   = PACKED ARRAY [1..500] OF CHAR;
    File_Buff          = PACKED ARRAY [1..35] OF CHAR;
    Array_type         = RECORD
      CASE BOOLEAN OF
        TRUE : (F_Arry   : File_Buff);
        FALSE: (R_Arry   : PACKED ARRAY [1..6] OF REAL);
      END;
    Trans_table_type   = ARRAY [1..64] OF REAL;
LIBRARY Gensupport (LIBACCESS = BYFUNCTION,
                    FUNCTIONNAME = 'GENERALSUPPORT');
PROCEDURE SortProcs  (VAR Info_string : Info_String_type;
                      VAR Trans_table : Trans_table_type;
                      FUNCTION Compare_proc
                              (VAR Ary1 : Array_type;
                               VAR Ary2 : Array_type):
                                                   BOOLEAN;
                      PROCEDURE Output_proc
                              (Done     : BOOLEAN;
                               VAR Ary  : Array_type);
                      FUNCTION Input_proc
                              (VAR Ary  : Array_type):
                                                    BOOLEAN);
           Gensupport;
VAR
  If_string     : Info_string_type;
  T_table       : Trans_table_type;
  In_file       : FILE OF File_buff;
  Out_file      : FILE OF File_buff;
  I_Status      : INTEGER;
FUNCTION Dummy_Comp_proc (VAR Ary1 : Array_type;
                          VAR Ary2 : Array_type) : BOOLEAN;
  (* Not called by the sort here: the info string below specifies KEY, not COMP. *)
  BEGIN
  IF Ary1.R_Arry[1] > Ary2.R_Arry[1] THEN
     Dummy_Comp_proc := TRUE
  ELSE
     Dummy_Comp_proc := FALSE;
  END;
PROCEDURE Output_proc (Done: BOOLEAN;
                       VAR Ary: Array_type);
  BEGIN
  IF Done THEN
     CLOSE (Out_file, SAVE)
  ELSE
     BEGIN
     Out_file@ := Ary.F_Arry;
     PUT (Out_file);
     END;
  END;
FUNCTION Input_Proc (VAR Ary: Array_type): BOOLEAN;
  BEGIN
  IF I_Status = IORES(EOF) THEN
    Input_proc := TRUE
  ELSE
    BEGIN
    I_Status := GET (In_file);
    Ary.F_Arry := In_file@;
    Input_proc := FALSE;
    END;
  END;
BEGIN
  :
(* set file attributes for input and output file *)
  :
If_string := 'RSZ = 35, KEY = 10/3/A/S.';
OPEN (In_file);
OPEN (Out_file, NEW);
SortProcs (If_string, T_table, Dummy_comp_proc, Output_proc,
           Input_proc);
  :
END.

MergeFiles Example Program

PROGRAM P;
  TYPE
    Info_string_type   = PACKED ARRAY [1..500] OF CHAR;
    File_Buff          = PACKED ARRAY [1..35] OF CHAR;
    Array_type         = RECORD
      CASE BOOLEAN OF
        TRUE : (F_Arry : File_Buff);
        FALSE: (R_Arry : PACKED ARRAY [1..6] OF REAL);
      END;
    Trans_table_type   = ARRAY [0..64] OF REAL;
    File_Type          = SYSTEMFILE (CHAR);
LIBRARY Gensupport (LIBACCESS = BYFUNCTION,
                    FUNCTIONNAME = 'GENERALSUPPORT');
PROCEDURE MergeFiles (VAR Info_string : Info_String_type;
                      Num_files       : INTEGER;
                      VAR Trans_table : Trans_table_type;
                      FUNCTION Compare_proc
                              (VAR Ary1 : Array_type;
                               VAR Ary2 : Array_type):
                                                   BOOLEAN;
                      VAR Output_file : File_type;
                      VAR In_file1    : File_type;
                      VAR In_file2    : File_type;
                      VAR In_file3    : File_type;
                      VAR In_file4    : File_type;
                      VAR In_file5    : File_type;
                      VAR In_file6    : File_type;
                      VAR In_file7    : File_type;
                      VAR In_file8    : File_type);
           Gensupport;
  VAR
    If_string   : Info_string_type;
    T_table     : Trans_table_type;
    In_file1    : File_type;
    In_file2    : File_type;
    Out_file    : File_type;
    Dummy_file  : File_type;
 
 FUNCTION Comp_proc (VAR Ary1: Array_type;
                       VAR Ary2: Array_type): BOOLEAN;
     VAR
       S1 : STRING(35);
       S2 : STRING(35);
     BEGIN
     S1 := Ary1.F_arry;
     S2 := Ary2.F_arry;
     IF S1 <= S2 THEN
        Comp_proc:= TRUE
     ELSE
        Comp_proc:= FALSE;
     END;   
BEGIN
     : (* set file attributes for in_file1, in_file2, and out_file *)
     :
   If_string := 'RSZ = 35, COMP.';
   MergeFiles (If_string, 2, T_table, Comp_proc, Out_file,
              In_file1, In_file2,
              Dummy_file, Dummy_file, Dummy_file,
              Dummy_file, Dummy_file, Dummy_file);
     :
   END.