Ab Initio Component | LEADING RECORDS

Purpose of LEADING RECORDS


  • LEADING RECORDS copies a specified number of records from its in to its out port, counting from the first record in the input flow.


Parameters for LEADING RECORDS


num_records (integer, required)


  • This parameter specifies the number of records to copy from the in port to the out port. A value of -1 passes every record that arrives on the component’s in port through to its out port.


early-close (boolean, required)


  • When this parameter is set to True, LEADING RECORDS closes its output immediately after reaching the number of records specified in num_records, which speeds up downstream processing. In a non-continuous graph, always set early-close to True.


    Default is False. Use the default (False) setting only for continuous graphs.
    

Runtime behavior of LEADING RECORDS

  • When this component is connected to an INPUT FILE component, LEADING RECORDS has a useful optimization: it stops reading the input file as soon as it reaches the specified number of records. To get this optimization, make sure the INPUT FILE and the LEADING RECORDS use the same layout.

  • When this component is connected to an INPUT TABLE component, this optimization does not apply. With an INPUT TABLE, it is most efficient to select the desired records using a SELECT statement.

  • LEADING RECORDS does not have a port to which to send unused records.

  • LEADING RECORDS supports implicit reformat. 
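
The behavior described above can be illustrated with a short Python sketch. This is not Ab Initio code; it only mimics the semantics of num_records (including the -1 case) and the early stop on the input side. The names leading_records and numbered_source are hypothetical and exist only for this example.

```python
from itertools import islice

def leading_records(records, num_records):
    """Yield the first num_records items of records; -1 passes everything through."""
    if num_records == -1:
        yield from records
    else:
        # islice stops pulling from the source once num_records items were taken
        yield from islice(records, num_records)

def numbered_source(consumed):
    """Hypothetical input flow that notes every record actually read."""
    for i in range(1, 1001):
        consumed.append(i)
        yield i

consumed = []
first_three = list(leading_records(numbered_source(consumed), 3))
print(first_three)    # [1, 2, 3]
print(len(consumed))  # 3 (the remaining 997 records were never read)
```

The second print mirrors the INPUT FILE optimization: reading stops as soon as the requested number of records has been copied.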

Ab Initio Component | Filter by Expression: Part 2

 ...continue from part 1....


ramp (real, required)

  • This parameter defines the rate at which reject events are tolerated relative to the number of records processed.

  •  When the reject-threshold parameter is set to Use limit/ramp, the component uses the values of the ramp and limit parameters in a formula to determine the component’s tolerance for reject events.

    Default is 0.0.
 

logging (boolean, optional)

  • This parameter specifies whether the component logs certain events.
 

log_input  (choice, optional)

  • This parameter specifies how often the component sends an input record to its log port. The logging parameter must be set to True for this parameter to be available.

  • For example, if you select 100, the component sends every 100th input record to its log port.
 

log_output   (choice, optional)

  • This parameter specifies how often the component sends an output record to its log port. The logging parameter must be set to True for this parameter to be available.

  • For example, if you select 100, the component sends every 100th output record to its log port.

log_reject   (choice, optional)

  • This parameter specifies how often the component sends a reject record to its log port. The logging parameter must be set to True for this parameter to be available.

 

Runtime behavior of FILTER BY EXPRESSION


FILTER BY EXPRESSION performs the following operations:

    1. It reads data records from the in port.

  • If the use_package parameter is false, it applies the expression in the select_expr parameter to each record and routes the record according to how the expression evaluates:

      • For a non-0 value, FILTER BY EXPRESSION writes the record to the out port.

      • For 0, FILTER BY EXPRESSION writes the record to the deselect port. If you do not connect a flow to the deselect port, FILTER BY EXPRESSION discards the record.

      • For NULL, FILTER BY EXPRESSION writes the record to the reject port and a descriptive error message to the error port.

  • If the use_package parameter is true, it executes the functions defined in the package:

      • If the select function returns 1, the component writes the record to the out port.

      • If the select function returns 0, the component writes the record to the deselect port.


    2. If output_for_error or make_error is defined, it executes them whenever an error event occurs. If log_error is defined and logging of rejects is turned on, it executes log_error.

    3. FILTER BY EXPRESSION stops execution of the graph according to the reject-threshold parameter. If its value is use limit/ramp, the graph stops when the number of reject events exceeds the result of the following formula:

        limit + (ramp * number_of_records_processed_so_far)
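
The routing rules and the limit/ramp check above can be sketched in Python as follows. This is an illustrative model only, not Ab Initio code: the function filter_by_expression, the use of None to stand in for NULL, and the choice to count the current record in number_of_records_processed_so_far are all assumptions made for the example.

```python
def filter_by_expression(records, select_expr, limit=0, ramp=0.0):
    """Route each record to out, deselect, or reject, as in the steps above."""
    out, deselect, reject = [], [], []
    for processed, rec in enumerate(records, start=1):
        result = select_expr(rec)
        if result is None:
            # NULL result: a reject event
            reject.append(rec)
            # Assumption: the current record counts as already processed
            if len(reject) > limit + ramp * processed:
                raise RuntimeError("reject threshold exceeded")
        elif result != 0:
            out.append(rec)        # non-0 result goes to the out port
        else:
            deselect.append(rec)   # 0 result goes to the deselect port
    return out, deselect, reject

def over_six(value):
    """Hypothetical selection expression: NULL in, NULL out; otherwise 1 or 0."""
    return None if value is None else int(value > 6)

out, deselect, reject = filter_by_expression(
    [5, None, 0, 12, None, 7], over_six, limit=1, ramp=0.2)
print(out)       # [12, 7]
print(deselect)  # [5, 0]
print(reject)    # [None, None]
```

With limit=1 and ramp=0.2, the second reject arrives at record 5, where the tolerance is 1 + 0.2 * 5 = 2.0, so two rejects are still allowed. With the defaults (limit 0, ramp 0.0), the first reject already exceeds the tolerance and the sketch raises an error.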

Ab Initio Component | Filter by Expression: Part 1

 Purpose of FILTER BY EXPRESSION

  • FILTER BY EXPRESSION filters records according to a DML expression or transform function that specifies the selection criteria.
  • FILTER BY EXPRESSION can also be used to create a subset, or sample, of the data. For example, you can configure FILTER BY EXPRESSION to select a certain percentage of records, or to select every third (or fourth, or fifth, and so on) record.
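
As a rough illustration of the sampling idea, the Python sketch below builds two selection predicates: one that picks every n-th record and one that picks roughly a given percentage. It is not DML and does not show how such an expression is written in Ab Initio; the helper names make_every_nth and make_percentage are hypothetical.

```python
import random

def make_every_nth(n):
    """Build a predicate that returns 1 for every n-th record it sees, else 0."""
    seen = 0
    def select(_record):
        nonlocal seen
        seen += 1
        return 1 if seen % n == 0 else 0
    return select

def make_percentage(percent, seed=0):
    """Build a predicate that returns 1 for roughly `percent` percent of records."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    def select(_record):
        return 1 if rng.random() * 100 < percent else 0
    return select

records = list(range(1, 11))

pick_third = make_every_nth(3)
every_third = [r for r in records if pick_third(r)]
print(every_third)  # [3, 6, 9]

pick_some = make_percentage(30)
sampled = [r for r in records if pick_some(r)]
print(sampled)      # a subset of records; the exact members depend on the seed
```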
 

Parameters for FILTER BY EXPRESSION


select_expr   (expression, required when use_package is false)
 
  • This parameter filters records according to the DML expression you specify.

use_package  (boolean, optional)

  • This parameter controls whether the component uses the select_expr parameter or the package to specify the filter criteria. When the value is true, it uses the package.
  • When false (the default), you may still use the package to customize the component’s handling of error and log information.

package   (filename or embedded string, optional)

  • This parameter specifies a package that can include a select function (required when use_package is true). It also allows you to customize the component’s handling of error and log information.
 
error_group  (string, optional)

  • This parameter defines the name of the error group to which this component belongs. The component sends its error output to the HANDLE ERRORS component with a matching error_group value.
    
log_group  (string, optional)

  • This parameter defines the name of the log group to which this component belongs. The component sends its log output to the HANDLE LOGS component with a matching log_group value.

reject-threshold  (choice, required)

  • This parameter specifies the component’s tolerance for reject events.

limit  (integer, required)

  • This parameter specifies an integer number of reject events that the component tolerates.

  • When the reject-threshold parameter is set to Use limit/ramp, the component uses the values of the ramp and limit parameters in a formula to determine the component’s tolerance for reject events.

    Default is 0.

 



Ab Initio Component | SCAN: Part 3

 ....Continue from Part 2.....

Runtime behavior of SCAN

 

SCAN performs the following operations for each group of records:

 

1. Performing Input selection:

  • If you have defined the input_select function, SCAN filters the input records accordingly.

  • If you have not defined the input_select function in your transform, SCAN processes all records.

2. Performing Key change (for sorted input only):

  • For every record except the first, SCAN checks whether a key change has occurred:

  • SCAN compares the current record’s key value to the previous record’s key value, unless the key_change function is defined.

  • If the key_change function is defined, SCAN calls that function to check for a key change.

3. Performing Temporary initialization:

  • SCAN passes the first record in each group to the initialize transform function.

4. Performing Computation:

  • SCAN calls the scan transform function for each record in a group, including the first, using the input record and the temporary record for the group to which the input record belongs. The scan transform function returns a new temporary record.

5. Finalizing the output:

  • SCAN calls the finalize transform function once for every input record. SCAN passes the input record and the temporary record that the scan function returned to the finalize transform function. The finalize transform function produces an output record for each input record.

  • SCAN stops execution of the graph when the number of reject events exceeds the result of the following formula:

           limit + (ramp * number_of_records_processed_so_far)

6. Output selection:

  • If you have defined the output_select transform function, SCAN filters the output records.
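
The sequence of steps above can be modeled in Python for the sorted-input case. This is a simplified sketch, not Ab Initio code: the function scan_sorted and its argument names are hypothetical, the key change check is a plain comparison with the previous key, and reject handling is left out.

```python
def scan_sorted(records, key, initialize, scan, finalize,
                input_select=None, output_select=None):
    """Emit one output per selected input record, carrying a temporary per group."""
    outputs = []
    previous_key = object()  # sentinel that never equals a real key value
    temp = None
    for rec in records:
        # 1. Input selection
        if input_select is not None and not input_select(rec):
            continue
        # 2. Key change check (sorted input)
        current_key = key(rec)
        if current_key != previous_key:
            # 3. Temporary initialization with the first record of the group
            temp = initialize(rec)
            previous_key = current_key
        # 4. Computation: returns the new temporary record
        temp = scan(temp, rec)
        # 5. Finalization: one output record per input record
        out = finalize(temp, rec)
        # 6. Output selection
        if output_select is None or output_select(out):
            outputs.append(out)
    return outputs

rows = [
    {"cust": "A", "amt": 10}, {"cust": "A", "amt": 5},
    {"cust": "B", "amt": 7}, {"cust": "B", "amt": -2}, {"cust": "B", "amt": 4},
]
result = scan_sorted(
    rows,
    key=lambda r: r["cust"],
    initialize=lambda r: {"total": 0},
    scan=lambda t, r: {"total": t["total"] + r["amt"]},
    finalize=lambda t, r: {"cust": r["cust"], "running_total": t["total"]},
    input_select=lambda r: r["amt"] > 0,
)
print([o["running_total"] for o in result])  # [10, 15, 7, 11]
```

The running total restarts at the key change from A to B, and the record with a negative amount is dropped by the input selection before it reaches the scan function.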

Ab Initio Component | SCAN: Part 2

 .....Continue from Part 1.....


maintain-order (boolean, required)

  • This parameter is available only when the sorted-input parameter is set to False.

  • When this parameter is set to True and the input is too large to fit within the memory limit specified by max-core, the component stops the graph rather than reorder the records.

  • When the parameter is set to False (the default), the component stores some of its intermediate results in temporary files on disk. This alters the order of records.

Default is False.

grouped-input (boolean, optional)

  • This parameter is available only when the sorted-input parameter is set to False.

  • Set this parameter to Data is grouped by a major key in order to specify the major-key by which the input is sorted or grouped. In this case, the key parameter becomes the minor key: it is the field (or fields) to be scanned.

  • When you specify a major key, SCAN is more efficient in its use of memory: SCAN clears its in-memory table of intermediate results at the end of each major-key group of input records.

Default is Data is not grouped by a major key.

major-key (key specifier, optional)

  • This parameter is available only when the grouped-input parameter is set to Data is grouped by a major key. It specifies the field or set of fields by which the input data is sorted or grouped.
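
The memory benefit of a major key can be pictured with a small Python sketch. It is only a model of the idea that the in-memory table of intermediate results is cleared at the end of each major-key group; the function running_count_grouped and the running-count computation are hypothetical stand-ins for a real scan transform.

```python
def running_count_grouped(records, major_key, minor_key):
    """Running count per minor key, holding only one major-key group in memory."""
    table = {}                # in-memory intermediate results, keyed by minor key
    current_major = object()  # sentinel that never equals a real key value
    peak_entries = 0
    results = []
    for rec in records:
        major = major_key(rec)
        if major != current_major:
            table.clear()     # a new major-key group starts: drop the old results
            current_major = major
        minor = minor_key(rec)
        table[minor] = table.get(minor, 0) + 1
        peak_entries = max(peak_entries, len(table))
        results.append((major, minor, table[minor]))
    return results, peak_entries

rows = [("east", "a"), ("east", "b"), ("east", "a"), ("west", "c"), ("west", "a")]
results, peak_entries = running_count_grouped(
    rows, major_key=lambda r: r[0], minor_key=lambda r: r[1])
print(results)
# [('east', 'a', 1), ('east', 'b', 1), ('east', 'a', 2), ('west', 'c', 1), ('west', 'a', 1)]
print(peak_entries)  # 2
```

Within each major-key group the minor keys may arrive in any order, yet the table never holds more entries than the largest single group needs.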

check-sort (boolean, optional)

  • This parameter is available only when the sorted-input parameter is set to True and the key-method parameter is set to Use key specifier.

  • This parameter indicates whether the component should fail when it first encounters an input record that is out of sorted order. Setting this parameter to False effectively treats every key change as a change in group.

Default is True.
 

reject-threshold (choice, required)

  • Specifies the component’s tolerance for reject event