Showing posts with label Rollup component. Show all posts

Ab Initio Component | ROLLUP : Part 3

 ...Continued from Part 2...

 

Functions used in expanded mode

  • Expanded mode provides more control over the transform. It lets you edit the expanded package, so you can specify transformations that are not possible with template mode 

  • With an expanded ROLLUP package, you must define the following in it:

  • DML type named temporary_type

  • initialize function that returns a temporary_type record

  • rollup function that takes two input arguments (an input record and a temporary_type record) and returns an updated temporary_type record

  • finalize function that returns an output record
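The four required pieces can be sketched in Python as a rough illustration. This is not Ab Initio DML: the `Temporary` class stands in for temporary_type, and the field names `customer` and `amount` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Temporary:
    """Stand-in for temporary_type: the running state kept per group."""
    total: float
    count: int


def initialize(first_record):
    """Return the starting temporary record for a new group."""
    return Temporary(total=0.0, count=0)


def rollup(temp, record):
    """Fold one input record into the temporary record and return the update."""
    return Temporary(total=temp.total + record["amount"], count=temp.count + 1)


def finalize(temp, record):
    """Build the output record from the temporary record and an input record."""
    return {"customer": record["customer"], "total": temp.total, "count": temp.count}


# One group of records that share the same key (hypothetical data).
group = [
    {"customer": "A", "amount": 10.0},
    {"customer": "A", "amount": 5.0},
]

temp = initialize(group[0])
for rec in group:
    temp = rollup(temp, rec)
result = finalize(temp, group[-1])
print(result)
```

Keeping the running state in a separate temporary record is what lets the output record have a different shape from both the input and the intermediate state.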

     

Runtime behavior of ROLLUP 

ROLLUP performs the following operations for each group of records:

1. Performing Input selection:

  • If you have not defined the input_select function in your transform, ROLLUP processes all records.

  • If you have defined the input_select function, ROLLUP filters the input records accordingly.

2. Performing Key change (for sorted input only):

  • For every record except the first, ROLLUP checks whether a key change has occurred:

  • ROLLUP compares the current record’s key value to the previous record’s key value, unless the key_change function is defined.

  • If the key_change function is defined, ROLLUP calls that function to check for a key change.
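As a rough Python illustration (not Ab Initio DML), the key-change check can be modelled as a function of the previous and current record. The `session_key_change` rule and the fields `user` and `t` are hypothetical, chosen only to show a grouping that a plain key comparison cannot express.

```python
def default_key_change(prev, curr, key_fields=("user",)):
    """Default behaviour: a key change occurs when any key field differs."""
    return any(prev[f] != curr[f] for f in key_fields)


def session_key_change(prev, curr):
    """Hypothetical custom rule: new group on a new user or a time gap above 10."""
    return prev["user"] != curr["user"] or curr["t"] - prev["t"] > 10


def split_groups(records, key_change):
    """Split already-sorted records into groups using a key_change predicate."""
    groups = []
    for rec in records:
        # The first record always opens a group; every later record is
        # compared with the record immediately before it.
        if not groups or key_change(groups[-1][-1], rec):
            groups.append([rec])
        else:
            groups[-1].append(rec)
    return groups


records = [
    {"user": "a", "t": 1},
    {"user": "a", "t": 3},
    {"user": "a", "t": 20},
    {"user": "b", "t": 21},
]

by_key = split_groups(records, default_key_change)
by_session = split_groups(records, session_key_change)
print([len(g) for g in by_key])      # [3, 1]
print([len(g) for g in by_session])  # [2, 1, 1]
```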

3. Temporary initialization:

  • ROLLUP passes the first record in each group to the initialize transform function.

4. Performing Computation:

  • ROLLUP calls the rollup transform function for each input record.
  • The input to the rollup transform function is the input record and the temporary record for the group to which the input record belongs.
  • The rollup transform function returns an updated temporary record for that input group. 

5. Performing Finalization of the output:

With sorted-input set to True:

  • ROLLUP calls the finalize transform function after it processes all the input records in each group.

  • ROLLUP passes the temporary record for the group and the last input record in the group to the finalize transform function.

  • The finalize transform function produces an output record for the group.
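The sorted-input flow described above (initialize on the first record of a group, rollup on every record, finalize with the last record) can be sketched in Python. This is only a model of the described behaviour under the assumption that the input is already sorted by key; `run_sorted_rollup` and the fields `k` and `v` are hypothetical names.

```python
from itertools import groupby


def run_sorted_rollup(records, key, initialize, rollup, finalize):
    """Model of ROLLUP on sorted input: one output record per run of equal keys."""
    outputs = []
    # groupby only merges consecutive records with equal keys, which is why
    # the input must already be sorted (or at least grouped) by key.
    for _, group in groupby(records, key=key):
        temp = None
        last = None
        for i, rec in enumerate(group):
            if i == 0:
                temp = initialize(rec)   # first record in the group
            temp = rollup(temp, rec)     # every record, including the first
            last = rec
        outputs.append(finalize(temp, last))  # temporary + last record
    return outputs


records = [{"k": "A", "v": 1}, {"k": "A", "v": 2}, {"k": "B", "v": 5}]

result = run_sorted_rollup(
    records,
    key=lambda rec: rec["k"],
    initialize=lambda rec: 0,
    rollup=lambda temp, rec: temp + rec["v"],
    finalize=lambda temp, rec: {"k": rec["k"], "sum": temp},
)
print(result)
```

Because each group is finished as soon as the key changes, only one temporary record needs to be held at a time in this model.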

Note:

  • With sorted-input set to False, ROLLUP first processes all the input records and then calls the finalize transform function, passing the temporary record for each group and an arbitrary input record from that group as arguments.

  • ROLLUP repeats this procedure with each group.

  • The finalize transform function then produces an output record for each group.

  • The component stops the execution of the graph when the number of reject events exceeds the result of the following formula:

limit + (ramp * number_of_records_processed_so_far)
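A small worked example of the formula, written in Python. The values chosen for `limit` and `ramp` are illustrative assumptions, not defaults taken from the product.

```python
def exceeds_reject_threshold(rejects, processed, limit, ramp):
    """Return True when the reject count exceeds limit + ramp * processed."""
    return rejects > limit + ramp * processed


# Illustrative settings: tolerate 50 rejects outright, plus 1% of records seen.
limit = 50
ramp = 0.01
processed = 10_000

threshold = limit + ramp * processed  # roughly 150 rejects allowed at this point
print(threshold)
print(exceeds_reject_threshold(100, processed, limit, ramp))  # False
print(exceeds_reject_threshold(200, processed, limit, ramp))  # True
```

With ramp set to 0, the tolerance is a fixed number of rejects; with a non-zero ramp, the tolerance grows as more records are processed.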

6. Output selection:

  • If you have defined the output_select transform function, it filters the output records.
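Input selection (step 1) and output selection (step 6) both act as filters around the aggregation. The Python sketch below models that ordering only; the predicates and the fields `key`, `amount`, and `total` are hypothetical.

```python
def input_select(rec):
    """Hypothetical input filter: keep only records with a positive amount."""
    return rec["amount"] > 0


def output_select(out):
    """Hypothetical output filter: keep only groups totalling at least 100."""
    return out["total"] >= 100


records = [
    {"key": "A", "amount": 60},
    {"key": "A", "amount": 50},
    {"key": "A", "amount": -5},   # dropped by input_select
    {"key": "B", "amount": 30},
    {"key": "B", "amount": 40},
]

# Step 1: filter the input before any aggregation happens.
selected = [rec for rec in records if input_select(rec)]

# Steps 3 to 5, reduced to a per-key sum for brevity.
totals = {}
for rec in selected:
    totals[rec["key"]] = totals.get(rec["key"], 0) + rec["amount"]
outputs = [{"key": k, "total": v} for k, v in totals.items()]

# Step 6: filter the finalized output records.
final = [out for out in outputs if output_select(out)]
print(final)
```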

Ab Initio Component | ROLLUP : Part 2

 ...Continued from Part 1...

max-core(integer, required)

  • This parameter defines the maximum memory usage in bytes.

  • It is available only when the sorted-input parameter is set to False.

  • If the total size of the intermediate results that the component holds in memory exceeds the number of bytes specified in the max-core parameter, the component writes temporary files to disk.

Default is 67108864 (64 MB).
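To see why memory matters only for unsorted input, consider the Python model below. It keeps one temporary record per group in a dictionary until all input has been read. It is a simplified illustration: it does not model spilling to temporary files, and `run_unsorted_rollup` and the fields `k` and `v` are hypothetical names.

```python
def run_unsorted_rollup(records, key, initialize, rollup, finalize):
    """Model of ROLLUP on unsorted input: all group state is held until the end."""
    temps = {}    # one temporary record per group, all in memory at once
    samples = {}  # one input record per group to pass to finalize
    for rec in records:
        k = key(rec)
        if k not in temps:
            temps[k] = initialize(rec)
            samples[k] = rec  # which record is kept is arbitrary; here, the first seen
        temps[k] = rollup(temps[k], rec)
    # Finalization can only start once every input record has been processed.
    return [finalize(temps[k], samples[k]) for k in temps]


records = [{"k": "B", "v": 5}, {"k": "A", "v": 1}, {"k": "A", "v": 2}]

result = run_unsorted_rollup(
    records,
    key=lambda rec: rec["k"],
    initialize=lambda rec: 0,
    rollup=lambda temp, rec: temp + rec["v"],
    finalize=lambda temp, rec: {"k": rec["k"], "sum": temp},
)
print(sorted(result, key=lambda out: out["k"]))

# The default max-core value is 64 MB expressed in bytes.
print(64 * 1024 * 1024)  # 67108864
```

In this model the memory footprint grows with the number of distinct keys, not with the number of input records.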

reject-threshold(choice, required)

  • Specifies the component’s tolerance for reject events, that is, how many rejected records the component tolerates before it aborts.

check-sort(boolean, optional)

  • This parameter is available only when the sorted-input parameter is set to True and the key-method parameter is set to Use key specifier.

 Difference between using unsorted and sorted data

With unsorted data

  • When the input data is not sorted (and the sorted-input parameter is set to False), the function outputs an arbitrary record from each group. This might not be particularly useful.
  • To get the first or last record in the unsorted data, you can use the first or last aggregation function.

With sorted data

  • When the input data is sorted (and the sorted-input parameter is set to True), the function outputs the last record from each group.
  • In this case, the function is equivalent to the following, which uses the last aggregation function 


Ab Initio Component | ROLLUP : Part 1

 Purpose of ROLLUP

  • ROLLUP is used to process groups of input records that have the same key, generating one output record for each group. 

  • Typically, the output record summarizes or aggregates the data in some way; for example, a simple ROLLUP can be used to calculate a sum or average of one or more input fields.

  • ROLLUP can also be used to select certain information from each group; for example, it might output the largest value in a field, or accumulate a vector of values that conform to specific criteria.
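The kind of result described above (one summary record per key, with a sum, an average, and a largest value) looks like this in a minimal Python sketch. It illustrates the output shape only and is not how ROLLUP is written in Ab Initio; the fields `region` and `sales` are hypothetical.

```python
from collections import defaultdict

records = [
    {"region": "east", "sales": 100},
    {"region": "west", "sales": 50},
    {"region": "east", "sales": 300},
]

# Collect the values of each group under its key.
values_by_region = defaultdict(list)
for rec in records:
    values_by_region[rec["region"]].append(rec["sales"])

# Produce exactly one output record per group.
summary = [
    {
        "region": region,
        "sum": sum(values),
        "avg": sum(values) / len(values),
        "max": max(values),
    }
    for region, values in values_by_region.items()
]
print(summary)
```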

Two modes to use ROLLUP

You can use a ROLLUP component in two modes, depending on how you define the transform parameter:

1. Template mode — You define a simple rollup function that may include aggregation functions. Template mode is the most common/simple way to use ROLLUP.

2. Expanded mode — You create a transformation using an expanded rollup package. This mode allows for rollups that do not necessarily use regular aggregation functions.


Parameters for ROLLUP (Not all parameters are covered.)

 sorted-input(boolean, required)

  • This parameter specifies whether the component accepts unsorted (or ungrouped) input.

  • If you want to process ungrouped input, set this parameter to False.

Default is True.

key-method (choice, optional)

  • This parameter determines the method by which the component determines the boundary between one group of records and the next. The choices are as follows:

1. Use key specifier — The component uses one or more of the fields in the input record as the grouping key.

2. Use key_change function — Instead of using fields from the input record to group the input, the component uses the key_change transform function to determine when a new group begins.


Default is Use key specifier.

key(key specifier, required when key-method is Use key specifier)

  • This parameter contains the name(s) of the key field(s) that the component uses to define groups of records.

 transform (filename or string, required)

  • This parameter contains either the name of the file containing the types and transform functions, or a transform string.

output_without_input(choice, optional)

  • This parameter specifies the event that, when received, triggers the component to call the output_without_input function, if no input records have been received since the last such event or since the component started. The choices are as follows:

Never — The function will not be called.

At each computepoint — The function is called at each computepoint event.

At each checkpoint — The function is called at each checkpoint event.

At component shutdown — The function is called when the component shuts down.

Default is Never.