Purpose of SCAN
- For every input record, SCAN generates an output record that consists of a running cumulative summary for the group to which the input record belongs, up to and including the current record.
- SCAN is similar to ROLLUP. The difference is that SCAN produces one output record for each input record, while ROLLUP produces one output record for each key group.
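The difference between the two can be sketched outside Ab Initio. The following Python is only an analogy, not DML: the record layout, field names and function names are made up for illustration, and the input is assumed to be already grouped by customer.

```python
from itertools import accumulate, groupby
from operator import itemgetter

# Hypothetical input records, already grouped by "customer".
records = [
    {"customer": "A", "amount": 10},
    {"customer": "A", "amount": 5},
    {"customer": "B", "amount": 7},
    {"customer": "B", "amount": 3},
    {"customer": "B", "amount": 1},
]


def scan_totals(rows):
    """SCAN-like: one output per input record, carrying the running total."""
    out = []
    for key, group in groupby(rows, key=itemgetter("customer")):
        running = accumulate(r["amount"] for r in group)
        out.extend({"customer": key, "running_total": t} for t in running)
    return out


def rollup_totals(rows):
    """ROLLUP-like: one output per key group, carrying the final total."""
    return [
        {"customer": key, "total": sum(r["amount"] for r in group)}
        for key, group in groupby(rows, key=itemgetter("customer"))
    ]


for row in scan_totals(records):
    print(row)
for row in rollup_totals(records):
    print(row)
```

Five input records give five SCAN-style outputs but only two ROLLUP-style outputs, one per customer.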
Two modes to use SCAN
Like ROLLUP, SCAN can be used in either template mode or expanded mode:
- Template mode — You define a simple scan function that typically includes aggregation functions.
- Expanded mode — You create a transform using an expanded scan package. This mode allows for scans that do not necessarily use regular aggregation functions.
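To give a feel for a scan that is not a regular aggregation, here is a rough Python analogy of the expanded style. It is not Ab Initio DML. The function names (initialize, scan, finalize), their signatures and the record layout are assumptions chosen for this sketch. It counts how many consecutive price rises precede each record within a group.

```python
from itertools import groupby
from operator import itemgetter


def initialize(first_rec):
    # Temporary state for a new group (hypothetical layout).
    return {"prev": None, "streak": 0}


def scan(temp, rec):
    # Update the state with the current record: count consecutive rises.
    rising = temp["prev"] is not None and rec["price"] > temp["prev"]
    return {"prev": rec["price"], "streak": (temp["streak"] + 1) if rising else 0}


def finalize(temp, rec):
    # Build the output record from the state and the current record.
    return {"symbol": rec["symbol"], "price": rec["price"], "rising_streak": temp["streak"]}


def expanded_scan(rows, key):
    """Drive the three functions over input that is already grouped by `key`."""
    out = []
    for _, group in groupby(rows, key=itemgetter(key)):
        temp = None
        for rec in group:
            if temp is None:
                temp = initialize(rec)
            temp = scan(temp, rec)
            out.append(finalize(temp, rec))
    return out


# Hypothetical input, grouped by "symbol".
ticks = [
    {"symbol": "X", "price": 10},
    {"symbol": "X", "price": 12},
    {"symbol": "X", "price": 13},
    {"symbol": "X", "price": 11},
    {"symbol": "Y", "price": 5},
    {"symbol": "Y", "price": 6},
]

for row in expanded_scan(ticks, "symbol"):
    print(row)
```

The state is reset at every group boundary, and each input record still yields exactly one output record.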
Parameters for SCAN (not all parameters are covered)
sorted-input(boolean, required)
- This parameter specifies whether the component accepts unsorted (or ungrouped) input.
- If you want to process ungrouped input/data, set this parameter to False.
- Default is True.
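Why ungrouped input needs different handling can be sketched in Python (an analogy only, with hypothetical names). When records of one group are not adjacent, a running state has to be kept for every key seen so far, not just for the current group. Emitting the output in input order is an assumption of this sketch, not a documented guarantee of the component.

```python
def scan_ungrouped(rows, key, field):
    """Running total per key for input that is NOT grouped by key."""
    totals = {}  # one running state per distinct key, all held in memory
    out = []
    for rec in rows:
        k = rec[key]
        totals[k] = totals.get(k, 0) + rec[field]
        out.append({key: k, "running_total": totals[k]})
    return out


# Hypothetical input where groups A and B are interleaved.
mixed = [
    {"customer": "A", "amount": 10},
    {"customer": "B", "amount": 7},
    {"customer": "A", "amount": 5},
    {"customer": "B", "amount": 3},
]

for row in scan_ungrouped(mixed, "customer", "amount"):
    print(row)
```

The dictionary grows with the number of distinct keys, so memory use depends on the data and not only on the record size.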
key-method(choice, optional)
- This parameter defines the method by which the component determines the boundary between one group of records and the next. The choices are as follows:
- Use key specifier — The component uses one or more of the fields in the input record as the grouping key.
- Use key_change function — Instead of using fields from the input record to group the input, the component uses the key_change transform function to determine when a new group begins.
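The two choices can be compared with a small Python sketch (an analogy, not DML). Here key_change is modelled as a function of the previous and the current record that returns true when a new group starts. That argument order, the helper names and the sample fields are assumptions made for illustration.

```python
def split_groups(rows, key_change):
    """Start a new group whenever key_change(previous, current) is true."""
    groups = []
    prev = None
    for rec in rows:
        if prev is None or key_change(prev, rec):
            groups.append([])
        groups[-1].append(rec)
        prev = rec
    return groups


def by_key_fields(*fields):
    # Key-specifier style: a new group starts when any key field changes.
    return lambda prev, cur: any(prev[f] != cur[f] for f in fields)


def gap_over_30(prev, cur):
    # key_change style: a new group starts after a gap of more than 30 units,
    # a boundary that plain key fields cannot express.
    return cur["ts"] - prev["ts"] > 30


# Hypothetical input, sorted by timestamp.
events = [
    {"user": "u1", "ts": 0},
    {"user": "u1", "ts": 10},
    {"user": "u1", "ts": 100},
    {"user": "u2", "ts": 105},
]

print([len(g) for g in split_groups(events, by_key_fields("user"))])
print([len(g) for g in split_groups(events, gap_over_30)])
```

The same input is split into different groups depending on which boundary rule is used.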
key(key specifier, required when key-method is Use key specifier)
- This parameter consists of the names of the key fields that the component uses to group records.
transform(filename or string, required)
- This parameter is either the name of the file containing the types and transform functions, or a transform string.
max-core (integer, required)
- This parameter defines the maximum memory usage in bytes.
- This parameter is available only when the sorted-input parameter is set to False.
- If the total size of the intermediate results that the component holds in memory exceeds the number of bytes specified in the max-core parameter, the component writes temporary files to disk.
- Default is 67108864 bytes (64 MB).