.....Continue from part 2.....
maintain-order (boolean, required)
Default is False.
- Set to True to ensure that records remain in the original order of the driving input. (The driving input is the largest input, as specified by the driving parameter.)
- Available only when the sorted-input parameter is set to False. If the sorted-input parameter is set to True and all inputs are sorted on the fields given in the key parameter, the output maintains the sort order on that key without the use of this parameter.
- If any inputs other than the driving input are too large to fit within the memory limit specified by max-core, the behavior of the component depends on the setting of maintain-order:
- False — The component stores some of its intermediate results in temporary files on disk. This alters the order of records in the driving input.
- True — The component stops execution of the graph.
max-core (integer, required)
- Maximum memory usage in bytes. Available only when the sorted-input parameter is set to False.
- If the total size of the non-driving inputs that the component holds in memory exceeds the number of bytes specified in the max-core parameter, the component writes temporary files to disk.
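The interaction between max-core and maintain-order described above can be sketched as a small decision function. This is an illustrative Python model only, not Ab Initio code: the function name plan_unsorted_join, its arguments, and its return strings are hypothetical, and the sizes are assumed to be known in bytes up front.

```python
def plan_unsorted_join(non_driving_bytes, max_core, maintain_order=False):
    """Model how an unsorted JOIN reacts to the size of its non-driving inputs.

    Hypothetical helper for illustration; it mirrors the rules in the text
    and is not part of any Ab Initio API.
    """
    total = sum(non_driving_bytes)
    if total <= max_core:
        # The non-driving inputs fit within max-core and stay in memory.
        return "in-memory"
    if maintain_order:
        # Spilling would reorder the driving input, so the graph stops.
        raise MemoryError(
            f"non-driving inputs need {total} bytes, max-core is {max_core}"
        )
    # maintain-order is False: intermediate results go to temporary files,
    # and the order of the driving input is no longer guaranteed.
    return "spill-to-disk"
```

In this sketch, raising an exception stands in for "stops execution of the graph"; the real component reports the failure through the graph, not through a Python exception.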
Runtime behavior of JOIN
JOIN performs the following operations:
1. Reads data records from multiple inn ports. Depending on the setting of the sorted-input parameter, it does one of the following:
- If input is sorted, it reads records in the order in which they arrive.
- If input is unsorted, it loads all records from all inputs except the driving input into main memory. Once the non-driving inputs are loaded, it reads records from the driving input in the order in which they arrive.
2. Applies the expression in any defined selectn parameter to the records on the corresponding inn port:
- If the select expression evaluates to 0 for a record, JOIN does not process the record, and the record does not appear on any output port.
- If the select expression evaluates to anything other than 0 or NULL for a record, JOIN processes the record.
- If you do not supply an expression for a selectn parameter, JOIN processes all the records on the corresponding inn port.
3. Removes any duplicate records that pass the select expression, if the dedupn parameter is set to True.
4. Operates on records that have matching key values using a multi-input transform function.
If the transform function returns NULL, then JOIN:
- Writes each input record to the corresponding rejectn port, then stops execution of the graph when the number of reject events exceeds the result of the following formula:
limit + (ramp * number_of_records_processed_so_far)
- Writes an error message to the corresponding errorn port.
If no flows are connected to the rejectn or errorn ports, JOIN discards the information.
5. Writes the non-NULL return record from the transform function to the out port.
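The five steps above can be modelled with a short Python sketch of a two-input, unsorted inner join. Everything here is an assumption made for illustration: the names join_unsorted and reject_threshold are hypothetical, records are plain dictionaries, Python truthiness stands in for the select expression, None stands in for NULL, dedup keeps the first record per key, and the processed count is incremented once per transform call.

```python
from collections import defaultdict


def reject_threshold(limit, ramp, processed):
    """Tolerated reject events: limit + (ramp * records processed so far)."""
    return limit + ramp * processed


def join_unsorted(driving, other, key, transform,
                  select_driving=None, select_other=None,
                  dedup_other=False, limit=0, ramp=0.0):
    """Two-input inner join modelled on the steps above (hypothetical helper)."""
    # Steps 1-3: load the non-driving input into memory, applying the
    # select expression and (optionally) dedup while loading.
    lookup = defaultdict(list)
    for rec in other:
        if select_other is not None and not select_other(rec):
            continue  # select evaluated to 0: record is dropped
        if dedup_other and lookup[rec[key]]:
            continue  # assumption: keep only the first record per key
        lookup[rec[key]].append(rec)

    out, rejects, errors = [], [], []
    processed = 0
    reject_events = 0
    # Step 1, continued: read the driving input in arrival order.
    for rec in driving:
        if select_driving is not None and not select_driving(rec):
            continue
        for match in lookup.get(rec[key], []):
            processed += 1
            result = transform(rec, match)  # step 4
            if result is None:
                # NULL result: record a reject event and an error message.
                reject_events += 1
                rejects.append((rec, match))
                errors.append(f"transform returned NULL for key {rec[key]!r}")
                if reject_events > reject_threshold(limit, ramp, processed):
                    raise RuntimeError("reject limit exceeded")
            else:
                out.append(result)  # step 5
    return out, rejects, errors
```

Called with a list of orders as the driving input and a list of customers as the non-driving input, the function returns joined records in the order of the driving input, which is the behavior the unsorted, in-memory case is described as preserving. With limit=0 and ramp=0.0, the first NULL result from the transform stops the run.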