Ab Initio | GDE: Input File

Purpose of Input File:

INPUT FILE represents records read as input to a graph from one or more serial files or from a multifile.
INPUT FILE can also read files from the Hadoop file system (HDFS), Amazon S3, and Google Cloud Storage.
INPUT FILE is not a phased component.
INPUT FILE can be used only in batch graphs; it cannot be used in continuous flow graphs.

The path to a file reusable dataset
The physical location for a data file
If appropriate, an alternative means to associate a specified data file with an EME dataset in the EME Technical Repository

Specifies the use and location of a file reusable dataset that is preconfigured to access a particular set of data. Using this option configures the component as a dataset-linked component. For more information, see “Reusable datasets” in the Co>Operating System Graph Developer’s Guide.
Reuse an existing dataset.

Specifies the data location as:

Opens a window where you can see the following information about the file that corresponds to the specified data location:

Below are the options available for file handling in the Access tab:

If the file does not exist (Create file): Creates the output or intermediate file before writing to it.

By default, this option is selected.
If the file does not exist (Fail): Forces the graph to fail if the file does not exist.
If the file exists (Delete and recreate file): Deletes the output or intermediate file and creates a new one before writing to it. By default, this option is selected.
If the file exists (Append to file): Writes output to the end of the intermediate or output file.
If the file exists (Fail): Forces the graph to fail if the file exists.
Upon job failure, roll the file back to the last checkpoint: Rolls the file back and discards output if the job fails in the phase writing the file, or fails in a subsequent phase before the next checkpoint.

By default, this option is selected.
Delete file after the last phase that reads it completes: Removes the input or intermediate file after the last phase that reads it has finished running.

By default, this option is not selected.
Write file only when phase completes: Instead of writing the data file incrementally, writes the file only after the phase has run to completion. This ensures that a separate process watching for the file while the graph is running does not pick up a partially written file.

By default, this option is not selected.
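
The reasoning behind "Write file only when phase completes" can be illustrated outside Ab Initio. The following is a minimal Python sketch, not a description of how the Co>Operating System implements the option: it assumes the common pattern of writing to a temporary file and renaming it once all records are written. The helper name `write_when_complete` is hypothetical.

```python
import os
import tempfile


def write_when_complete(path, records):
    """Write records to path so that readers never see a partial file.

    Hypothetical helper: it illustrates the general pattern only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory so that the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            for record in records:
                tmp.write(record + "\n")
        # The target name appears only after every record has been written.
        os.replace(tmp_path, path)
    except BaseException:
        # On failure, discard the partial temporary file and re-raise.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A process polling for the target path either finds no file or finds the complete file. The trade-off is that nothing is visible under the final name until the whole write has finished.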

Sets permissions on input, output, and intermediate files. (The default settings are those assigned at file creation.) The checkboxes follow the standard Unix file permissions: Read (R), Write (W), and Execute (X) for User, Group, and Other.
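
As a rough illustration of how the nine checkboxes correspond to a Unix file mode, here is a minimal Python sketch using the standard `stat` module. The helper `mode_from_checkboxes` and its argument layout are hypothetical and are not part of Ab Initio.

```python
import stat

# Map each (class, permission) checkbox to its standard Unix mode bit.
PERMISSION_BITS = {
    ("user", "r"): stat.S_IRUSR,
    ("user", "w"): stat.S_IWUSR,
    ("user", "x"): stat.S_IXUSR,
    ("group", "r"): stat.S_IRGRP,
    ("group", "w"): stat.S_IWGRP,
    ("group", "x"): stat.S_IXGRP,
    ("other", "r"): stat.S_IROTH,
    ("other", "w"): stat.S_IWOTH,
    ("other", "x"): stat.S_IXOTH,
}


def mode_from_checkboxes(user="", group="", other=""):
    """Return the numeric mode for the checked boxes, e.g. user="rw"."""
    mode = 0
    for who, perms in (("user", user), ("group", group), ("other", other)):
        for perm in perms:
            mode |= PERMISSION_BITS[(who, perm)]
    return mode


# Read and Write for User, Read only for Group and Other.
mode = mode_from_checkboxes(user="rw", group="r", other="r")
print(oct(mode))                           # 0o644
print(stat.filemode(stat.S_IFREG | mode))  # -rw-r--r--
```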

Provides the DML (record format) for the file, which is used to map the data in the file.