Ab Initio Component | INPUT FILE

Purpose of INPUT FILE:

  • INPUT FILE represents records read as input to a graph from one or more serial files or from a multifile.
  • INPUT FILE can also read files from the Hadoop file system, Amazon S3, and Google Cloud Storage.
  • INPUT FILE is not a phased component.
  • INPUT FILE can be used only in batch graphs; it cannot be used in continuous flow graphs.

 

Parameters of INPUT FILE

 

1. Data Tab:

Use the Data tab to specify the following:

  • The path to a file reusable dataset
  • The physical location for a data file
  • If appropriate, an alternative means to associate a specified data file with an EME dataset in the EME Technical Repository

Reusable dataset

  • Specifies the use and location of a file reusable dataset that is preconfigured to access a particular set of data. Using this option configures the component as a dataset-linked component. For more information, see “Reusable datasets” in the Co>Operating System Graph Developer’s Guide.
    Reuse an existing dataset.

Data location 

    Specifies the data location as:

  • The URL of a serial file or of a multifile in a multifile system
  • The URLs of the individual partitions of an ad hoc multifile

File details

Opens a window that shows the following information about the file at the specified data location:

  • Permissions on the file
  • Owner of the file
  • Size of the file in bytes
  • Date and time the file was last modified
  • Full pathname of the file
  • Any resolution errors
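
For readers who want to check the same facts outside the GDE, the items listed above can be approximated with a few standard-library calls. This is only a minimal sketch, assuming a Unix host and a local file path; the function name `file_details` is hypothetical and is not part of Ab Initio.

```python
import os
import pwd  # Unix only
import stat
from datetime import datetime


def file_details(path):
    """Collect roughly the same facts that the File details window lists."""
    full_path = os.path.abspath(path)
    try:
        info = os.stat(full_path)
    except OSError as exc:
        # Stand-in for "resolution errors": the path could not be resolved.
        return {"path": full_path, "error": str(exc)}

    try:
        owner = pwd.getpwuid(info.st_uid).pw_name
    except KeyError:
        # The uid has no passwd entry; fall back to the numeric id.
        owner = str(info.st_uid)

    return {
        "path": full_path,
        "permissions": stat.filemode(info.st_mode),  # e.g. "-rw-r--r--"
        "owner": owner,
        "size_bytes": info.st_size,
        "modified": datetime.fromtimestamp(info.st_mtime).isoformat(timespec="seconds"),
        "error": None,
    }
```

Calling `file_details` on a path that does not exist returns a dictionary whose `error` entry holds the operating-system message instead of raising an exception.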
 
 

2. Access Tab:

The Access tab provides the following file-handling options:

  1. If the file does not exist: Create file. Creates the output or intermediate file before writing to it.

    By default, this option is selected.
  2. If the file does not exist: Fail. Forces the graph to fail if the file does not exist.
  3. If the file exists: Delete and recreate file. Deletes the output or intermediate file and creates a new one before writing to it.

    By default, this option is selected.
  4. If the file exists: Append to file. Writes output to the end of the intermediate or output file.
  5. If the file exists: Fail. Forces the graph to fail if the file exists.
  6. Upon job failure, roll the file back to the last checkpoint. Rolls the file back and discards output if the job fails in the phase writing the file, or fails in a subsequent phase before the next checkpoint.

    By default, this option is selected.
  7. Delete file after the last phase that reads it completes. Removes the input or intermediate file after the last phase that reads it has finished running.

    By default, this option is not selected.
  8. Write file only when phase completes. Instead of writing the data file incrementally, writes the file when the phase has run to completion. This ensures that a separate process that is looking for the file while the graph is running does not pick up a partially written file.

    By default, this option is not selected.
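
The idea behind "Write file only when phase completes" can be illustrated outside Ab Initio with the common write-then-rename pattern. The sketch below is an analogy only, not a description of how the Co>Operating System implements the option; the function name `write_when_complete` and the `.partial` suffix are hypothetical.

```python
import os
import tempfile


def write_when_complete(final_path, records):
    """Write all records to a temporary file, then publish it in one step."""
    directory = os.path.dirname(os.path.abspath(final_path))
    # The temporary file lives in the same directory so the final rename
    # stays on one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".partial")
    try:
        with os.fdopen(fd, "w") as tmp:
            for record in records:
                tmp.write(record + "\n")
        # Readers watching final_path see either no file (or the old one)
        # or the complete new file, never a half-written one.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Discard the partial output if anything fails before publishing.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A process that polls for `final_path` while the writer is still running finds nothing to pick up until the rename has happened.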

3. File protection

  • Sets permissions on the input, output, and intermediate files. (Default settings are those assigned at file creation.) The checkboxes follow the standard Unix file permissions: Read (R), Write (W), and Execute (X) for User, Group, and Other.
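
To make the checkbox grid concrete, the following sketch shows how a selection of R, W, and X for User, Group, and Other maps onto a numeric Unix mode. It is an illustration of the Unix convention only; the helper `protection_mode` is hypothetical and is not an Ab Initio function.

```python
import stat

# Permission bit for each (class, flag) pair, taken from the stat module.
_BITS = {
    "user": {"r": stat.S_IRUSR, "w": stat.S_IWUSR, "x": stat.S_IXUSR},
    "group": {"r": stat.S_IRGRP, "w": stat.S_IWGRP, "x": stat.S_IXGRP},
    "other": {"r": stat.S_IROTH, "w": stat.S_IWOTH, "x": stat.S_IXOTH},
}


def protection_mode(user="", group="", other=""):
    """Translate checkbox-style selections such as "rw" into a Unix mode."""
    mode = 0
    for who, flags in (("user", user), ("group", group), ("other", other)):
        for flag in flags:
            mode |= _BITS[who][flag]
    return mode


# Read and write for User, read-only for Group and Other.
print(oct(protection_mode(user="rw", group="r", other="r")))  # 0o644
```

The resulting value is what a call such as `os.chmod(path, mode)` would expect on a Unix system.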

4. Ports

  • Used to provide the DML record format of the file, which describes how the data in the file maps to fields.






