DataStage - Complex Flat File Stage

 

The Complex Flat File (CFF) stage is a file stage. You can use the stage to read a file or write to a file, but you cannot use the same stage to do both.

As a source, the CFF stage can have multiple output links and a single reject link. You can read data from one or more complex flat files, including MVS data sets with QSAM and VSAM files. You can also read data from files that contain multiple record types. The source data can contain one or more of the following clauses:

1.       GROUP

2.       REDEFINES

3.       OCCURS

4.       OCCURS DEPENDING ON
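To make these clauses more concrete, the following minimal Python sketch decodes one record whose layout contains an OCCURS DEPENDING ON array. The copybook layout in the comments, the field names, and the ASCII encoding are all hypothetical and chosen only for illustration; the CFF stage does this work internally from the column definitions, and REDEFINES is not shown here.

```python
# Hypothetical layout, for illustration only:
#   01 ORDER-REC.
#      05 CUST-ID     PIC X(4).
#      05 ITEM-COUNT  PIC 9(2).
#      05 ITEMS OCCURS 0 TO 5 TIMES DEPENDING ON ITEM-COUNT.
#         10 ITEM-CODE PIC X(3).

def parse_order(record: bytes) -> dict:
    """Decode one record whose ITEMS array length depends on ITEM-COUNT."""
    text = record.decode("ascii")      # assumption: ASCII data, not EBCDIC
    cust_id = text[0:4]                # CUST-ID, 4 characters
    item_count = int(text[4:6])        # ITEM-COUNT, 2 digits
    expected_len = 6 + 3 * item_count  # fixed part plus 3 bytes per item
    if len(text) < expected_len:
        raise ValueError("record is shorter than ITEM-COUNT implies")
    # Read exactly ITEM-COUNT occurrences of the 3-character ITEM-CODE.
    items = [text[6 + 3 * i : 9 + 3 * i] for i in range(item_count)]
    return {"CUST_ID": cust_id, "ITEM_COUNT": item_count, "ITEMS": items}

order = parse_order(b"C00102ABCXYZ")
```

The point of the sketch is that the record length is not known until the controlling column (ITEM-COUNT here) has been read, which is what distinguishes OCCURS DEPENDING ON from a plain OCCURS with a fixed count.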

 

CFF source stages run in parallel mode when they are used to read multiple files, but you can configure the stage to run sequentially if it is reading only one file with a single reader.

 

As a target, the CFF stage can have a single input link and a single reject link. You can write data to one or more complex flat files. You cannot write to MVS data sets or to files that contain multiple record types.

 

Editing a Complex Flat File stage as a source

 

To edit a CFF stage as a source, you must provide details about the file that the stage will read, create record definitions for the data, define the column metadata, specify record ID constraints, and select output columns.

 

To edit a CFF stage as a source:

1.      Open the CFF stage editor.

2.      On the Stage page, specify information about the stage data:

a.        On the File Options tab, provide details about the file that the stage will read.

b.       On the Record Options tab, describe the format of the data in the file.

c.        If the stage is reading a file that contains multiple record types, on the Records tab, create record definitions for the data.

d.       On the Records tab, create or load column definitions for the data.

e.        If the stage is reading a file that contains multiple record types, on the Records ID tab, define the record ID constraint for each record.

f.        Optional: On the Advanced tab, change the processing settings.

3.     On the Output page, specify how to read data from the source file:

a.        On the Selection tab, select one or more columns for each output link.

b.       Optional: On the Constraint tab, define a constraint to filter the rows on each output link.

c.        Optional: On the Advanced tab, change the buffering settings.

4.     Click OK to save your changes and to close the CFF stage editor.

 

Creating record definitions

 

If you are reading data from a file that contains multiple record types, you must create a separate record definition for each type. COBOL copybooks with multiple record types can be imported as a COBOL file definition (for example, Insurance.cfd). Each record type is stored as a separate DataStage table definition. For example, if Insurance.cfd has three record types (Client, Policy, and Coverage), there are three table definitions, one for each record type.

 

To create record definitions:

1.       Click the Records tab on the Stage page.

2.       Clear the Single record check box.

3.       Right-click the default record definition RECORD_1 and select Rename Current Record.

4.       Type a new name for the default record definition.

5.       Add another record by clicking one of the buttons at the bottom of the records list. Each button offers a different insertion point. A new record is created with the default name of NEWRECORD.

6.       Double-click NEWRECORD to rename it.

7.       Repeat steps 5 and 6 for each new record that you need to create.

8.       Right-click the master record in the list and select Toggle Master Record. Only one master record is permitted.

 

Column definitions

You must define columns to specify what data the CFF stage will read or write.

 

If the stage will read data from a file that contains multiple record types, you must first create record definitions on the Records tab. If the source file contains only one record type, or if the stage will write data to a target file, then the columns belong to the default record called RECORD_1.

 

You can load column definitions from a table in the repository, or you can type column definitions into the columns grid. You can also define columns by dragging a table definition from the Repository window to the CFF stage icon on the Designer canvas.

 

Loading columns

The fastest way to define column metadata is to load columns from a table definition in the repository.

 

To load columns:

1.       Click the Records tab on the Stage page.

2.       Click Load to open the Table Definitions window. This window displays all of the repository objects that are in the current project.

3.       Select a table definition in the repository tree and click OK.

4.       Select the columns to load in the Select Columns From Table window and click OK.

5.       If flattening is an option for any arrays in the column structure, specify how to handle array data in the Complex File Load Option window.
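In general terms, flattening an array means turning each occurrence of an OCCURS column into its own column. The Python sketch below illustrates the idea only. The function name, the input shape, and the _1, _2, ... suffix scheme are assumptions made for this example and may not match the column names that the stage actually generates.

```python
def flatten_columns(columns):
    """Expand (name, occurs) pairs into a flat list of column names.

    columns: list of (name, occurs) tuples; occurs == 1 means a scalar column.
    The numeric suffix scheme is hypothetical, not DataStage's own naming.
    """
    flat = []
    for name, occurs in columns:
        if occurs <= 1:
            flat.append(name)  # scalar column: keep as is
        else:
            # One output column per array element.
            flat.extend(f"{name}_{i}" for i in range(1, occurs + 1))
    return flat

flat_names = flatten_columns([("POLICY_ID", 1), ("RIDER_CODE", 3)])
```

Leaving an array unflattened keeps a single column that holds all of its occurrences, while flattening trades that for a wider but simpler row.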


Typing columns

You can also define column metadata by typing column definitions in the columns grid.

 

To type columns:

1.       Click the Records tab on the Stage page.

2.       In the Level number field of the grid, specify the COBOL level number where the data is defined. If you do not specify a level number, a default value of 05 is used.

3.       In the Column name field, type the name of the column.

4.       In the Native type field, select the native data type.

5.       In the Length field, specify the data precision.

6.       In the Scale field, specify the data scale factor.

7.       Optional: In the Description field, type a description of the column.
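The Length and Scale fields work like precision and scale for a numeric column. As an illustration, the following Python sketch decodes a packed-decimal (COBOL COMP-3) value and applies a scale to it. It assumes the common COMP-3 layout (two digits per byte, sign in the last nibble, with D meaning negative); the function name is hypothetical and this is not the stage's own decoding code.

```python
from decimal import Decimal

def unpack_comp3(raw: bytes, scale: int) -> Decimal:
    """Decode a packed-decimal field and shift it by `scale` decimal places."""
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)    # high nibble is one digit
        digits.append(byte & 0x0F)  # low nibble is the next digit
    digits.append(raw[-1] >> 4)     # last byte: high nibble is a digit
    sign_nibble = raw[-1] & 0x0F    # last byte: low nibble is the sign
    if any(d > 9 for d in digits):
        raise ValueError("invalid digit nibble in packed decimal")
    value = int("".join(str(d) for d in digits))
    if sign_nibble == 0x0D:         # 0xD marks a negative value
        value = -value
    return Decimal(value).scaleb(-scale)

# Three bytes hold five digits; a scale of 2 places the decimal point.
amount = unpack_comp3(bytes([0x12, 0x34, 0x5C]), scale=2)
```

Here the five stored digits correspond to the precision and the scale of 2 says that the last two digits are fractional.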

 

Defining record ID constraints

If you are using the CFF stage to read data from a file that contains multiple record types, you must specify a record ID constraint to identify the format of each record.

 

Columns that are identified in the record ID clause must be in the same physical storage location across records. The constraint must be a simple equality expression, where a column equals a value.

 

To define a record ID constraint:

1.       Click the Records ID tab on the Stage page.

2.       Select a record from the Records list.

3.       Select the record ID column from the Column list. This list displays all columns from the selected record, except the first OCCURS DEPENDING ON (ODO) column and any columns that follow it.

4.       Select the = operator from the Op list.

5.       Type the identifying value for the record ID column in the Value field. Character values must be enclosed in single quotation marks.
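The following Python sketch shows the idea behind a record ID constraint: the same physical bytes are inspected in every record, and a simple equality test decides which record definition applies. The record names follow the Insurance example earlier in this post, but the ID offset, the ID length, and the values 'C', 'P', and 'V' are assumptions made for illustration.

```python
from typing import Optional

# Hypothetical: the record ID is a single character at byte 0 of every record.
ID_OFFSET, ID_LENGTH = 0, 1

# One equality constraint per record type (column = 'value').
RECORD_ID_CONSTRAINTS = {"CLIENT": "C", "POLICY": "P", "COVERAGE": "V"}

def identify_record(record: bytes) -> Optional[str]:
    """Return the record type whose ID constraint matches, or None."""
    rec_id = record[ID_OFFSET:ID_OFFSET + ID_LENGTH].decode("ascii")
    for record_type, value in RECORD_ID_CONSTRAINTS.items():
        if rec_id == value:
            return record_type
    return None  # no constraint matched this record

kinds = [identify_record(r) for r in (b"C0001SMITH", b"P0001AUTO", b"X????")]
```

Because every constraint reads the same offset, the reader can classify a record before it knows anything else about the record's layout, which is why the ID column must sit in the same physical location across record types.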

 

Selecting output columns

By selecting output columns, you specify which columns from the source file the CFF stage should pass to the output links.

 

You can select columns from multiple record types to output from the stage. If you do not select columns to output on each link, the CFF stage automatically propagates all of the stage columns except group columns to each empty output link when you click OK to exit the stage.
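The default behavior described above can be pictured with a short Python sketch. The column dictionaries, the is_group flag, and the function name are hypothetical and exist only to illustrate the rule that an empty link receives every stage column except group columns.

```python
def columns_for_link(stage_columns, selected):
    """Return the columns a link will carry.

    stage_columns: list of dicts with 'name' and 'is_group' keys (hypothetical).
    selected: column names chosen for the link; empty means nothing was chosen.
    """
    if selected:
        return list(selected)  # an explicit selection is used as is
    # Empty link: propagate all stage columns except group columns.
    return [c["name"] for c in stage_columns if not c["is_group"]]

stage_columns = [
    {"name": "CLIENT_ID", "is_group": False},
    {"name": "ADDRESS", "is_group": True},   # group column, skipped by default
    {"name": "CITY", "is_group": False},
]
default_cols = columns_for_link(stage_columns, [])
```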

 

To select output columns:

1.       Click the Selection tab on the Output page.

2.       If you have multiple output links, select the link that you want from the Output name list.

 

Defining output link constraints

By defining a constraint, you can filter the data on each output link from the CFF stage.

 

You can set the output link constraint to match the record ID constraint for each selected output record by clicking Default on the Constraint tab on the Output page. The Default button is available only when the constraint grid is empty.

 

To define an output link constraint:

1.       Click the Constraint tab on the Output page.

2.       In the ( field of the grid, select an opening parenthesis if needed. You can use parentheses to specify the order in evaluating a complex constraint expression.

3.       In the Column field, select a column or job parameter. (Group columns cannot be used in constraint expressions and are not displayed.)

4.       In the Op field, select an operator or a logical function.

5.       In the Column/Value field, select a column or job parameter, or double-click in the cell to type a value. Enclose character values in single quotation marks.

6.       In the ) field, select a closing parenthesis if needed.

7.       If you are building a complex expression, in the Logical field, select AND or OR to continue the expression in the next row.

8.       Click Verify. If errors are found, you must either correct the expression, click Clear All to start over, or cancel. You cannot save an incorrect constraint.
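A constraint built in the grid is essentially a row filter. The Python sketch below expresses one such filter, REC_TYPE = 'P' AND (STATE = 'NY' OR STATE = 'NJ'), as a predicate over rows. The column names and values are hypothetical, and the sketch only mirrors the logic of the expression; it does not reproduce how the stage evaluates constraints.

```python
def passes_constraint(row: dict) -> bool:
    """REC_TYPE = 'P' AND (STATE = 'NY' OR STATE = 'NJ')"""
    # The parentheses make the grouping explicit: the OR is evaluated
    # as a unit before it is combined with the REC_TYPE test.
    return row["REC_TYPE"] == "P" and (row["STATE"] == "NY" or row["STATE"] == "NJ")

rows = [
    {"REC_TYPE": "P", "STATE": "NY"},
    {"REC_TYPE": "P", "STATE": "CA"},
    {"REC_TYPE": "C", "STATE": "NJ"},
]

# Only rows that satisfy the constraint continue down the output link.
selected_rows = [r for r in rows if passes_constraint(r)]
```

Each grid row corresponds to one comparison, the Logical field supplies the AND or OR between rows, and the ( and ) fields supply the grouping.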

 

Editing a Complex Flat File stage as a target

To edit a CFF stage as a target, you must provide details about the file that the stage will write, define the record format of the data, and define the column metadata.

 

To edit a CFF stage as a target:

1.       Open the CFF stage editor.

2.       On the Stage page, specify information about the stage data:

a.        On the File Options tab, provide details about the file that the stage will write.

b.       On the Record Options tab, describe the format of the data in the file.

c.        On the Records tab, create or load column definitions for the data.

d.       Optional: On the Advanced tab, change the processing settings.

3.       Optional: On the Input page, specify how to write data to the target file:

a.        On the Advanced tab, change the buffering settings.

b.       On the Partitioning tab, change the partitioning settings.

4.       Click OK to save your changes and to close the CFF stage editor.

 

Reject links

The CFF stage can have a single reject link, whether you use the stage as a source or a target.

 

For CFF source stages, reject links are supported only if the source file contains a single record type without any OCCURS DEPENDING ON (ODO) columns. For CFF target stages, reject links are supported only if the target file does not contain ODO columns.

 

You cannot change the selection properties of a reject link. The Selection tab for a reject link is blank.

 

You cannot edit the column definitions for a reject link. For writing files, the reject link uses the input link column definitions. For reading files, the reject link uses a single column named "rejected" that contains raw data for the columns that were rejected after reading because they did not match the schema.
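For the reading case, the following Python sketch shows the general pattern of a reject link: records that parse cleanly go to the normal output, and records that do not match the expected layout are passed along unparsed in a single "rejected" column. The 6-byte layout, the field names, and both function names are hypothetical.

```python
def parse_fixed(raw: bytes) -> dict:
    """Hypothetical schema: 4-character ITEM_CODE followed by a 2-digit QTY."""
    if len(raw) != 6:
        raise ValueError("unexpected record length")
    text = raw.decode("ascii")
    return {"ITEM_CODE": text[0:4], "QTY": int(text[4:6])}  # int() fails on non-digits

def read_with_rejects(records, parse):
    """Split records into parsed output rows and raw reject rows."""
    output, rejects = [], []
    for raw in records:
        try:
            output.append(parse(raw))
        except ValueError:
            # Keep the raw bytes so the bad record can be inspected later.
            rejects.append({"rejected": raw})
    return output, rejects

output, rejects = read_with_rejects([b"AB1205", b"AB12XX", b"SHORT"], parse_fixed)
```

Keeping the rejected data raw, rather than partially parsed, avoids guessing at the structure of a record that has already failed to match the schema.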
