Here is a list of some important DataStage environment variables:
APT_BUFFER_FREE_RUN
This environment variable is available in the DataStage Administrator, under
the Parallel category. It specifies how much of the available in-memory buffer
to consume before the buffer starts to resist accepting new data. This is
expressed as a decimal representing the percentage of Maximum memory buffer
size (for example, 0.5 is 50%). When the amount of data in the buffer is less
than this value, new data is accepted automatically. When the data exceeds it,
the buffer first tries to write some of the data it contains before accepting
more. The default value is 50% of the Maximum memory buffer size. You can set
it to greater than 100%, in which case the buffer continues to store data up to
the indicated multiple of Maximum memory buffer size before writing to disk.
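The arithmetic behind this setting can be sketched in Python. This is only an illustration, not DataStage code: the function name free_run_threshold_bytes is hypothetical, and the defaults are the values quoted in this list (0.5 for APT_BUFFER_FREE_RUN, 3145728 bytes for the Maximum memory buffer size).

```python
def free_run_threshold_bytes(max_memory_bytes=3145728, free_run=0.5):
    """Return the amount of buffered data (in bytes) at which the buffer
    starts to resist accepting new data.

    Hypothetical helper for illustration only. It treats APT_BUFFER_FREE_RUN
    as a fraction of the Maximum memory buffer size, as described above.
    """
    if max_memory_bytes <= 0 or free_run < 0:
        raise ValueError("max_memory_bytes must be > 0 and free_run >= 0")
    return int(max_memory_bytes * free_run)


print(free_run_threshold_bytes())            # 1572864 (50% of 3 MB)
print(free_run_threshold_bytes(free_run=2))  # 6291456 (200% of 3 MB)
```

With the defaults, the buffer accepts data freely up to 1.5 MB; a value of 2 lets it hold up to twice the Maximum memory buffer size before writing to disk.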
APT_BUFFER_MAXIMUM_MEMORY
Sets the default value of Maximum memory buffer size, that is, the maximum
amount of virtual memory, in bytes, used per buffer. The default value is
3145728 (3 MB).
APT_BUFFER_MAXIMUM_TIMEOUT
DataStage buffering is self-tuning, which can theoretically lead to long delays
between retries. This environment variable specifies the maximum wait, in
seconds, before a retry. The default is 1.
APT_BUFFERING_POLICY
This environment variable is available in the DataStage Administrator, under
the Parallel category. It controls the buffering policy for all virtual data
sets in all steps. The variable has the following settings:
AUTOMATIC_BUFFERING (default). Buffer a data set only if necessary to prevent a
data flow deadlock.
FORCE_BUFFERING. Unconditionally buffer all virtual data sets. Note that this
can slow down processing considerably.
NO_BUFFERING. Do not buffer data sets. This setting can cause data flow
deadlock if used inappropriately.
APT_DECIMAL_INTERM_PRECISION
Specifies the default maximum precision value for any decimal intermediate
variables required in calculations. The default value is 38.
APT_DECIMAL_INTERM_SCALE
Specifies the default scale value for any decimal intermediate variables
required in calculations. The default value is 10.
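To make the two defaults concrete, here is a small Python sketch using the conventional meaning of precision (total digits) and scale (digits after the decimal point). The helper fits_decimal is hypothetical and only checks digit counts; it says nothing about how DataStage itself rounds or reports overflow.

```python
from decimal import Decimal


def fits_decimal(value, precision=38, scale=10):
    """Return True if value fits a decimal with the given precision and
    scale without any rounding.

    Hypothetical helper for illustration. With the defaults above
    (precision 38, scale 10) this allows up to 28 digits before the
    decimal point and up to 10 digits after it.
    """
    _sign, digits, exponent = Decimal(value).as_tuple()
    if not isinstance(exponent, int):  # NaN or Infinity
        return False
    fraction_digits = max(0, -exponent)
    integer_digits = max(0, len(digits) + exponent)
    return fraction_digits <= scale and integer_digits <= precision - scale


print(fits_decimal("123.45"))         # True
print(fits_decimal("1.00000000001"))  # False: 11 fractional digits
print(fits_decimal("9" * 29))         # False: 29 integer digits
```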
APT_CONFIG_FILE
Sets the path name of the configuration file. (You may want to include this as
a job parameter, so that you can specify the configuration file at job run
time.)
APT_DISABLE_COMBINATION
Globally disables operator combining. Operator combining is DataStage’s default
behavior, in which two or more (in fact any number of) operators within a step
are combined into one process where possible. You may need to disable combining
to facilitate debugging. Note that disabling combining generates more UNIX
processes, and hence requires more system resources and memory. It also
disables internal optimizations for job efficiency and run times.
APT_EXECUTION_MODE
By default, the execution mode is parallel, with multiple processes. Set this
variable to one of the following values to run an application in sequential
execution mode:
ONE_PROCESS one-process mode
MANY_PROCESS many-process mode
NO_SERIALIZE many-process mode, without serialization
APT_ORCHHOME
Must be set by all DataStage Enterprise Edition users to point to the top-level
directory of the DataStage Enterprise Edition installation.
APT_STARTUP_SCRIPT
As part of running an application, DataStage creates a remote shell on all
DataStage processing nodes on which the job runs. By default, the remote shell
is given the same environment as the shell from which DataStage is invoked.
However, you can write an optional startup shell script to modify the shell
configuration of one or more processing nodes. If a startup script exists,
DataStage runs it on remote shells before running your application.
APT_STARTUP_SCRIPT specifies the script to be run. If it is not defined,
DataStage searches for ./startup.apt, $APT_ORCHHOME/etc/startup.apt and
$APT_ORCHHOME/etc/startup, in that order. APT_NO_STARTUP_SCRIPT disables
running the startup script.
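The lookup order described here can be sketched in Python as follows. This is an illustration of the documented order, not DataStage's actual implementation: the function resolve_startup_script and its cwd parameter are hypothetical, and it assumes that "./" means the current working directory and that APT_NO_STARTUP_SCRIPT wins even when APT_STARTUP_SCRIPT is also set.

```python
import os


def resolve_startup_script(env, cwd="."):
    """Return the startup script path that would be chosen, or None.

    Hypothetical sketch. `env` is a dict of environment variables,
    for example os.environ.
    """
    # Assumption: setting APT_NO_STARTUP_SCRIPT disables every script.
    if "APT_NO_STARTUP_SCRIPT" in env:
        return None

    # An explicitly named script takes priority over the search path.
    explicit = env.get("APT_STARTUP_SCRIPT")
    if explicit:
        return explicit

    # Otherwise search the three default locations, in order.
    candidates = [os.path.join(cwd, "startup.apt")]
    orchhome = env.get("APT_ORCHHOME")
    if orchhome:
        candidates.append(os.path.join(orchhome, "etc", "startup.apt"))
        candidates.append(os.path.join(orchhome, "etc", "startup"))

    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

Calling resolve_startup_script(os.environ) on a processing node would show which script, if any, this search order selects.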
APT_NO_STARTUP_SCRIPT
Prevents DataStage from executing a startup script. By default, this variable
is not set, and DataStage runs the startup script. If this variable is set,
DataStage ignores the startup script. This may be useful when debugging a
startup script. See also APT_STARTUP_SCRIPT.
APT_STARTUP_STATUS
Set this to cause messages to be generated as parallel job startup moves from
phase to phase. This can be useful as a diagnostic if parallel job startup is
failing.
APT_MONITOR_SIZE
This environment variable is available in the DataStage Administrator under the
Parallel branch. It determines the minimum number of records the DataStage Job
Monitor reports. The default is 5000 records.
APT_MONITOR_TIME
This environment variable is available in the DataStage Administrator under the
Parallel branch. It determines the minimum time interval, in seconds, for
generating monitor information at runtime. The default is 5 seconds. This
variable takes precedence over APT_MONITOR_SIZE.
APT_NO_JOBMON
Turns off job monitoring entirely.
APT_PM_NO_SHARED_MEMORY
By default, shared memory is used for local connections. If this variable is
set, named pipes rather than shared memory are used for local connections. If
both APT_PM_NO_NAMED_PIPES and APT_PM_NO_SHARED_MEMORY are set, then TCP sockets
are used for local connections.
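The selection rule for local connections can be summarized in a short Python sketch. The function local_transport and its return strings are hypothetical names used for illustration. The case where only APT_PM_NO_NAMED_PIPES is set is not spelled out above; the sketch assumes shared memory is still used, since that is the default.

```python
def local_transport(env):
    """Return the transport used for local connections.

    Hypothetical sketch. `env` is a dict of environment variables; only
    the presence of each variable is checked, not its value.
    """
    no_shared_memory = "APT_PM_NO_SHARED_MEMORY" in env
    no_named_pipes = "APT_PM_NO_NAMED_PIPES" in env

    if no_shared_memory and no_named_pipes:
        return "tcp_sockets"
    if no_shared_memory:
        return "named_pipes"
    # Assumption: with shared memory still allowed, it remains the choice.
    return "shared_memory"


print(local_transport({}))                                # shared_memory
print(local_transport({"APT_PM_NO_SHARED_MEMORY": "1"}))  # named_pipes
```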
APT_PM_NO_NAMED_PIPES
Specifies that named pipes are not to be used for local connections. Named
pipes will still be used in other areas of DataStage, including subprocs and
setting up the shared memory transport protocol in the process manager.
APT_RECORD_COUNTS
Causes DataStage to print, for each operator Player, the number of records
consumed by getRecord() and produced by putRecord(). Abandoned input records
are not necessarily accounted for. Buffer operators do not print this
information.
APT_NO_PART_INSERTION
DataStage automatically inserts partition components in your application to
optimize the performance of the stages in your job. Set this variable to
prevent this automatic insertion.
APT_NO_SORT_INSERTION
DataStage automatically inserts sort components in your job to optimize the
performance of the operators in your data flow. Set this variable to prevent
this automatic insertion.
APT_SORT_INSERTION_CHECK_ONLY
When sorts are inserted automatically by DataStage and this variable is set,
the sorts only check that the order is correct; they do not actually sort. This
is a better alternative to turning partition and sort insertion off altogether
using APT_NO_PART_INSERTION and APT_NO_SORT_INSERTION.
APT_DUMP_SCORE
Configures DataStage to print a report showing the operators, processes, and
data sets in a running job.
APT_PM_PLAYER_MEMORY
Setting this variable causes each player process to report the process heap
memory allocation in the job log when returning.
APT_PM_PLAYER_TIMING
Setting this variable causes each player process to report its call and return
in the job log. The message with the return is annotated with CPU times for the
player process.
OSH_DUMP
If set, it causes DataStage to put a verbose description of a job in the job
log before attempting to execute it.
OSH_ECHO
If set, it causes DataStage to echo its job specification to the job log after
the shell has expanded all arguments.
OSH_EXPLAIN
If set, it causes DataStage to place a terse description of the job in the job
log before attempting to run it.
OSH_PRINT_SCHEMAS
If set, it causes DataStage to print the record schema of all data sets and the
interface schema of all operators in the job log.
APT_STRING_PADCHAR
Overrides the pad character of 0x0 (ASCII null), used by default when DataStage
extends, or pads, a string field to a fixed length.