Dataset
https://www.ibm.com/support/knowledgecenter/en/SSZJPZ_11.7.0/com.ibm.swg.im.iis.ds.parjob.dev.doc/topics/c_deeref_Structure_of_Data_Sets.html
The evaluation sequence for a Transformer stage
https://www.ibm.com/support/knowledgecenter/en/SSZJPZ_11.7.0/com.ibm.swg.im.iis.ds.parjob.dev.doc/topics/r_deeref_Guide_to_Using_Transformer_Expressions_and_Stage_Variab.html
APT_TRANSFORM_COMPILE_OLD_NULL_HANDLING
https://www.ibm.com/support/knowledgecenter/en/SSZJPZ_11.7.0/com.ibm.swg.im.iis.ds.parjob.adref.doc/topics/APT_TRANSFORM_COMPILE_OLD_NULL_HANDLING.html
Data Masking policies
https://www.ibm.com/support/knowledgecenter/SSZJPZ_11.3.0/com.ibm.swg.im.iis.ds.entpak.opt.doc/topics/r_datamasking_policies_container.html
dsjob
https://www.ibm.com/support/knowledgecenter/en/SSZJPZ_11.7.0/com.ibm.swg.im.iis.ds.cliapi.ref.doc/topics/r_dsvjbref_Commands_for_Controlling_WebSphere_DataStage_Jobs.html
Import-Export Projects
https://www.ibm.com/support/knowledgecenter/SSZJPZ_11.3.0/com.ibm.swg.im.iis.productization.iisinfsv.migrate.doc/topics/a_importing_projects.html
https://www.ibm.com/support/knowledgecenter/en/SSZJPZ_11.7.0/com.ibm.swg.im.iis.ds.deploy.help.doc/topics/importmanproj.html
Array Size, Buffer Size and Record Count Properties
https://www.ibm.com/support/knowledgecenter/en/SSZJPZ_11.7.0/com.ibm.swg.im.iis.conn.oracon.usage.doc/topics/r_array_size_record_count.html
encrypt.sh
https://www.ibm.com/support/knowledgecenter/SSZJPZ_11.3.0/com.ibm.swg.im.iis.found.admin.common.doc/topics/encrypt_running.html
https://www.ibm.com/support/knowledgecenter/SSZJPZ_11.3.0/com.ibm.swg.im.iis.found.admin.common.doc/topics/encrypt_credfile.html
Real Time Processing
https://www.ibm.com/support/knowledgecenter/en/SSZJPZ_11.7.0/com.ibm.swg.im.iis.ds.intro.doc/topics/ds_samples_realtime.html
https://www.ibm.com/support/knowledgecenter/en/SSZJPZ_11.7.0/com.ibm.swg.im.iis.infoservdir.user.doc/topics/c_isd_user_ds_qs_job_topologies.html
File Connector
https://www.ibm.com/support/knowledgecenter/SSZJPZ_11.7.0/com.ibm.swg.im.iis.conn.filecon.usage.doc/topics/fileconn_t_designing_jobs.html
https://www.ibm.com/support/knowledgecenter/SSZJPZ_11.7.0/com.ibm.swg.im.iis.conn.filecon.usage.doc/topics/filecon_t_config_parallel_read.html
Data Rules
https://www.ibm.com/support/knowledgecenter/SSZJPZ_11.3.0/com.ibm.swg.im.iis.ia.dsrules.doc/topics/dr_data_rules_stage.html
Data type conversions
https://www.ibm.com/support/knowledgecenter/en/SSZJPZ_11.7.0/com.ibm.swg.im.iis.ds.parjob.adref.doc/topics/c_deeadvrf_Default_and_Explicit_Type_Conversions.html
Balanced Optimization
https://www.ibm.com/support/knowledgecenter/SSZJPZ_11.3.0/com.ibm.swg.im.iis.ds.parjob.dev.doc/topics/balanceoptimizationworkflow.html
https://www.ibm.com/support/knowledgecenter/en/SSZJPZ_11.7.0/com.ibm.swg.im.iis.ds.parjob.dev.doc/topics/introductiontobalancedoptimization.html
https://www.ibm.com/support/knowledgecenter/SSZJPZ_8.7.0/com.ibm.swg.im.iis.ds.parjob.dev.doc/topics/whatbalanceoptimizationdoestoyourjobs.html
Performance Analysis
https://www.ibm.com/support/knowledgecenter/en/SSZJPZ_11.7.0/com.ibm.swg.im.iis.ds.parjob.adref.doc/topics/c_deeadvrf_Performance_Analysis.html
https://www.ibm.com/support/knowledgecenter/en/SSZJPZ_11.7.0/com.ibm.swg.im.iis.ds.direct.help.doc/topics/performance_analysis_window.html
Preserve Partitioning Flag
https://www.ibm.com/support/knowledgecenter/en/SSZJPZ_11.3.0/com.ibm.swg.im.iis.ds.parjob.dev.doc/topics/preservepartitioningflag.html
Environment Variables
https://www.ibm.com/support/knowledgecenter/SSZJPZ_11.3.0/com.ibm.swg.im.iis.ds.parjob.adref.doc/topics/r_deeadvrf_Environment_Variables_environment.html
Environment Variables
APT_BUFFER_FREE_RUN
This environment variable is available in the DataStage Administrator, under the Parallel category. It specifies how
much of the available in-memory buffer to consume before the buffer resists. This is expressed as a decimal
representing the percentage of Maximum memory buffer size (for example, 0.5 is 50%). When the amount of data in
the buffer is less than this value, new data is accepted automatically. When the data exceeds it, the buffer first tries to
write some of the data it contains before accepting more. The default value is 50% of the Maximum memory buffer
size. You can set it to greater than 100%, in which case the buffer continues to store data up to the indicated multiple
of Maximum memory buffer size before writing to disk.
APT_BUFFER_MAXIMUM_MEMORY
Sets the default value of Maximum memory buffer size, which specifies the maximum amount of virtual memory, in
bytes, used per buffer. The default value is 3145728 (3 MB).
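As a rough illustration of how APT_BUFFER_FREE_RUN and APT_BUFFER_MAXIMUM_MEMORY interact, the sketch below computes the amount of buffered data at which a buffer starts to resist new input. The function name free_run_threshold is hypothetical and not part of DataStage; the default values are the ones quoted above.

```python
DEFAULT_MAX_MEMORY = 3145728  # bytes (3 MB), default of APT_BUFFER_MAXIMUM_MEMORY
DEFAULT_FREE_RUN = 0.5        # default of APT_BUFFER_FREE_RUN (50%)


def free_run_threshold(max_memory=DEFAULT_MAX_MEMORY, free_run=DEFAULT_FREE_RUN):
    """Return the number of buffered bytes above which the buffer resists new data."""
    if max_memory <= 0 or free_run <= 0:
        raise ValueError("both settings must be positive")
    return int(max_memory * free_run)


print(free_run_threshold())              # threshold with both defaults
print(free_run_threshold(free_run=1.5))  # free run set above 100%
```

With the defaults the threshold is half of the 3 MB buffer; a free run of 1.5 moves it to one and a half times the buffer size.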
APT_BUFFER_MAXIMUM_TIMEOUT
DataStage buffering is self-tuning, which can theoretically lead to long delays between retries. This environment
variable specifies the maximum wait, in seconds, before a retry. The default is 1.
APT_BUFFERING_POLICY
This environment variable is available in the DataStage Administrator, under the Parallel category. Controls the
buffering policy for all virtual data sets in all steps. The variable has the following settings:
AUTOMATIC_BUFFERING (default). Buffer a data set only if necessary to prevent a data flow deadlock.
FORCE_BUFFERING. Unconditionally buffer all virtual data sets. Note that this can slow down processing
considerably.
NO_BUFFERING. Do not buffer data sets. This setting can cause data flow deadlock if used inappropriately.
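The three settings can be summarized in a small lookup, for example in a pre-flight check run before a job is launched. This is only a sketch: the helper buffering_policy is hypothetical, and rejecting unknown values is a choice made here, not documented DataStage behavior.

```python
VALID_POLICIES = {
    "AUTOMATIC_BUFFERING": "buffer only where needed to prevent deadlock (default)",
    "FORCE_BUFFERING": "buffer every virtual data set; can slow processing",
    "NO_BUFFERING": "never buffer; risks data flow deadlock",
}


def buffering_policy(env):
    """Return the policy named in env, falling back to the documented default."""
    value = env.get("APT_BUFFERING_POLICY", "AUTOMATIC_BUFFERING")
    if value not in VALID_POLICIES:
        # Assumption of this sketch: treat anything else as a configuration error.
        raise ValueError(f"unknown APT_BUFFERING_POLICY: {value!r}")
    return value


print(buffering_policy({}))
print(buffering_policy({"APT_BUFFERING_POLICY": "FORCE_BUFFERING"}))
```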
APT_DECIMAL_INTERM_PRECISION
Specifies the default maximum precision value for any decimal intermediate variables required in calculations.
Default value is 38.
APT_DECIMAL_INTERM_SCALE
Specifies the default scale value for any decimal intermediate variables required in calculations. Default value is 10.
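To make the terms concrete: precision is the total number of significant digits kept, and scale is the number of digits after the decimal point. The sketch below uses Python's decimal module as an analogy only; the function intermediate is hypothetical, and DataStage's own rounding rules may differ from the ones Python applies here.

```python
from decimal import Decimal, localcontext


def intermediate(numerator, denominator, precision=38, scale=10):
    """Divide with `precision` significant digits, then keep `scale` decimal places."""
    with localcontext() as ctx:
        ctx.prec = precision                    # analogous to APT_DECIMAL_INTERM_PRECISION
        raw = Decimal(numerator) / Decimal(denominator)
        step = Decimal(1).scaleb(-scale)        # 1E-10 for the default scale
        return raw.quantize(step)               # analogous to APT_DECIMAL_INTERM_SCALE


print(intermediate(1, 3))
print(intermediate(2, 3, scale=4))
```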
APT_CONFIG_FILE
Sets the path name of the configuration file. (You may want to include this as a job parameter, so that you can
specify the configuration file at job run time).
APT_DISABLE_COMBINATION
Globally disables operator combining. Operator combining is DataStage’s default behavior, in which two or more
(in fact any number of) operators within a step are combined into one process where possible. You may need to
disable combining to facilitate debugging. Note that disabling combining generates more UNIX processes, and
hence requires more system resources and memory. It also disables internal optimizations for job efficiency and run
times.
APT_EXECUTION_MODE
By default, the execution mode is parallel, with multiple processes. Set this variable to one of the following values
to run an application in sequential execution mode:
ONE_PROCESS one-process mode
MANY_PROCESS many-process mode
NO_SERIALIZE many-process mode, without serialization
APT_ORCHHOME
Must be set by all DataStage Enterprise Edition users to point to the top-level directory of the DataStage Enterprise
Edition installation.
APT_STARTUP_SCRIPT
As part of running an application, DataStage creates a remote shell on all DataStage processing nodes on which the
job runs. By default, the remote shell is given the same environment as the shell from which DataStage is invoked.
However, you can write an optional startup shell script to modify the shell configuration of one or more processing
nodes. If a startup script exists, DataStage runs it on remote shells before running your application.
APT_STARTUP_SCRIPT specifies the script to be run. If it is not defined, DataStage searches ./startup.apt,
$APT_ORCHHOME/etc/startup.apt and $APT_ORCHHOME/etc/startup, in that order.
APT_NO_STARTUP_SCRIPT disables running the startup script.
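The lookup order described above can be sketched as follows. The function resolve_startup_script and its exists parameter are hypothetical helpers for illustration; only the variable names and the three default paths come from the description.

```python
import os


def resolve_startup_script(env, exists=os.path.exists):
    """Return the startup script that would be chosen, or None if none applies."""
    if "APT_NO_STARTUP_SCRIPT" in env:
        return None                      # startup scripts are disabled
    explicit = env.get("APT_STARTUP_SCRIPT")
    if explicit:
        return explicit                  # an explicit setting wins
    candidates = ["./startup.apt"]
    orch_home = env.get("APT_ORCHHOME")
    if orch_home:
        candidates.append(orch_home + "/etc/startup.apt")
        candidates.append(orch_home + "/etc/startup")
    for path in candidates:              # first existing file, in the stated order
        if exists(path):
            return path
    return None


print(resolve_startup_script(dict(os.environ)))
```

Passing exists as a parameter keeps the search order testable without touching the file system.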
APT_NO_STARTUP_SCRIPT
Prevents DataStage from executing a startup script. By default, this variable is not set, and DataStage runs the
startup script. If this variable is set, DataStage ignores the startup script. This may be useful when debugging a
startup script. See also APT_STARTUP_SCRIPT.
APT_STARTUP_STATUS
Set this to cause messages to be generated as parallel job startup moves from phase to phase. This can be useful as a
diagnostic if parallel job startup is failing.
APT_MONITOR_SIZE
This environment variable is available in the DataStage Administrator under the Parallel branch. Determines the
minimum number of records the DataStage Job Monitor reports. The default is 5000 records.
APT_MONITOR_TIME
This environment variable is available in the DataStage Administrator under the Parallel branch. Determines the
minimum time interval in seconds for generating monitor information at runtime. The default is 5 seconds. This
variable takes precedence over APT_MONITOR_SIZE.
APT_NO_JOBMON
Turns off job monitoring entirely.
APT_PM_NO_SHARED_MEMORY
By default, shared memory is used for local connections. If this variable is set, named pipes rather than shared
memory are used for local connections. If both APT_PM_NO_NAMED_PIPES and
APT_PM_NO_SHARED_MEMORY are set, then TCP sockets are used for local connections.
APT_PM_NO_NAMED_PIPES
Specifies not to use named pipes for local connections. Named pipes will still be used in other areas of DataStage,
including subprocs and setting up of the shared memory transport protocol in the process manager.
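The way these two variables combine can be written as a small decision function. This is a sketch: local_transport is a hypothetical name, and the case where only APT_PM_NO_NAMED_PIPES is set is assumed to leave the shared memory default in place.

```python
def local_transport(env):
    """Return the transport used for local connections under the given settings."""
    no_shared_memory = "APT_PM_NO_SHARED_MEMORY" in env
    no_named_pipes = "APT_PM_NO_NAMED_PIPES" in env
    if not no_shared_memory:
        return "shared memory"   # default; assumed unchanged if only pipes are disabled
    if not no_named_pipes:
        return "named pipes"     # shared memory disabled, pipes still allowed
    return "TCP sockets"         # both disabled


print(local_transport({}))
print(local_transport({"APT_PM_NO_SHARED_MEMORY": "1"}))
print(local_transport({"APT_PM_NO_SHARED_MEMORY": "1", "APT_PM_NO_NAMED_PIPES": "1"}))
```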
APT_RECORD_COUNTS
Causes DataStage to print, for each operator Player, the number of records consumed by getRecord() and produced
by putRecord(). Abandoned input records are not necessarily accounted for. Buffer operators do not print this
information.
APT_NO_PART_INSERTION
DataStage automatically inserts partition components in your application to optimize the performance of the stages
in your job. Set this variable to prevent this automatic insertion.
APT_NO_SORT_INSERTION
DataStage automatically inserts sort components in your job to optimize the performance of the operators in your
data flow. Set this variable to prevent this automatic insertion.
APT_SORT_INSERTION_CHECK_ONLY
If this variable is set, the sorts that DataStage inserts automatically only check that the order is correct; they do
not actually sort. This is a better alternative to turning off partition and sort insertion altogether using
APT_NO_PART_INSERTION and APT_NO_SORT_INSERTION.
APT_DUMP_SCORE
Configures DataStage to print a report showing the operators, processes, and data sets in a running job.
APT_PM_PLAYER_MEMORY
Setting this variable causes each player process to report the process heap memory allocation in the job log when
returning.
APT_PM_PLAYER_TIMING
Setting this variable causes each player process to report its call and return in the job log. The message with the
return is annotated with CPU times for the player process.
OSH_DUMP
If set, it causes DataStage to put a verbose description of a job in the job log before attempting to execute it.
OSH_ECHO
If set, it causes DataStage to echo its job specification to the job log after the shell has expanded all arguments.
OSH_EXPLAIN
If set, it causes DataStage to place a terse description of the job in the job log before attempting to run it.
OSH_PRINT_SCHEMAS
If set, it causes DataStage to print the record schema of all data sets and the interface schema of all operators in the job log.
APT_STRING_PADCHAR
Overrides the pad character of 0x0 (ASCII null), used by default when DataStage extends, or pads, a string field to a fixed length.
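The effect of the pad character can be illustrated outside DataStage. In this sketch the function pad_fixed is hypothetical; it pads a string to a fixed length with 0x0 by default and with an override character when one is given.

```python
def pad_fixed(value, length, padchar="\x00"):
    """Pad value on the right to `length` characters using padchar."""
    if len(padchar) != 1:
        raise ValueError("padchar must be a single character")
    return value.ljust(length, padchar)  # strings already long enough are returned unchanged


print(repr(pad_fixed("AB", 5)))       # default pad character, ASCII null
print(repr(pad_fixed("AB", 5, " ")))  # overridden with a space
```

Padding with a space instead of ASCII null is a common reason to override the default, since null bytes are easy to miss when the padded value is displayed or written to a text file.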