You can perform a bulk data transformation (BDT) either from the original transformation source database or from its clone. It is recommended that you perform a BDT from a cloned database to minimize disruption of the transformation source.
You can perform a BDT from an Enterprise Database Server to a SQL Server, an Oracle database, or a Kafka distributed streaming platform.
If the transformation target is Kafka, then Data Exchange acts as a Kafka Producer. It uses XML Schema Definition (XSD) to provide structure to the Kafka messages. To understand this structure and be able to create Kafka Consumers, you can generate Helper API classes in DDW. For more information on the structure of Kafka messages, refer to Data Exchange Development Workbench Administration and Operations Help.
You must also configure the Producer configurations, as explained later in this section.
Before performing the BDT, ensure the following:
AISMCP is installed and configured on the MCP host.
AIS Configuration Utility is installed on your Windows machine where the runtime components are installed, and a connection is defined in this utility so that the Runtime Windows components can communicate with the MCP host where AISMCP is installed.
An audit file switch is performed at the point of cloning the database, if you are using a clone database.
The Record Sequence Number (RSN) data item is not declared directly after a Group data item in the Data and Structure Definition Language (DASDL). If you do so, the value of the RSN data item is lost while doing the BDT.
The DATAEXCHANGE setting in the Audit Trail Specification of the DASDL or in the Database DEFAULTS or at the data set level. If you set this at the data set level for schemas that are already loaded in the DDW or for existing Transformations, then you must do a DM Control Update on the MCP host, and then reload the schemas.
For an Oracle database as the transformation target, if two source data sets are mapped to the same target table and the target table has a primary key or a unique key, and if the DML command substitution is Default, then the Oracle user must have Insert and Update privileges.
Notes:
It is recommended that the structure format timestamps be retained in the clone database.
It is recommended that the concerned tables in the transformation target are empty before starting a BDT for the first time. This avoids data duplication or transformation failure during the BDT.
Linked data sets, Remapped data sets, and Internal LOB data types are not supported in a BDT.
To start a BDT
Click Manage.
The hosts associated with the deployed transformations appear in the Source Database Hosts pane.
In the Source Database Hosts pane, expand MCP Hosts, and then expand the required host to view the Enterprise Database Server databases hosted on it.
Select the required Enterprise Database Server database from which you want to start the BDT.
Point to Start, and then click Bulk Data Transformation.
The Data Exchange Transformation Session Setup appears.
On the Transformations tab, select the transformation(s) you want to run, and then click Next.
The General Settings tab appears.
On the General Settings tab, enter or change the following information, as appropriate:
The Agent TCP Port box displays the default port number (8299) on which the Agent is listening. You must change the TCP port number only if you are not using the default port number.
In the AIS Connection Name box, type the AIS Connection Name that you had defined using the AIS Configuration Utility.
If you need a secure connection from source database to Runtime Service, enable secure connection in AIS, and then enter the corresponding Connection Name.
In the Login User and Password boxes, type the usercode and password, respectively.
This usercode (if not a dbowner), must have the privileges to access the dmsupport file, control files, and the description file of the source database.
Note: If User is a non-privileged user, you must change the security setting of the dmsupport file, control file, and description file from PRIVATE to PUBLIC.
In the Enterprise Database Server, a usercode might be associated with either an accesscode or a chargecode, or both. An accesscode will have an associated password.
The following table displays the format in which the usercode and the associated accesscode or chargecode need to be entered in the Login User box along with their corresponding passwords. Each entry is separated from the other by a separator (/).
User | Password |
Usercode/Accesscode | Usercode password/Accesscode password |
Usercode/Accesscode/Chargecode | Usercode password/Accesscode password |
Usercode/Chargecode | Usercode password |
The Runtime Service Host box displays the host name or IP address of the Windows partition where the Runtime Service is installed.
You must change the value of the Runtime Service host if both the Runtime Service and the Administrative Service are not running on the same Windows host.
The Runtime Service TCP Port box displays the default TCP port number (9299) of the Runtime Service that was configured during installation.
If you require a secured connection, turn on Secure Connection. This enables you to configure a secure connection from Runtime Service to the target database.
For Oracle as the target, also provide the Secure Connection Port number. The default secure connection port is 2484.
If your target is a Kafka streaming platform, then the Kafka target(s) box displays the list of Kafka targets and Producer configurations.
Configure the following Producer configurations:
Target(s): This displays the name of the Kafka target(s) in the format <transformation>\<schema name>. Each XSD schema loaded in a transformation is considered as a Kafka target. Loading the same schema twice with different name or dragging the same schema to two different transformations (in DDW) will result in two Kafka targets.
Bootstrap Server(s): Enter the bootstrap server name(s) and the port number(s) in the format host1:port1,host2:port2. Only letters, numbers, hyphen(-), underline(_), and period(.) are allowed in the Bootstrap server name.
Topic: Enter the name of the topic that you have created. The records are published to this topic.
Acks: Select the kind of acknowledgement you require:
-1 - Acknowledgements from replicas: The leader will wait for the full set of in-sync replicas to acknowledge the record.
0 - No acknowledgement: The producer will not wait for any acknowledgment from the server.
1 - Acknowledgement from leader: The leader will write the record to its local log and respond without waiting for full acknowledgement from all the followers.
Client ID: This is the ID used to identify the Kafka target. The default client ID is Data Exchange.
Secure Connection: To configure the Data Exchange Runtime service host to support an SSL connection with the Kafka target host, you can enable Secure Connection.
When you turn on Secure Connection, the Secure Connection Config window appears. In this window, browse and select the Certificate Authority file. You can also provide additional Certificate Configuration details such as certificate, key, and key password if the Kafka server requires client authentication.
Notes:
If the key file is not encrypted, you can leave the password blank. If the key file is encrypted, you must pair the file with the password.
For all the files selected in secure connection for Kafka targets, both the certificate and key files must be in PEM format.
Other Producer Config: If you need to provide additional Producer configurations, enter the values here in the format name1:value1, name2:value2. These configurations override the default Producer configurations. For the complete list of Producer configurations refer to Kafka Producer Configurations.
The Target Commit Error Handling Policy lists the errors that could occur while committing the records to the target.
On encountering some of the listed errors, you can only choose to stop the transformation or ignore the error. But there is a possibility to recover from the Deadlock Error by retrying. Hence this error has the option to retry 3 times. This recovery is faster, as the retry is internal to the Runtime Service.
Select the manner in which you want Data Exchange to respond when the specific error occurs.
Click Next.
The Transformation Settings tab appears.
On the Transformation Settings tab, enter or change the following information, as appropriate:
In the Host Name box, type the required MCP host name.
Note: If you provide a private IP address, then the host is considered as the original host; if you provide a public IP address or the host name, then the host is considered as the cloned host.
In the Database Name box, type the transformation source database name in the (USER_ID)DB_NAME ON PACK_NAME or *DB_NAME ON PACK_NAME format.
USER_ID is the user code of the transformation source database.
DB_NAME is the name of the transformation source database.
PACK_NAME is the name of the pack the transformation source database is hosted on.
In the Description File box, type the description file name in the (USER_ID) ON PACK_NAME or * ON PACK_NAME format.
From the Select Record drop-down list, select one of the following values:
With Update Lock: This is the default selection if you are performing a transformation from the original database. In this case, no data updates happen to the source tables during a BDT. This ensures data integrity. You also need not specify the starting point for a CDT.
With No Lock: Select this option if it is critical that the source tables be updated with the current data during the BDT. If you select this option, you must specify the starting point for a CDT. It is possible that data integrity is compromised if you select this option.
Note: The Select Record option is available only if you are using the original transformation source database. If you are using a cloned database, this option is disabled as it is not required to lock the table of the cloned database while performing the transformation.
In the Maximum Degree of Parallelization box, type the maximum number of threads that can run in parallel to perform the BDT. This value depends on the number of CPU cores and the available RAM of the machine where Runtime Service is running.
Note: To avoid performance issues, this value must not exceed the number of CPU cores of the machine where the Runtime Service resides.
Click Next.
The Start Transformations tab appears.
Verify all the details, and then click Start.
If the BDT fails to start, the error is displayed on the Bulk Data Transformation Status box. Click the icon in the Details column for more information on the cause and resolution of the error. The transformation starts after you have resolved all the errors.
Note: If SecuritySupport is configured in MCP, then Data Exchange calls the command LogonCheck while checking the format timestamp before performing a BDT. LogonCheck captures logon attempts for all MCP usercodes. For more information on LogonCheck, refer to the *SYMBOL/SECURITYSUPPORT file.
When the BDT starts, the status of the Bulk Data Transformation on the Summary tab changes to Running. Click Refresh on the Summary tab to view the updated status. Click Monitor on the Summary tab to see the statistical data corresponding to the BDT.