Updating columns in a table without erasing data in other columns in the same row

From PegaWiki
This is the approved revision of this page, as well as being the most recent.
Jump to navigation Jump to search

Updating columns in a table without erasing data in other columns in the same row

Description Using the merge shape in Data flow to update table columns without affecting existing content
Version as of 8.5
Application Pega Customer Decision Hub
Capability/Industry Area Data Management

Business use case

Prior to deploying Pega Customer Decision Hub, a client used to load data specific to each campaign in a separate table.  Since that client ran multiple campaigns, there were well over 100 ad-hoc tables in the client’s legacy system.  Since the star schema is not suitable for so many ad-hoc tables, the solution was to create one table called adhocfield table. Data is inserted into the adhocfield table and columns are dynamically added to the table based on campaign needs.

For example, if the Campaign 1 requires ten ad-hoc data fields, ten columns will be added to the table.  The file will be uploaded to Pega Customer Decision Hub, and Pega Customer Decision Hub will:

1) Create ten properties

2) Expose ten columns in an ad-hoc table

3) Insert data into those populated ten columns

When data has been imported, it can be used for Campaign 1.

When Campaign 2 needs to be fielded, an additional set of 5-10 ad-hoc columns may be created and data inserted into those columns, if necessary.

Note: It is necessary to limit the total number of columns in adhocfield table to under 500 to minimize the impact on database performance.  For this purpose, the columns that are not required may be deleted periodically.

Technical context

Data flow can be configured using out-of-the-box database table dataset save functionality. This functionality provides you with two options:

1) Only insert new records

2) Insert new and overwrite existing records

Data flow configuration in Pega Customer Decision Hub showing the options of inserting records.
Out of the Box database table dataset save functionality in Pega Customer Decision Hub data flow.

When the Insert new and overwrite existing records option is selected, columns 1-5 will be overwritten to NULL when users try to insert values for columns 6-10. In other words, this option overwrites the entire row instead of updating the row, as shown in the illustration below:

Pega Customer Decision Hub data flow showing how to create a workaround to overwrite existing data in a data table.
Overwriting existing data in adhocfield table

Key question

How can you update the columns 6 to 10 without wiping out the data in columns 1 to 5, when using the option to Insert new with overwrite existing record?

Proposed workaround solution:

Receiving and updating partial payload (this is, updating partial columns of any single row of database table) is not a recommended option in OOTB[SM1] [SZ2] [SZ3] [SZ4] [SM5] [SM6] [SM7] [SZ8] [SZ9] . Merge Shape [SM10]  is an out-of-the-box (OOTB) utility in the data flow rule which can merge two records into one and [SZ11] can help us in this situation. The Merge Shape utility will result in read on a given record before executing an update on the same record via destination dataset option as shown in the image below:

Data flow showing the usage of merge shape configuration for the adhocfield table.
Merge Shape Configuration in a data flow

Merge shape configuration settings as shown in the data flow:

Data flow window showing the correct settings for the merge shape workaround for adhocfield table.
Merge Shape Configuration