Ensuring data integrity for externally sourced data
|Description||Using external data in a Pega application|
|Version as of||8.4|
|Capability/Industry Area||Data Integration|
The Pega Live Data layer (data pages) allows Pega Platform applications to interact with data that is stored in external applications without tightly coupling the business logic to the specifics of the integration. When you use external data in your application, plan for errors in the interface and in the external data itself, and build your application so that these errors can be detected and handled.
Always use the correct rules within your Pega application to interact with external data. Any time you source data from or persist data to a third-party system, use data pages to provide the intermediate layer between your application’s internal data and the external system’s data.
When using externally sourced data within a Pega case, define a data relationship between the case and data object. This will allow you to easily leverage that data throughout the case as well as make it clearer for future users that a relationship exists between the case and data object. This is particularly important if the state of the case depends on successful access or persistence of this data because data relationships enable you to treat the data as part of the case, for example, for exception handling and data validation.
Consider and plan to handle external data errors in the following three scenarios: when you retrieve data from an external system, when you use data from an external system in your case processing, and when you persist data to an external system.
Sourcing data from an external system
Data source errors occur during the execution of an external data source and cause it to return without the requested data, or with incorrect, flawed, or missing data. Examples of data source errors include:
- A source system or database is down and the connection times out.
- A request passed to the external system using the connector is invalid.
- Credentials are invalid or are not authorized to get the data requested.
- An internal system error occurred either during load or on the external system.
Error handling in these situations should be done in the data page, in the response data transform of the data source. The response data transform is executed after the call to the external system has been executed, and maps part or all of the information returned from the data source to the data page. The information returned from the data source will be on a page type parameter named “DataSource”.
Use the data transform pxErrorHandlingTemplate as an example. This data transform has sample actions that you can use for common data source error-handling tasks, including:
- Read page messages on the data page.
- Clear page messages from the data page so it does not stop work processing.
- Throw an invocation error by adding page messages so case processes pick it up (remember to catch and handle it in your case layer).
- Output messages to the system log.
- Send an email to a system administrator or other stakeholder.
You can either copy the logic from the template into your response data transform or save the template to your class and ruleset to create your own error-handling transform that can be reused across sources. All rows start in a disabled state, so remember to right-click and enable actions that you want to use and customize.
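The error-handling tasks listed above can be illustrated with a minimal, language-neutral sketch. It is written in Python rather than as Pega rules, and it does not reproduce pxErrorHandlingTemplate. The `Page` class, the `handle_response` function, and the `mandatory` flag are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field
import logging

logger = logging.getLogger("datasource")


@dataclass
class Page:
    """Hypothetical stand-in for a page: properties plus page messages."""
    properties: dict = field(default_factory=dict)
    messages: list = field(default_factory=list)


def handle_response(data_source: Page, data_page: Page, mandatory: bool) -> Page:
    """Map a data source response onto the data page, handling errors first."""
    if data_source.messages:
        # Read the messages and output them to the log.
        for message in data_source.messages:
            logger.error("Data source error: %s", message)
        if mandatory:
            # Add the messages to the data page so the case layer can pick them up.
            data_page.messages.extend(data_source.messages)
        else:
            # Clear messages so that the error does not stop work processing.
            data_page.messages.clear()
        return data_page
    # No errors: map the returned values to the data page.
    data_page.properties.update(data_source.properties)
    return data_page
```

In this sketch, the `mandatory` flag decides whether an error is surfaced to the case layer or suppressed. In a real application, that decision depends on the business need for the data.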
You can also pull in data from a different data page instead of using the normal DataSource page when it has errors. Examples of this include:
- Referencing another data page to pull data from an alternative data source
- Referencing a separate data page with the same data source to use for retry
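As a rough illustration of the retry and alternative-source options, the following Python sketch retries a primary source and then falls back to a second one. It is not Pega rule syntax. `SourceError`, `load_with_fallback`, and the callables passed to it are hypothetical names that stand in for data pages and their data sources.

```python
from typing import Callable, Optional


class SourceError(Exception):
    """Hypothetical error raised when a data source call fails."""


def load_with_fallback(
    primary: Callable[[], dict],
    alternative: Optional[Callable[[], dict]] = None,
    retries: int = 1,
) -> dict:
    """Try the primary source, retry it, then fall back to an alternative."""
    last_error: Optional[SourceError] = None
    for _ in range(1 + max(0, retries)):
        try:
            return primary()
        except SourceError as error:
            last_error = error  # remember the failure and try again
    if alternative is not None:
        # Pull the data from the alternative source instead.
        return alternative()
    raise SourceError("all sources failed") from last_error
```

Whether to retry first or go straight to the alternative source is a design choice that depends on how likely a transient failure is for the system in question.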
For more information, see Errors in data sources.
Using data from an external system
When you reference a data page within your application, you need to account for the possibility that a data error occurred when the data was sourced. Add the error-handling logic to your case at the point in the process where you are referencing the data. Due to the declarative nature of data pages, you might not be sure when the data page was originally populated.
You should build conditional logic in the case layer to handle external data errors. Generally, this logic should be built into whatever rules reference the data page or auto-populated property, such as:
- Flow action post-processing transform or activity
- Data transform or activity referencing the data
- Decision shape followed by a utility step in a flow
- A defer load activity on a section with a page context of the data page or auto-populated property
Use the following tools to handle the errors in the locations mentioned above:
- Apply the hasMessages when rule to the page that you are trying to use. This when rule checks for the existence of any invocation errors that resulted in page messages (see Sourcing data from an external system above), and any error-handling logic that you include after the when rule will execute only in the case of an invocation error.
- When data is pulled into the case using an auto-populated property, use the function @Default.getMessagesAll() in the context of that page property to get all invocation errors that occurred as text separated by new line characters. You can then write error-handling logic based on what occurred.
Design your error handling logic according to the business need and the error that occurred. Three situations to consider are: when the error is fixable by the user; when the error is not fixable by the user but processing can be allowed to continue anyway (for example, if the data is not mandatory for the process); and when the error is not fixable by the user, and processing must be stopped.
For instance, a data page might have failed to retrieve customer data because an invalid customer ID value was passed to an external service. In this case, the user can correct the problem if you provide a UI that displays the error clearly and allows the user to correct the ID and resubmit. Alternatively, the data on that data page might be informational but not vital to the business process, in which case you can show the error to the user but allow the user or the system to continue processing by clearing any page messages. Finally, if the error should prevent further case processing, ensure that the invocation and error-trapping logic run before the user reaches the screen that needs the data, so that the user is never presented with a screen where they cannot proceed and have no means to correct the error. This can be done, for example, by invoking the data page and handling the error logic as a post-processing step on the previous flow action rather than directly on the screen where the data must be used.
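The three situations can be summarized as a small decision function. This is a Python sketch of the reasoning only, not Pega rule syntax. `Outcome`, `decide`, and the `user_fixable` and `data_mandatory` flags are hypothetical names, and how an application determines those flags is left open.

```python
from enum import Enum


class Outcome(Enum):
    ASK_USER_TO_CORRECT = "show the error and let the user resubmit"
    CONTINUE = "clear page messages and continue processing"
    STOP = "stop processing and route the case for review"


def decide(messages: list, user_fixable: bool, data_mandatory: bool) -> Outcome:
    """Pick a handling strategy for a referenced page with possible errors."""
    if not messages:
        return Outcome.CONTINUE  # no invocation error occurred
    if user_fixable:
        return Outcome.ASK_USER_TO_CORRECT
    if not data_mandatory:
        messages.clear()  # informational data: do not block the case
        return Outcome.CONTINUE
    return Outcome.STOP
```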
For more information, see Understanding invocation errors.
Persisting data to an external system
Savable data pages offer several important tools in handling the persistence of data to external systems:
- List the data pages that will be updated on your flow actions to give you one place to manage multiple external system updates.
- Use a Validate rule on each data object to ensure that the data is complete and valid before attempting to update the external system.
- Define data relationships to reference these data objects within your case. You can choose to store a copy of the data in your case or only access it by reference, but either way you will be able to use the settings within your savable data page to persist the data to the external system.
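The idea of validating a data object before attempting the external update can be sketched as follows. This is an illustrative Python example, not a Pega Validate rule. The field names `CustomerID` and `Email`, the checks themselves, and the `save` callable are assumptions made for the example.

```python
from typing import Callable


def validate_customer(customer: dict) -> list:
    """Return a list of validation errors; an empty list means valid."""
    errors = []
    if not customer.get("CustomerID"):
        errors.append("CustomerID is required")
    if "@" not in customer.get("Email", ""):
        errors.append("Email is not valid")
    return errors


def save_if_valid(customer: dict, save: Callable[[dict], None]) -> list:
    """Call the external save only when the data object passes validation."""
    errors = validate_customer(customer)
    if errors:
        return errors  # nothing was sent to the external system
    save(customer)
    return []
```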
From a case processing perspective, you should treat external data persistence errors in the same way you would internal data errors. If the error is correctable by the user, show them the error and allow them to correct it. If not, use ProblemFlow to stop further case processing and route the case appropriately for further review or retry logic.
When more than one system or more than one persistence step is required, carefully plan your error handling. The error handling in this situation must account for the possibility of an error with any of the individual transactions, the availability of compensating actions (such as rollback APIs) for any transactions that had already completed, and whether the case can proceed and how.
For example, you may have a flow action that collects customer data that must then be persisted to two different external systems. If the first update succeeds but the update to the second system fails, you need to build error-handling logic based on the available options. One simple option is to retry the failed update a limited number of times. If the update still fails after repeated retry attempts, there might be an option to call another service on the previously updated system to delete the newly sent data. Be thoughtful about the ordering of your update operations, based on the available rollback options and the relative likelihood of errors for your various external systems.
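The two-system scenario can be sketched as a bounded retry followed by a compensating action. This Python example is an assumption-laden illustration, not Pega rule syntax. `PersistError`, `persist_to_both`, the three callables, and the `max_attempts` default are all hypothetical, and the sketch assumes that the first system exposes a way to undo its update.

```python
from typing import Callable


class PersistError(Exception):
    """Hypothetical error raised when an external update fails."""


def persist_to_both(
    data: dict,
    save_first: Callable[[dict], None],
    save_second: Callable[[dict], None],
    undo_first: Callable[[dict], None],
    max_attempts: int = 3,
) -> str:
    """Update two systems; compensate on the first if the second keeps failing."""
    # A failure here propagates to the caller: nothing needs rolling back yet.
    save_first(data)
    for _ in range(max_attempts):
        try:
            save_second(data)
            return "saved"
        except PersistError:
            continue  # retry the failed update
    # Retries exhausted: remove the data already sent to the first system.
    undo_first(data)
    return "rolled_back"
```

Placing the update that is more likely to fail, or that has no rollback option, first reduces how often the compensating action is needed.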
For more information, see Savable data pages.