Configuring search for Pega Knowledge

From PegaWiki

Description: Instructions on how to set up the search functionality in Pega Knowledge
Version: as of 8.5
Application: Pega Customer Service
Capability/Industry Area: Knowledge Management Application



All Pega applications, including Pega Customer Service and Pega Knowledge, use the Elasticsearch engine. The following instructions describe the search configuration options for Pega Knowledge and Pega Customer Service versions 8.3 and later.

pySearchMethod

The Knowledge Management (KM) article search method is defined by the property pySearchMethod, which specifies how the search is conducted. It takes one of three values: EXACT, CONTAINS, or FUZZY.

The pySearchMethod parameter is specified in different rules that support the Interaction portal search and the help site:

  • For the Customer Service Interaction portal KM searches, the default search configuration is specified in the data transform rule KMSearchParams. To override any out-of-the-box configuration, use the extension point data transform rule KMSearchParamsExtn.
  • For the Pega Knowledge KM help site searches, the search settings are set in the activity KMSearchRDHelpSite, which passes them to the search activity pxRetrieveSearchData.

Available search types

You can use the following search types: Exact, Contains, and Fuzzy.

Exact search

The Exact search matches the keyword exactly against the content of the properties that are being searched. For example, if you search for Male, the search returns all articles in which the exact word Male is present. It does not return articles that contain only female or males, because those words are not an exact match.

Contains search

The Contains search checks whether the searched words are present in any word of the article, as either a partial or a full match. For example, if you search for male, the search also returns results that contain the word FEMALE or MALES. Note that you cannot use the Fuzzy search when the Contains search method is configured.
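The difference between the Exact and Contains behavior described above can be sketched in a few lines of Python. This is only an illustration of the matching rules, not the way Pega or Elasticsearch implements them: the tokenizer, the case-insensitive comparison, and the sample articles are assumptions made for the example.

```python
import re

def tokenize(text: str) -> list[str]:
    """Split text into lowercase word tokens (assumed tokenization)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def exact_match(keyword: str, text: str) -> bool:
    """True when some whole word in the text equals the keyword."""
    return keyword.lower() in tokenize(text)

def contains_match(keyword: str, text: str) -> bool:
    """True when the keyword appears inside any word of the text."""
    needle = keyword.lower()
    return any(needle in token for token in tokenize(text))

# Hypothetical article contents, keyed by a made-up identifier.
articles = {
    "A1": "Male patients intake form",
    "A2": "Female patients intake form",
    "A3": "Guidance for males under 18",
}

exact_hits = [key for key, text in articles.items() if exact_match("Male", text)]
contains_hits = [key for key, text in articles.items() if contains_match("male", text)]

print(exact_hits)     # ['A1']
print(contains_hits)  # ['A1', 'A2', 'A3']
```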

Fuzzy search

The Fuzzy search works like the Exact search, except that it also matches variations of the search keywords, based on the pyPrefixLength value in the Fuzzy search settings. For example, if you search for foot with the prefix length set to 3, the first three letters of the search term must match exactly, and the search then accepts variations of the characters after those three letters. So for the search term foot, the search returns results that contain fool, food, foot, foos, and so on.

pyPrefixLength plays a critical role in the results that the Fuzzy search returns. We recommend setting pyPrefixLength to a value of 3 or greater.
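The role of the prefix length in the foot example can be illustrated with a minimal Python sketch. It assumes that "variations" are measured by Levenshtein edit distance with a fixed limit; the function names, the max_edits parameter, and the word list are hypothetical and are not part of the Pega configuration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def fuzzy_match(term: str, candidate: str, prefix_length: int, max_edits: int) -> bool:
    """The prefix must match exactly; the rest may differ by up to max_edits."""
    if term[:prefix_length] != candidate[:prefix_length]:
        return False
    return edit_distance(term, candidate) <= max_edits

# Hypothetical indexed words.
words = ["fool", "food", "foot", "foos", "boot", "feet", "football"]
matches = [w for w in words if fuzzy_match("foot", w, prefix_length=3, max_edits=1)]
print(matches)  # ['fool', 'food', 'foot', 'foos']
```

Words such as boot and feet are rejected by the prefix check alone, which is why a longer prefix narrows the set of candidates that need a distance comparison.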

For more information about how Elasticsearch implements Fuzzy search queries, go to the Elasticsearch website at elastic.co.

Fuzzy search configuration

The data page D_FuzzyConfig provides the following settings.

Degree of fuzziness

The Degree of fuzziness field selects the maximum edit distance, that is, the number of single-character changes (such as insertions, deletions, or replacements) that are allowed between the search string and a match. If you select Auto, the maximum edit distance depends on the length of the search string:

  • 0 for strings of one or two characters.
  • 1 for strings of three, four, or five characters.
  • 2 for strings of more than five characters.

For example, performing a fuzzy search query for the term catchr with a degree of fuzziness of 1 finds matches like catch (by deleting one character) and catcher (by adding one character), but does not find matches like catches (by adding one character and replacing one, which adds up to an edit distance of 2).
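The Auto thresholds listed above can be written as a small lookup. The function name auto_max_edits is hypothetical; the sketch only restates the three length ranges from the list.

```python
def auto_max_edits(term: str) -> int:
    """Maximum edit distance for a term under the Auto setting."""
    length = len(term)
    if length <= 2:
        return 0  # one or two characters: must match exactly
    if length <= 5:
        return 1  # three to five characters: one edit allowed
    return 2      # more than five characters: two edits allowed

for term in ["ab", "cat", "catch", "catchr"]:
    print(term, auto_max_edits(term))
# ab 0
# cat 1
# catch 1
# catchr 2
```

Under these thresholds, the six-character term catchr gets a maximum edit distance of 2 with Auto, rather than the fixed value of 1 used in the example above.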

Prefix length

In the Prefix length field, you enter the number of initial characters of the search string that must match exactly and to which fuzzy matching does not apply. Increasing this parameter value results in faster search queries.

For example, a fuzzy search query for the term windwo with a prefix length of 3 does not apply fuzzy matching to the first three characters, win, and applies it to the rest of the characters to find matches like window, winter, winner, and so on.

Maximum expansion terms

In the Maximum expansion terms field, you enter the number of alternative spellings for the search string that you want to allow while searching. Decreasing this parameter value results in faster search queries, but might not return as many potential matches.

For example, a fuzzy search query for the term codngi with 2 maximum expansion terms finds at most two matches, such as code and coding.

In Pega Knowledge, these settings are read from the data page D_FuzzyConfig, which is populated by the data transform FuzzySearchConfig. To change the settings above, update this data transform.
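For orientation, the three settings correspond by name to parameters of the Elasticsearch fuzzy query: fuzziness, prefix_length, and max_expansions. The sketch below builds such a query body as a Python dictionary. How Pega Knowledge actually translates D_FuzzyConfig into a query is not described here, so the helper function, the field name title, and the sample values are assumptions for illustration only.

```python
import json
from typing import Union

def build_fuzzy_query(field: str, term: str, fuzziness: Union[str, int],
                      prefix_length: int, max_expansions: int) -> dict:
    """Build an Elasticsearch fuzzy query body for one field (illustrative)."""
    return {
        "query": {
            "fuzzy": {
                field: {
                    "value": term,
                    "fuzziness": fuzziness,            # degree of fuzziness
                    "prefix_length": prefix_length,    # prefix length
                    "max_expansions": max_expansions,  # maximum expansion terms
                }
            }
        }
    }

# "title" is a hypothetical field name; the values are sample settings.
query = build_fuzzy_query("title", "windwo", fuzziness="AUTO",
                          prefix_length=3, max_expansions=50)
print(json.dumps(query, indent=2))
```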

Searching in attachments

Elasticsearch gives you an option to search in work attachments if these attachments contain text data.

  1. On the search landing page, select the Index work attachments check box, and then index the work objects. Typical file attachments are Microsoft Office documents (such as .ppt, .docx, or .xlsx files) and .pdf files. Alternate text that is defined on images is indexed, but the images themselves are not.
  2. In the search settings, set pyIncludeAttachments to true.

Note: By default, the attachment threshold size is set to a maximum of 5 MB. To avoid potential performance issues with returning documents larger than 5 MB, do not change this setting. Any documents attached to an article that are larger than 5 MB are excluded from indexing.
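The effect of the size threshold can be pictured as a simple filter over an article's attachments. This is a sketch of the rule in the note, not of the indexing code: the constant name, the interpretation of 5 MB as 5 * 1024 * 1024 bytes, and the sample files are assumptions.

```python
# Assumption: 5 MB is interpreted as 5 * 1024 * 1024 bytes.
MAX_ATTACHMENT_BYTES = 5 * 1024 * 1024

def indexable_attachments(attachments: list[tuple[str, int]]) -> list[str]:
    """Return the names of attachments that are not larger than the threshold."""
    return [name for name, size in attachments if size <= MAX_ATTACHMENT_BYTES]

# Hypothetical attachments as (file name, size in bytes).
attachments = [
    ("user-guide.pdf", 2_400_000),
    ("pricing.xlsx", 310_000),
    ("training-deck.ppt", 12_000_000),
]
print(indexable_attachments(attachments))  # ['user-guide.pdf', 'pricing.xlsx']
```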

Number of search results

The number of search results is controlled in the D_KMApplicationSettings data page rule. The property KMmaxhelpsitesearchResults specifies the maximum number of results to fetch from Elasticsearch.

Sorting search results

Elasticsearch provides a rank for search results in the property pzRanking. This functionality is available on pySearchResultsWork. In the OBJ-Sort step of the KMSearchRD activity, the technical implementation team can sort the results by the pzRanking property or by any other property.
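Conceptually, the sort step orders the result list by one property. The Python sketch below shows that idea on plain dictionaries; it is not the OBJ-Sort step itself. The result records and their values are invented, and it assumes that a larger pzRanking value means a better match.

```python
# Hypothetical search results; only pzRanking is a name taken from the text.
results = [
    {"id": "KC-101", "title": "Reset your password", "pzRanking": 0.42},
    {"id": "KC-102", "title": "Account lockout policy", "pzRanking": 0.87},
    {"id": "KC-103", "title": "Billing FAQ", "pzRanking": 0.15},
]

# Sort by rank, highest first (assumes a higher rank is a better match).
by_rank = sorted(results, key=lambda r: r["pzRanking"], reverse=True)

# Sorting by any other property works the same way, for example by title.
by_title = sorted(results, key=lambda r: r["title"])

print([r["id"] for r in by_rank])   # ['KC-102', 'KC-101', 'KC-103']
print([r["id"] for r in by_title])  # ['KC-102', 'KC-103', 'KC-101']
```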

Boosting search results

Certain article attributes, such as the title, abstract, and tag, can be boosted to give them more prominence in the search results. For example, if you want articles that have a search match in the article title to rank higher, you can boost the article title attribute so that any article with the search term in its title is returned at the top of the search results.

You can configure this boost setting in the KM Portal, on the Configurations > Search menu.

After you enable the boost search and add the necessary properties and their boost scores, the settings are reflected in the search sort order.
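The effect of boosting on the sort order can be illustrated with a deliberately simplified scoring sketch. It is not how Elasticsearch computes relevance: each attribute that contains the search term simply contributes its boost score. The boost values, the function name, and the sample articles are hypothetical.

```python
def boosted_score(article: dict, term: str, boosts: dict) -> float:
    """Add up the boost of every attribute that contains the search term."""
    needle = term.lower()
    return sum(weight for field, weight in boosts.items()
               if needle in article.get(field, "").lower())

# Hypothetical boost scores for the attributes named in this section.
boosts = {"title": 3.0, "abstract": 1.5, "tag": 1.0}

# Hypothetical articles.
articles = [
    {"id": "KC-2", "title": "Account lockout policy",
     "abstract": "When a password reset is required", "tag": "security"},
    {"id": "KC-3", "title": "Billing FAQ",
     "abstract": "Common invoice questions", "tag": "billing"},
    {"id": "KC-1", "title": "Reset your password",
     "abstract": "Steps to regain access", "tag": "account"},
]

ranked = sorted(articles, key=lambda a: boosted_score(a, "password", boosts),
                reverse=True)
print([a["id"] for a in ranked])  # ['KC-1', 'KC-2', 'KC-3']
```

Because the title carries the largest boost in this example, the article with the term in its title moves ahead of the article that mentions it only in the abstract.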