Using email bots for Natural Language Processing and Machine Learning

From PegaWiki

Description: Using Natural Language Processing and Machine Learning to optimize email bots
Version as of: 8.4
Application: Platform
Capability/Industry Area: Conversational Channels

Email bots let you use rich Natural Language Processing (NLP) and Machine Learning (ML) capabilities to classify emails and extract important data from them. You can then use this information to decide the best course of action, whether that is routing the email to the correct team, creating a case, sending a response, or just making a suggestion to the person responding. All of this can be done without being an expert in data science or writing code.
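As a rough illustration of how a classification result can drive the next step, the sketch below maps a detected topic and its confidence to an action. It is plain Python, not Pega configuration: the `Analysis` class, the `decide_action` function, the action strings, and both thresholds are hypothetical names and values chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class Analysis:
    """Hypothetical result of analyzing one email."""
    topic: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def decide_action(analysis, auto_threshold=0.8, suggest_threshold=0.5):
    """Pick a next step from the topic confidence (thresholds are assumptions)."""
    if analysis.confidence >= auto_threshold:
        # Confident enough to act without a person in the loop.
        return f"create_case:{analysis.topic}"
    if analysis.confidence >= suggest_threshold:
        # Only suggest the topic to the person responding.
        return f"suggest:{analysis.topic}"
    # Too uncertain: leave the email for manual triage.
    return "route_to_triage"


print(decide_action(Analysis("address_change", 0.92)))
print(decide_action(Analysis("address_change", 0.60)))
print(decide_action(Analysis("address_change", 0.10)))
```

In practice the thresholds would be tuned against real emails; the point is only that the same analysis result can lead to automation, a suggestion, or manual handling.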

Consider the correct type of entity

Before you start training, consider which type of entity fits the data you want to extract. A RUTA or Keyword entity type may be much easier to set up and more accurate than a machine learning based entity, for example when the values follow a fixed pattern or come from a known list. However, there is a lot of power in being able to extract data using machine learning, so this article mostly focuses on how you can take advantage of the machine learning capabilities to train models for identifying topics and entities.
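To see why a rule-based or keyword entity can be the simpler choice, consider the sketch below. It is plain Python rather than RUTA script or Pega configuration, and the 10-digit account number format, the product list, and the `extract_entities` function are all assumptions made up for this example. Values like these need no training data at all.

```python
import re

# Assumption for this example: account numbers are exactly 10 digits.
ACCOUNT_NUMBER = re.compile(r"\b\d{10}\b")

# Assumption for this example: a short, known list of product keywords.
PRODUCT_KEYWORDS = {
    "checking": "Checking",
    "savings": "Savings",
    "credit card": "Credit Card",
}


def extract_entities(text):
    """Return (entity_type, value) pairs found by simple rules."""
    entities = []
    # Pattern-based entity: any 10-digit number.
    for match in ACCOUNT_NUMBER.finditer(text):
        entities.append(("account_number", match.group()))
    # Keyword-based entity: case-insensitive lookup in a fixed list.
    lowered = text.lower()
    for keyword, label in PRODUCT_KEYWORDS.items():
        if keyword in lowered:
            entities.append(("product", label))
    return entities


print(extract_entities("Please close savings account 1234567890."))
```

Entities that depend on context, such as a person's name or a free-form reason for a request, are harder to capture with rules like these and are better candidates for a trained model.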

Have an overview of the scope

It’s good to have an overview of the scope of your bot before you get started. If the scope is broad and the different outcomes are relatively distinct, then you will likely need less training data to draw a distinction between them. However, if the bot handles several things that are all very similar (like different kinds of address changes), then you will need more training data to differentiate them. If the outcomes are very similar, it may be better to handle that differentiation at the case level rather than trying to determine the correct case type from the email. For example, it may be better to have a flag on the case for differentiating between commercial and residential address changes, rather than trying to get the NLP to make that distinction.
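The address change example can be sketched as a single topic that opens a single case type, with the commercial or residential distinction stored as a flag that is set later in the case. The code below is plain Python for illustration only: `AddressChangeCase`, `open_case`, `set_address_type`, and the topic name are hypothetical and do not correspond to Pega rules or APIs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AddressChangeCase:
    """Hypothetical case opened for every address change email."""
    email_text: str
    # The flag starts unknown and is filled in during case processing,
    # so the NLP model never has to separate the two variants.
    is_commercial: Optional[bool] = None


def open_case(topic, email_text):
    """Open one case type for the single 'address_change' topic."""
    if topic != "address_change":
        raise ValueError(f"unsupported topic: {topic}")
    return AddressChangeCase(email_text=email_text)


def set_address_type(case, is_commercial):
    """Case-level step (a person or a simple rule) that sets the flag."""
    case.is_commercial = is_commercial
    return case


case = open_case("address_change", "We moved our office to 12 Main St.")
set_address_type(case, is_commercial=True)
print(case.is_commercial)
```

With this design the model only needs enough training data to recognize an address change, which is a much more distinct outcome than the two variants would be on their own.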

Start Training

Once you have an idea of the initial scope of your bot, you can start training your topics and entities. One easy way to start is the Training Data tab of the email channel (Image 2). If “Record training data” is turned on for the channel, this tab shows the text of emails that have already been triaged. You can also easily add new entries manually (animation 1). It’s often easiest to think about the different ways your users will initiate an action and quickly add them for each topic.

Training NLP model data using the Training Data screen in App Studio.

Now that you have data on the Training Data tab, you can select an item, choose the appropriate topic, highlight text and right-click to add entities, and then mark the record as reviewed. Start with 20 entries for each topic and entity as a baseline. At any point you can build the model and start testing it directly from the Training Data screen. This allows you to quickly iterate through the process of building an intelligent bot and evaluating its accuracy.
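If you keep your own list of training records outside the tool, a quick check against the baseline of 20 reviewed entries can look like the sketch below. It is plain Python and does not read from Pega: the record layout (a dict with `topic` and `reviewed` keys), the topic names, and the `training_shortfall` function are assumptions for this example. The same count could be applied to entities.

```python
from collections import Counter

BASELINE = 20  # baseline number of reviewed entries per topic


def training_shortfall(topics, records, baseline=BASELINE):
    """Return {topic: entries still needed} for topics below the baseline.

    Only records marked as reviewed are counted.
    """
    counts = Counter(r["topic"] for r in records if r["reviewed"])
    return {t: baseline - counts[t] for t in topics if counts[t] < baseline}


# Hypothetical training records.
records = (
    [{"topic": "address_change", "reviewed": True}] * 20
    + [{"topic": "complaint", "reviewed": True}] * 5
    + [{"topic": "complaint", "reviewed": False}] * 30
)

print(training_shortfall(["address_change", "complaint", "billing"], records))
```

Here `address_change` has reached the baseline, `complaint` still needs reviewed entries even though many unreviewed ones exist, and `billing` has no entries at all.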

There are other ways to build and enhance your model using Pega’s Prediction Studio, but those are more complex and will be covered in another article.