Difference between revisions of "Queuing and routing customer requests in Pega Chat"

From PegaWiki
(Made several changes to punctuation and spelling (removed hyphen from wait time, for example). The preferred spelling is "queuing," but the title cannot be edited.)
Tag: Visual edit

Revision as of 16:46, 21 September 2020

Queuing and routing customer requests in Pega Chat

Description Recommendations for balancing chat queues and agent workloads.
Version as of 8.4
Application Pega Chat
Capability/Industry Area Customer service



Overview

For text-based interactions between customer service representatives (CSRs) and customers, a chatbot is usually the first point of contact, and it is always available. When the customer asks for escalation to a human agent (or the chatbot triggers it), Pega queuing and routing logic takes over.

Queuing

The first expected action from the customer is queue selection, which is generally a proxy for intent expression. A chat queue is staffed by one or more agents, each with the capacity to work on one or more text-based customer requests at the same time. Concurrency is configurable at the agent, queue, and global levels.

Configuring queues

Separate queues are needed for business functions that need a distinct set of skills in a service representative. For example, billing could be a queue to direct all billing-related inquiries. It is recommended to avoid configuring too many queues because this might result in spreading the available agent capacity too thinly across the queues. Because queues cannot be prioritized, all incoming requests are given equal weight, which might lead to more urgent customer requests remaining queued for longer periods of time. It is also possible for customers to be denied a route to an agent on a particular queue because no agents have any free capacity.

If customer wait times and rejection rates increase even with a sizable agent pool, the cause could be too many queues. Five or fewer is usually a good number to target, but it depends on other factors such as handle times, number of agents, and concurrency limits.

Omnichannel queues

Queues work in the same way for all messaging channels, including asynchronous channels such as Facebook Messenger and WhatsApp. Unless it is essential to have a queue specific to the source channel, it is recommended to use the same queue for a particular business function across multiple channels. This helps prevent the proliferation of queues, whose implications are detailed above.

Customer wait time on queues

While queuing customer requests, Pega considers not only the current unutilized capacity of the agent pool, but also the potential capacity that will be freed up to consume queued chats within a configured maximum wait time of a customer. By always estimating the wait time for a customer and evaluating this estimate against the maximum wait time (as configured by the business, keeping customer experience in mind), you ensure that an optimal balance is achieved between responsiveness to customer requests and contact center capacity at any given point in time.

Maximum wait time

This configuration ensures that your customers will never spend egregiously long times waiting in the queue to be connected with an agent. Setting too low a limit will lead to customers being denied a place in the queue too often. So, it is important to find the optimal configuration after taking into account your business objectives.

It is recommended to configure the maximum wait time to a value greater than the average handle time of customer interactions. A wait time between 300 and 600 seconds is generally agreeable to most customers. For example, setting the maximum wait time to 600 seconds ensures that when customers are expected to wait for longer than 10 minutes, the chat application denies them service and asks them to retry later.
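The admission check described above can be sketched as follows. This is a minimal illustration and not Pega's implementation: the names `QueueDecision` and `decide_admission` and the customer-facing messages are hypothetical, and the expected wait is assumed to be supplied by the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueDecision:
    admitted: bool
    message: str


def decide_admission(expected_wait_s: float, max_wait_s: float = 600) -> QueueDecision:
    """Admit the customer only if the expected wait does not exceed the limit.

    Hypothetical helper: the 600-second default mirrors the example in the text.
    """
    if expected_wait_s <= max_wait_s:
        minutes = expected_wait_s / 60
        return QueueDecision(True, f"Estimated wait: about {minutes:.0f} min.")
    # Expected wait is longer than the configured maximum: deny and ask to retry.
    return QueueDecision(False, "All agents are busy. Please try again later.")
```

With the 600-second limit, a customer with an expected wait of 300 seconds is queued, and one with an expected wait of 601 seconds is asked to retry later.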

Picture 1.png

Expected wait time

Key factors that impact the expected wait time:

  • Capacity of the agent pool at any point in time (the number of active agents multiplied by the concurrency allowed for each agent on the queue)
  • Number of queued chats ahead of the customer in question
  • Number of active chats that the agent pool is currently working on
  • Average time to handle a single chat

A good balance of these two configurations (image above) results in the best data set to use for computing the expected wait time. Settling for too small a set of previous interactions or waiting for too large a set can skew the expected wait time. These settings are available in App Studio > Settings > Chat and Messaging > Chat and messaging configuration.
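To make the relationship between the factors listed above concrete, the following is a rough sketch of one way an expected wait time could be derived from them. The formula is an assumption for illustration only (a steady-state approximation in which the pool frees one slot every average handle time divided by capacity); the source does not state how Pega computes the estimate, and the function name and parameters are hypothetical.

```python
import math


def estimate_wait_seconds(
    active_agents: int,
    concurrency_per_agent: int,
    active_chats: int,
    queued_ahead: int,
    avg_handle_time_s: float,
) -> float:
    """Rough expected wait for a newly queued customer (illustrative only)."""
    # Pool capacity: active agents multiplied by the allowed concurrency.
    capacity = active_agents * concurrency_per_agent
    if capacity <= 0:
        return math.inf  # nobody can take the chat
    free_slots = max(capacity - active_chats, 0)
    if queued_ahead < free_slots:
        return 0.0  # a slot is free right now
    # The customer is the Nth person waiting for a slot to be freed.
    position = queued_ahead - free_slots + 1
    # Assumption: on average, one slot frees every avg_handle_time_s / capacity.
    return position * avg_handle_time_s / capacity
```

For example, with 5 agents at a concurrency of 2, all 10 slots busy, 4 chats queued ahead, and an average handle time of 300 seconds, the sketch estimates 5 × 300 / 10 = 150 seconds.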

Routing

Workload-based routing

Workload-based routing, which is the default option, works best where customer interactions are low in complexity and high in volume. Selecting this option routes new requests to agents who have fewer active chats. Choose this option if your objective is to distribute incoming customer requests uniformly among all available agents. Workload-based routing is applicable where:

  • customer issues are generic, and not differentiated or complex enough to require a very specific set of skills
  • agent compensation is directly tied to the amount of work that agents have handled, so it is important to distribute work uniformly
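The selection rule for workload-based routing can be sketched as follows. This is a minimal illustration under stated assumptions, not the product's code: the `Agent` class, its fields, and `pick_least_loaded` are hypothetical names, and ties are broken by list order.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Agent:
    name: str
    active_chats: int
    max_concurrent: int  # concurrency limit that applies to this agent


def pick_least_loaded(agents: Sequence[Agent]) -> Optional[Agent]:
    """Return the available agent with the fewest active chats, if any."""
    available = [a for a in agents if a.active_chats < a.max_concurrent]
    if not available:
        return None  # no free capacity on this queue
    return min(available, key=lambda a: a.active_chats)
```

Given agents with 2, 1, and 3 active chats and a limit of 3 each, the sketch selects the agent with 1 active chat; the agent at the limit is never considered.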

Skill-based routing

The skill-based routing option routes new requests to the CSR with the highest skill level of all the CSRs who are available to take on more requests. Choose this option if customer interactions are high in complexity and low in volume. Skill-based routing is applicable where:

  • customer issues are complex and non-generic, requiring differentiated skills in a CSR
  • SLAs must be met, so customer issues require the attention of the most skilled CSR available
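By contrast with workload-based routing, the skill-based rule ranks available CSRs by skill rather than by load. The sketch below assumes a single numeric skill level per CSR for the queue in question; `SkilledAgent` and `pick_most_skilled` are hypothetical names, not Pega APIs.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class SkilledAgent:
    name: str
    skill_level: int      # assumed: higher means more skilled
    active_chats: int
    max_concurrent: int


def pick_most_skilled(agents: Sequence[SkilledAgent]) -> Optional[SkilledAgent]:
    """Return the most skilled CSR who can still take on another request."""
    available = [a for a in agents if a.active_chats < a.max_concurrent]
    # max() with default=None handles the case where nobody is available.
    return max(available, key=lambda a: a.skill_level, default=None)
```

Note that the most skilled CSR overall is skipped when already at capacity; the request then goes to the next most skilled CSR who has free capacity.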

Third-party routing

In cases where you need to support blended agents, you can opt for third-party routing. With this option enabled, the responsibility for routing incoming chat requests is delegated to a third-party routing service, which could be the same service that handles the routing of incoming calls. This type of routing helps centralize your routing logic in an external service to support agents who can handle both call and text-based customer interactions.

Concurrency limits

Service representatives are typically expected to handle multiple text-based conversations at the same time (concurrently). Pega Chat allows you to define this limit at three different levels:

  • Global: a value that applies to all the CSRs handling text-based interactions
  • Queue: a value that can be configured at the queue level to specify the maximum number of interactions, on the particular queue, that a CSR can concurrently handle
  • Agent: an option for service managers to define concurrency limits at the individual CSR level to suit the agent's experience and competence
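The source does not state how the three levels interact, so the following sketch rests on two explicit assumptions: an agent-level limit, when set, replaces the global limit as the CSR's overall cap, and a queue-level limit caps only the chats that the CSR handles on that queue. The function name and parameters are hypothetical.

```python
from typing import Mapping, Optional


def can_accept(
    active_by_queue: Mapping[str, int],
    queue: str,
    global_limit: int,
    queue_limits: Mapping[str, int],
    agent_limit: Optional[int] = None,
) -> bool:
    """Check whether a CSR may take one more chat from the given queue."""
    # Assumption: the agent-level value overrides the global value.
    overall_limit = agent_limit if agent_limit is not None else global_limit
    if sum(active_by_queue.values()) >= overall_limit:
        return False
    # Assumption: the queue-level value caps chats on that queue only.
    queue_limit = queue_limits.get(queue)
    if queue_limit is not None and active_by_queue.get(queue, 0) >= queue_limit:
        return False
    return True
```

Under these assumptions, a CSR with two active Billing chats, a global limit of 3, and a Billing queue limit of 2 cannot take another Billing chat but can still take a chat from a different queue.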

Conditional screen pop behaviors

To ensure that the wait time estimates communicated to the customer are honored, it is essential that agents accept chat offers as expected. A level of certainty can be achieved by enabling the three configurations related to screen pop behaviors at App Studio > Settings > Chat and Messaging > Routing.

Intelligent routing

It is recommended to use the metadata captured for each messaging interaction to automatically direct requests to specific queues. Identified language, message type, and channel data can be used to select a queue based on the intelligent routing configurations. Using the metadata avoids the need to expose overly specific queue names to customers, for example, Billing-German-Twitter-Public. The customer can simply be shown the Billing queue, and the metadata can be used to select the more specific queue internally.
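The idea of showing a generic queue and resolving the specific queue internally can be sketched as a lookup keyed on the interaction metadata. The rule table, the `resolve_queue` function, and the metadata values are hypothetical; in the product, these rules are set up through the intelligent routing configurations rather than in code.

```python
from typing import Dict, Tuple

# Hypothetical rule table: (selected queue, language, channel, message type)
# mapped to the internal queue name.
RoutingKey = Tuple[str, str, str, str]

ROUTING_RULES: Dict[RoutingKey, str] = {
    ("Billing", "German", "Twitter", "Public"): "Billing-German-Twitter-Public",
}


def resolve_queue(
    selected_queue: str,
    language: str,
    channel: str,
    message_type: str,
    rules: Dict[RoutingKey, str] = ROUTING_RULES,
) -> str:
    """Map the customer-facing queue to a more specific internal queue."""
    key = (selected_queue, language, channel, message_type)
    # Fall back to the queue the customer selected when no rule matches.
    return rules.get(key, selected_queue)
```

A German-language public message on Twitter for the Billing queue resolves to Billing-German-Twitter-Public, while any combination without a matching rule stays on the generic Billing queue.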