The business press broadly agrees that chatbots offer numerous potential benefits, and significant funding has poured into chatbot companies to realize them. For this potential to materialize, chatbots need to be successful; however, numerous public chatbot failures could be limiting the growth of bots. Effective testing can reduce chatbot failures. We compared 7 chatbot testing frameworks, which include comprehensive chatbot testing approaches, chatbot testing software and chatbot testing services.
What are important chatbot testing concepts?
Most testing approaches lack standardization because it is hard to quantify the share of conversations that test cases cover, especially before a bot is launched. The aim should be to cover the most likely scenarios thoroughly. For example, Chatbottest is an open source project that provides a database of 120 questions for testing a chatbot and its user experience.
The concept they developed follows a Gaussian (normal) distribution. The test mechanism broadly groups scenarios into three categories: expected scenarios, possible scenarios, and almost impossible scenarios. This scenario testing structure can be mapped to sigma distances from the mean.
Empirically, once almost impossible scenarios, which can be treated as the 3-sigma distance, have been tested, chatbot performance has been observed at roughly a 99% confidence level (about 99.7% of a normal distribution lies within three standard deviations of the mean). Testing further would be costly, since humans can combine language in infinitely many ways.
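To make the sigma mapping concrete, the short Python sketch below computes how much of a normal distribution lies within 1, 2 and 3 standard deviations. Only the link between almost impossible scenarios and the 3-sigma distance comes from the text above. Assigning expected scenarios to 1 sigma and possible scenarios to 2 sigma is our assumption for illustration, and the names `coverage_within` and `SCENARIO_SIGMA` are hypothetical.

```python
import math


def coverage_within(k_sigma: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k_sigma / math.sqrt(2))


# Assumed mapping of scenario categories to sigma distances (illustrative only).
SCENARIO_SIGMA = {
    "expected": 1,
    "possible": 2,
    "almost impossible": 3,
}


def coverage_table() -> dict:
    """Cumulative coverage reached once each category has been tested."""
    return {name: coverage_within(k) for name, k in SCENARIO_SIGMA.items()}


if __name__ == "__main__":
    for name, coverage in coverage_table().items():
        print(f"{name:>18}: {coverage:.2%} of the distribution covered")
```

Under these assumptions the jump from 2 sigma to 3 sigma adds only a few percentage points of coverage, which is one way to read the claim that testing beyond almost impossible scenarios is rarely worth the cost.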
Areas for testing
Chatbottest provides 7 broad categories for testing:
- Personality: Does the chatbot have a clear voice and tone that fits with the users and with the ongoing conversation?
- Onboarding: Do users understand what the chatbot is about and how to interact with it from the very beginning?
- Understanding: Requests, small talk, idioms, emojis… What is the chatbot able to understand?
- Answering: What elements does the chatbot send, and how well does it do so? Are they relevant to the moment and context?
- Navigation: How easy is it to move through the chatbot conversation? Do you sometimes feel lost while speaking with the chatbot?
- Error management: How well does the chatbot deal with the errors that are bound to happen? Is it able to recover from them?
- Intelligence: Does the chatbot have any intelligence? Is it able to remember things? Does it use and manage context as a person would?
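The seven categories above can serve as labels for a small test suite so that pass rates are reported per area. The Python sketch below is a minimal illustration and not Chatbottest's own tooling: `BotTestCase`, `run_suite`, `toy_bot` and the three sample cases are all hypothetical, and the bot is assumed to be a plain function that takes a message and returns a reply.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

CATEGORIES = (
    "personality", "onboarding", "understanding", "answering",
    "navigation", "error management", "intelligence",
)


@dataclass
class BotTestCase:
    category: str                  # one of CATEGORIES
    user_message: str              # what the tester sends
    passes: Callable[[str], bool]  # check applied to the bot's reply


def run_suite(bot: Callable[[str], str],
              cases: List[BotTestCase]) -> Dict[str, Optional[float]]:
    """Return the pass rate per category (None if a category has no cases)."""
    totals = {c: [0, 0] for c in CATEGORIES}  # category -> [passed, run]
    for case in cases:
        if case.category not in totals:
            raise ValueError(f"unknown category: {case.category}")
        reply = bot(case.user_message)
        totals[case.category][1] += 1
        if case.passes(reply):
            totals[case.category][0] += 1
    return {c: (p / n if n else None) for c, (p, n) in totals.items()}


def toy_bot(message: str) -> str:
    """Stand-in bot used only to exercise the suite."""
    if "hello" in message.lower():
        return "Hi! I can help you track an order."
    return "Sorry, I did not understand that. Could you rephrase?"


cases = [
    BotTestCase("onboarding", "Hello", lambda r: "help" in r),
    BotTestCase("error management", "asdfgh", lambda r: "rephrase" in r),
    BotTestCase("understanding", "👋", lambda r: r.startswith("Hi")),
]

results = run_suite(toy_bot, cases)
```

Categories without any test cases come back as `None`, which makes gaps in coverage visible instead of silently reporting them as passing.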
What are chatbot testing frameworks to put these concepts to practice?
|Framework/software|Source code|Contributors on GitHub|Last commit on GitHub|Notes|
|---|---|---|---|---|
|Botium.at|Open|10|8/Dec/2019|Chatbot test automation|
|chatbottest.com|Open|3|8/Oct/2018|Set of questions to standardize chatbot testing|
|dimon.co|Proprietary|N/A|N/A|Chatbot test automation. Dimon has integration with major platforms such as Slack, Facebook Messenger, Telegram, and WeChat|
|qbox.ai|Proprietary|N/A|N/A|NLP training data optimization|
|Zypnos.com|Proprietary|N/A|N/A|Regression testing for chatbots|
What are the limitations for chatbot testing?
Continuous effort is required to ensure that tests remain up-to-date
While standardized tests are crucial, they need to evolve in line with the development of the bot. For example, if we create a test for a specific expression ("talk to operator") to handle customers who want to reach a customer service agent, we need to prepare equivalent tests in other languages when our bot is launched internationally.
Tests losing their relevance in this way is a common phenomenon. Goodhart’s law states that once a social or economic measure is turned into a target for policy, it will lose any information content that had qualified it to play such a role in the first place. Therefore, keeping the testing process as dynamic as possible makes it more meaningful and reduces the fragility of the chatbot.
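One way to keep the "talk to operator" test from going stale is to store test phrases per locale and flag any launched locale that has none. The Python sketch below assumes a hypothetical `detect_intent(text, locale)` function and a hypothetical intent name `handoff_to_agent`. The Spanish and French phrases are illustrative translations, and `stub_detect_intent` is a deliberately naive English-only stand-in for a real bot.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical test phrases per locale for the agent hand-off intent.
HANDOFF_PHRASES: Dict[str, List[str]] = {
    "en": ["talk to operator", "I want to speak to an agent"],
    "es": ["hablar con un operador"],
    "fr": ["parler à un opérateur"],
}


def missing_locales(launched: List[str],
                    phrases: Dict[str, List[str]]) -> List[str]:
    """Locales the bot is live in that have no test phrases yet."""
    return sorted(loc for loc in launched if not phrases.get(loc))


def failing_phrases(detect_intent: Callable[[str, str], str],
                    phrases: Dict[str, List[str]],
                    expected: str = "handoff_to_agent") -> List[Tuple[str, str]]:
    """(locale, phrase) pairs for which the bot returns the wrong intent."""
    failures = []
    for locale, locale_phrases in phrases.items():
        for phrase in locale_phrases:
            if detect_intent(phrase, locale) != expected:
                failures.append((locale, phrase))
    return failures


def stub_detect_intent(text: str, locale: str) -> str:
    """Naive English keyword matcher standing in for the bot under test."""
    keywords = ("operator", "agent")
    if any(k in text.lower() for k in keywords):
        return "handoff_to_agent"
    return "fallback"


untested = missing_locales(["en", "es", "fr", "de"], HANDOFF_PHRASES)
failures = failing_phrases(stub_detect_intent, HANDOFF_PHRASES)
```

With the English-only stub, the Spanish and French phrases fail and German is reported as untested. That is the kind of drift a test suite written only for the original launch language would miss.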
Testing can create a false sense of security
As explained above, static tests lose their relevance over time, but a large number of tests, whether up-to-date or not, creates a sense of security. However, as tech leaders know quite well, only the paranoid survive.
Check out our previous articles for more on chatbots:
- General guide about chatbots
- Objective metrics for measuring the performance of a chatbot so you can measure results of testing
- Chatbot success stories. We recommend reading these, since success stories are rare and can be studied to learn the drivers of chatbot success
- Chatbot testing guide focusing on A/B testing