One major pitfall of building chatbots is underestimating the importance of performance. The UI of a chatbot is usually very simple, so it’s easy to forget the complexity behind these virtual assistants. A slow chatbot might be accepted for home projects, but a company can not neglect it. Bad performance is a serious UX killer.
Performance Testing is the key to ensure that your chatbot is responsive under high load. Worst case is maybe a chatbot that does not answer at all, because it was not able to recover after high load caused by clients or some Ddos attack.
To collect some basic knowledge about the performance of your chatbot we suggest starting with a stress test. Stress testing basically means that we are starting with a small number of parallel users that gets gradually increased during the execution. After some minutes it will answer two fundamental questions:
First, the main output is how many parallel users can be served without slowing down the system. Determining which are the slow parts requires of course human interaction.
Second, but not less important output can be detected errors. Usually it is the first production-near scenario for a chatbot experiencing heavy load.
Let’s see some more real life examples…
First Stress Test
By default all parallel users are having a very simple conversation with the bot, saying just ‘hello’ for example. They wait for an answer and repeat the conversation until the end of the test is reached. Any response coming from the chatbot is accepted. So if the chatbot answers sometimes with ‘Hello User’ sometimes with ‘I don’t understand’, it has no impact on the performance results. The main goal here is to get an answer, while a http error code or lack of answer is interpreted as failure of course.
In order to start a Stress Test, we have to select the “Performance Tests” tab in our Test Project, and choose Stress Test there.
The duration can be 1 minute. In terms of parallel users lets use 5 for starting, 1000 for ending. (You can put 1000 there by editing the field directly).
“Required percentage of Successful Users?” can be 0 percent, because we don’t want to stop the test on failed responses (at the moment).
I have two goals with these settings; getting a first impression about the max parallel users the chatbot is able to serve, and discovering how the chatbot breaks on extreme load. In production your chatbot could have much more load than expected, so you have to test it.
This report contains all errors detected by Botium Box. The types of failures are:
Timeout (Chatbot does not answer at all, or too slow)
Invalid response (API limit reached, server down, invalid credentials)
Response with unexpected content (We send ‘hi’, and expect ‘Hello!’, but the chatbot answers with ‘Sorry some error detected come back later!’)
In my case there is just one kind of error in the report:
2168/Line 6: error waiting for bot — Bot did not respond within 10000ms
After 10s Botium Box has given up waiting for the response. The chatbot answer time was over the limit, or did not answer at all. Slowing down is common on extreme loads, but receiving no answer is more critical, that we can’t ignore.
By checking the Response Times Chart in our stress test results, I notice that the chatbot is just too slow. The response time is increasing until it reaches 10s and most of the responses are failed. Next step is now to check chatbot side metrics/logs or increase the timeout in Botium Box. For the sake of simplicity I believe that my impression is correct.