Our turning point came when we released additional features two months after the official release.
Immediately after the stable release, we found three bugs. They had slipped through because of human error during manual E2E testing and because we had failed to verify the affected parts after a hotfix. Fortunately, there was no direct impact on customers, but I felt an enormous sense of responsibility for shipping a release that contained three bugs.
The MVP is gaining acceptance, and the number of customers is increasing. We plan to add even more features as we work towards PMF, but if we keep doing things the same way, sooner or later a critical bug will reach customers.
The number of items we check in E2E testing has grown to 900. So far, we've managed to reach a stable release within two days of deploying to the staging environment, but this release cycle could come crashing down on us at any point.
For all of these reasons, we decided to allocate 20% of our development resources to quality assurance and writing tests.
Ensuring Quality while Maintaining Speed
Aim for 70% Unit Testing?
If you decide to write tests, conventional wisdom says to climb the Testing Pyramid from the bottom, aiming for 70% unit tests first. For us, though, given our phase, aiming for 70% wasn't the right move.
I think aiming for 70% is fine if the product is highly likely to win in the market and you already have enough QA staff. It's also an option for business systems that can't be rolled back easily, or for systems where even a single bug would be a catastrophe, such as those in medicine, government, or elections.
Since SaaS is used in daily business operations, customers suffer real inconvenience whenever our service stops. Quality is vital to a product, but if you can't ship useful features or improvements, the customer experience deteriorates, and that can kill the business.
Startups may be full of dreams, but they are always short on engineering resources. Dealing with bugs and regressions is time-consuming, so it's important to inch forward while identifying the areas where writing a test actually accelerates development.
We don't have a defined percentage. Instead, we identify the areas prone to bugs and regressions, plus the features where a bug would be critical, and spend 20% of our resources implementing unit tests and API tests there.
By covering everything else with E2E testing, we believe we can keep our speed while maintaining a certain level of quality.
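To make this concrete, here is a minimal sketch of the kind of API test we mean for a bug-prone area. It assumes a Node.js/TypeScript stack with Jest and supertest; the app module, endpoint, and payloads are hypothetical illustrations, not our actual product code.

```typescript
// invoice-api.test.ts -- a hypothetical API test guarding a bug-prone endpoint.
// Assumes Jest + supertest; the app, route, and payload shape are illustrative.
import request from "supertest";
import { app } from "../src/app"; // hypothetical Express app

describe("POST /api/invoices", () => {
  it("rejects an invoice with a negative amount", async () => {
    const res = await request(app)
      .post("/api/invoices")
      .send({ amount: -100, currency: "JPY" });
    expect(res.status).toBe(400); // regression guard for a past validation slip
  });

  it("creates a valid invoice", async () => {
    const res = await request(app)
      .post("/api/invoices")
      .send({ amount: 5000, currency: "JPY" });
    expect(res.status).toBe(201);
    expect(res.body.id).toBeDefined();
  });
});
```

A handful of tests like this, placed where a regression would hurt most, costs far less than re-verifying the same flows by hand every release.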
Introducing an E2E Test Automation Platform
Aside from implementing unit tests, we also needed to figure out how to manage our bloated E2E test suite. Manually testing 900 items in two days was already overwhelming, and we were asking people outside the development team for help.
So, we introduced a no-code E2E test automation tool. It's a capture-and-replay tool: it records the operations of an E2E test and then executes them automatically.
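As a rough mental model (our tool is no-code, so this is only an analogy): a recorded session behaves much like a scripted browser test. The sketch below uses Playwright to show what a replayed flow effectively does; the URL, labels, and flow are hypothetical.

```typescript
// A recorded login-and-navigate flow, expressed as code. The capture step
// produces something equivalent; replay simply re-executes these actions.
import { test, expect } from "@playwright/test";

test("log in and open the invoice list", async ({ page }) => {
  await page.goto("https://app.example.com/login"); // hypothetical URL
  await page.getByLabel("Email").fill("qa@example.com");
  await page.getByLabel("Password").fill("********");
  await page.getByRole("button", { name: "Log in" }).click();
  await expect(page).toHaveURL(/dashboard/);
  await page.getByRole("link", { name: "Invoices" }).click();
  await expect(page.getByRole("heading", { name: "Invoices" })).toBeVisible();
});
```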
After using it, I found that a no-code E2E test automation tool can change the way we climb the traditional Testing Pyramid and how we allocate resources. Traditionally, only engineers could implement tests, so the only way to climb the pyramid was from the bottom.
With a test automation tool, however, engineers can climb from the bottom while non-engineers descend from the top. There are also areas where unit tests can substitute for these E2E checks, so engineers can keep testing accordingly.
Some people have told us not to use capture replay, but I think what they actually mean is that we shouldn't rely on it just because it looks convenient; we should treat it like tempting ice cream. Even automated, it's still an E2E test: prone to breaking, slow, and hard to debug when an error occurs. Our way of doing things here at LayerX is to know the limitations of tools and SaaS, and use them anyway.