Testing software end to end is one of those important practices that some organizations don't take as seriously as they should. If you don't spend enough effort on it, you'll have no idea what users and customers are actually experiencing when they use your product.
Even if the unit tests are green and the integration tests are in place, there are endless ways for your software to ruin the customer experience. Problems can stem from software packaging, deployment, or how the target environment is set up. To vouch for a successful user experience, you need to test a nearly identical copy of your app, interacting with it the way a customer would. This is what we refer to as end-to-end (E2E) tests.
Most companies have a QA department whose team checks the product before a feature release or after a bugfix has been deployed. This is time consuming, and it may not even be feasible in modern workflows where you want to deploy automatically whenever somebody merges work into the product's repository. Automation is a must.
Historically, people have tried different approaches to E2E testing. Each has its pros and cons, largely scoped to the capabilities available at the time the tool or framework was built. Legacy frameworks have proven not to be stable enough, resulting in framework-related flakiness, and some lack the capability to test specific cases. The good news is that E2E solutions keep getting better over time.
At Sensor Tower, we follow the Test Pyramid strategy. The capstone of our automated testing infrastructure is the E2E test suite. In the past, we used Cucumber to write test scenarios as behavioral specifications like the following:
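A scenario would read something like this illustrative sketch (the feature and steps here are hypothetical, not from our actual suite):

```gherkin
Feature: Comments
  Scenario: A user posts a comment
    Given I am logged in as a regular user
    When I fill in the comment field with "my new comment"
    And I press "Submit"
    Then I should see "my new comment" in the comments list
```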
At first sight this may seem easy and straightforward, but considerable effort is required under the hood to make it work in practice. Developers need to implement the individual Capybara step definitions; test authors then use those defined steps to write scenarios. Over time, new step definitions need to be added, and existing ones need to be refactored and generalized.
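For illustration, the Capybara step definition behind one of the steps above might look like this hypothetical sketch:

```ruby
# Hypothetical Capybara step definition; the step wording and
# field name are illustrative only.
When('I fill in the comment field with {string}') do |comment|
  fill_in 'comment', with: comment
end
```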
Overall, we were not getting much added value for this effort. Although in theory this approach would enable non-engineers to write test cases, that never materialized in practice at Sensor Tower. In many situations, even simple scenarios run into basic problems that require software engineering and debugging skills beyond what non-engineers can be expected to have.
The Cucumber stack also doesn't come with convenient out-of-the-box tools for debugging and troubleshooting, such as video recording, time tracking, and JS-context-aware debugging.
To summarize, here is why we migrated out of Cucumber:
It adds overhead if you're not strictly following Behavior Driven Development.
Writing and maintaining Cucumber step definitions requires Ruby, which is not a core skill of the front-end team.
There are no out-of-the-box video recording artifacts for failing or flaky scenarios (scenarios that sometimes succeed and sometimes fail).
In practice, the Cucumber/Capybara stack adds little value, since it's not easy to get non-engineers to write feature files even though they look like plain English.
Cypress, on the other hand, has a set of features that help the development team:
Video recording for failing and flaky test cases, equipped with a detailed command log, so steps can be debugged against the recorded video.
Built-in command retries, which avoid the need for explicit waits.
Tests written in JavaScript.
The well-known Mocha framework.
Execution inside the browser, with access to native JS objects, which helps with debugging and troubleshooting.
When used with Cypress.io, dashboards that track:
Test suite duration
Test suite size
Durations at test file level
Flakiness percentage and flaky specs
Migrating a project to a new framework requires a lot of work, since each of the existing tests needs to be converted. Some types of tests are straightforward to convert; others are complex. Over time you will establish patterns that help you identify how to convert specific tests to the new framework. For example, you will need some tricks to handle all the cases that require multiple tabs, and you can remove all dynamic waiting statements by relying on the retry-ability concept.
While doing this, it's a great opportunity to re-evaluate the need for each spec and to understand what each spec covers.
I’d recommend asking the following questions for each spec:
Do I need this spec?
Is it covering the important use cases?
Do we have duplicate cases (coverage overlap)?
On the Cypress side, writing tests is not a difficult process most of the time, but in some situations, it becomes very tricky. In this article, we'd like to share some of the challenges we faced while converting our test suite to Cypress.
When testing dynamically loaded content, we need to wait for the content to fully load. In some frameworks and libraries this is done through special wait commands that wait for a specific element to become visible, for a specific event to be triggered, and so on. Cypress instead has a concept called “retry-ability”: all commands retry their assertions until they are met (within a timeout). This lets you write test code as if things finish loading immediately after you interact with the user interface, and Cypress handles retrying the failing command until it succeeds.
Example:
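A minimal sketch of such a test (the selectors and comment flow here are hypothetical):

```js
cy.get('#comment-input').type('my new comment'); // line 1: write the comment
cy.get('#submit-button').click();                // line 2: triggers an AJAX request
cy.get('#content-container')                     // line 3: retries the assertion
  .should('contain', 'my new comment');          // until the comment renders
```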
In this case, we can write the test as if the comment will render right after clicking the “submit” button, while in reality the third line will fail and keep failing until the AJAX request returns with data and fills #content-container with “my new comment”.
Awesome! This appears to be the magical solution to any asynchronous loading challenge; however, in some situations it may not work as expected. Consider the following example:
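Suppose the markup looks like this hypothetical sketch, where the container's contents are completely re-rendered once the data arrives:

```html
<div class="element-container">
  <!-- replaced wholesale after the data loads -->
  <div class="my-expected-element">...</div>
</div>
```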
It would appear as if it can be tested like:
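Something like this naive sketch (hypothetical selectors again):

```js
// Scope queries to the container, then look for the child inside it:
cy.get('.element-container').within(() => {
  cy.get('.my-expected-element').should('be.visible');
});
```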
This would work only if the item is inserted into .element-container without re-rendering .element-container itself. In situations where .element-container is completely re-rendered, the element object captured by .get('.element-container') becomes detached from the DOM and replaced by a new one. Accordingly, the .get('.my-expected-element') query will wait and retry forever for an element that will never exist.
So, what's the solution? Keep your element queries grouped as much as possible:
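For instance, a single combined selector is re-queried from the root on every retry, so a re-rendered container never leaves you holding a detached node:

```js
// One grouped query instead of container-then-child:
cy.get('.element-container .my-expected-element').should('be.visible');
```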
When writing tests, any kind of flickering is your enemy. A test may pass once, but fail on subsequent executions.
So, when you start to see random "element is detached from DOM" errors, it's important to fix the flickering issues in your web app, but first make sure you don't have retry-ability issues in your testing code, like the one explained above (check this blog post for more info).
Although an element may be clearly visible to you, Cypress might not be able to see it or interact with it. That's because Cypress applies a set of rules to decide whether an element is visible and actionable.
Checks and Actions Cypress Performs (as mentioned in their documentation)
Whenever Cypress cannot interact with an element, it can fail at any of the above steps, and you will usually get an error explaining why the element was not considered actionable.
You can force Cypress to click on something it considers not visible or not actionable by passing force: true, as in .click({force: true}). But this is not good practice: in some cases it can backfire by clicking on some other element that really is invisible.
This usually happens when you use HTML/CSS tricks and workarounds to implement parts of the UI, for example when nested menus are children of their parent elements but are rendered outside the parent's bounding box.
The Cypress test launcher supports parallelization, and Cypress.io will intelligently orchestrate the distribution of specs, but it won't orchestrate the system-level parallelization of starting and stopping Cypress instances. You have to manage the parallel instances of Cypress yourself. Docker orchestration systems may provide this for free, but if you want the instances to run on the same machine without Docker, some extra effort is needed.
The alternative to Docker is to implement your own system for launching N instances of Cypress, passing the required parameters and managing their exit codes and signal handling.
For all of this to work in practice, you need to set the environment variable CYPRESS_trashAssetsBeforeRuns to false, otherwise Cypress instances will clear each other's assets.
It might not be obvious to a novice user, but Cypress will start an X server session to run Chrome (or any configured browser) if there is no active one. This allows it to run properly even if the CI server does not already have an X server running.
If you intend to run tests in parallel with multiple instances, you will need to start and stop X server sessions yourself, or subsequent Cypress executions will reuse the first one. The side effect is that if the first execution finishes early, it terminates its X server, causing all the other executions to fail.
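Putting these pieces together, a launcher could look like the following minimal Node sketch. It assumes xvfb-run is installed and dashboard recording is configured; INSTANCES and BUILD_ID are hypothetical placeholders you would adapt to your CI setup:

```js
// launch-cypress.js: run N Cypress instances in parallel, each under
// its own X server, and fail the build if any instance fails.
const { spawn } = require('child_process');

const INSTANCES = 4; // tune to your staging server's capacity

const runOne = () =>
  new Promise((resolve) => {
    const child = spawn(
      'xvfb-run',
      [
        '--auto-servernum', // give each instance its own X server
        'npx', 'cypress', 'run',
        '--record', '--parallel', // let the dashboard distribute specs
        '--ci-build-id', process.env.BUILD_ID || 'local-run',
      ],
      {
        stdio: 'inherit',
        env: {
          ...process.env,
          // keep parallel instances from clearing each other's assets
          CYPRESS_trashAssetsBeforeRuns: 'false',
        },
      }
    );
    child.on('exit', (code) => resolve(code ?? 1));
  });

Promise.all(Array.from({ length: INSTANCES }, runOne)).then((codes) => {
  process.exit(codes.some((code) => code !== 0) ? 1 : 0);
});
```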
As the name end-to-end implies, the web app should be tested from the outside as a self-contained unit, with no interception of what's happening inside, since the app is being tested as a whole.
But unfortunately, in some cases you may end up with a very flaky set of tests because they depend on many factors, for example the state and availability of your staging/testing DB.
If you can prepare a completely isolated environment that you can reset before testing, it's recommended not to mock or intercept any requests. Otherwise, you may need to intercept requests to learn what's about to be rendered, or use fuzzy testing techniques that avoid setting overly strict expectations.
In some specific cases, you may have no option other than mocking some heavy APIs to gain more stability and performance.
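As a sketch, and assuming a Cypress version that ships cy.intercept, stubbing a hypothetical heavy endpoint with a fixture could look like:

```js
// Serve a canned fixture instead of hitting the real (slow) endpoint:
cy.intercept('GET', '/api/heavy-report', { fixture: 'heavy-report.json' })
  .as('heavyReport');
cy.visit('/reports');
cy.wait('@heavyReport'); // continue once the stubbed response is served
cy.get('.report-table').should('be.visible');
```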
One of the most important characteristics of a test suite is its performance; no one wants to wait an hour for a commit to build. The more specs you add, the longer your build takes. Parallelization helps for sure, but at some point the E2E test suite starts behaving like a load testing suite.
At Sensor Tower, staging environments use a smaller server, and the staging servers that preview Pull Request changes are smaller still. It's really important to tune the number of parallel Cypress instances to match the capabilities of your PR web server, assuming the server itself is already tuned to the maximum available number of web workers (depending on your tech stack). A recommended approach is to monitor request wait times and durations while increasing the parallelization factor; seeing high values there is a good indicator that you've reached the maximum parallelization factor for that web server's size and capabilities.
Our current test suite runs 450 specs in around 8 minutes using 11 parallel Cypress instances targeting a 4 vCPU machine. The duration can be reduced by adding parallel Cypress instances while monitoring your staging server's ability to handle concurrent requests; otherwise, you'll just increase request durations and wait times and end up with the same test suite duration or worse.
Cypress gives some recommendations regarding how much time you'd gain by adding N more machines; below you can see a screenshot from Cypress.io in this regard.
For this estimate to be accurate, Cypress assumes your server's response time stays constant, while in reality the machine size people usually pick for staging may not sustain the same response times with 20 Cypress instances bombarding the endpoints. Keep in mind that no human user is as fast as a Cypress instance, so don't treat 20 instances as equivalent to 20 customers browsing your app.
If you’re going to run Cypress on a CI server, keep in mind that you need to do the following:
Write a wrapper script to launch an X server for your Cypress instances; Xvfb is helpful in this regard (as in the sketch above).
It’s recommended that you run your tests in parallel. To do that, you need to write a wrapper script and use the Cypress.io dashboard.
In the case of parallelization on the same machine, set the environment variable CYPRESS_trashAssetsBeforeRuns to false (as mentioned above).
Don't run a big number of Cypress instances at first setup. Start with a low number and increase over time, as long as the test suite stays stable, until you reach your maximum number of parallel test runs.
Migrating to a new E2E testing framework is not a small project. It's very important to make a plan before you start converting your entire suite. Establish the strategy by prototyping: identify specific types of tests and experiment with their conversion to work out best practices.
While the conversion is ongoing, it's also important to keep a close eye on unexpected challenges and adapt accordingly, adjusting the plan and finding appropriate long-term solutions.
It’s important to thoroughly understand each of the primitives you use. Reading the documentation and testing your assumptions is key.
Some methods belong to features that may be deprecated soon. Ignoring this can leave you in a situation where upgrading your Cypress version incurs extra cost and may force major refactoring in the future.
Migrating to a modern browser end-to-end framework is a big undertaking, but doing so brings benefits that pay dividends for years to come: less flakiness, better observability, faster execution, and faster, clearer test implementation. The increased observability uncovers a set of useful insights that are a game changer, allowing you to spot flaky tests, split long ones, and optimize slow ones.