Due to the key advantages discussed in the previous section, our solution - Test Lab - takes the A/B testing approach to optimizing applications. In this section, we will walk through a more detailed introduction to A/B testing, its key use cases, and implementation options.
Small changes can have a big impact
In established applications, even the smallest changes can have a significant impact that may not be predictable or obvious. Presenting an application change to a “test” audience of users and observing their behavior adds empirical data to the process of making improvements, which can increase conversion rates or encourage other desired user behavior.
Google famously tested 41 different shades of blue for ad links, resulting in a final color choice that added nearly $200M in revenue. As you will see in the upcoming discussion, A/B testing allows companies to make informed decisions about features, design choices, or even hardware based on quantitative data points and analytics. Changes based on these decisions can increase desired user behavior and may help identify more efficient or cost-effective solutions. A/B testing prior to rollout also reduces the risk of making a detrimental design choice based on the aesthetic preferences of a designer, developer, or executive rather than actual user behavior.
Use Cases for A/B Testing
Although A/B testing is most commonly used to test visual elements of a website or application, it can aid in a much wider range of decisions. Use cases for A/B testing can include not only design changes but also feature experimentation, changes to hardware or backend software, and testing of third-party APIs and services.
UI and Design Changes
While simple decisions about a color or the shape of a button might seem minor, these design choices can have significant consequences. As mentioned above, in 2009, Google famously implemented a test in which they tested 41 different shades of blue for their links and compared user behavior with each variant. Not only did they identify the shade that users preferred, but they also predicted that making the switch to that shade would result in an additional annual revenue of about $200 million. This may be an extreme example, but it does illustrate that, at scale, even subtle changes can have a dramatic impact on user behavior. Rather than leaving these changes up to guesswork, it is worthwhile to obtain actual user analytics before deploying a potentially costly “minor” design change to the entire user-base.
Larger changes, such as adding a search bar or a “like” feature to a website, may require considerable up-front engineering cost to implement, and iterating on them after launch can be just as expensive. A/B testing allows developers to expose a new feature to a small subset of users and evaluate their response prior to a wide rollout. This approach is far less risky than releasing a new feature to the entire user base, only to find that it has negatively impacted user behavior, and it helps take the guesswork out of rolling out and refining new features.
Changes to Hardware or Backend Software
A/B testing can also be used to test other types of variants that do not result in a visual change for the user, including hardware or backend software changes. Hardware changes, in particular, could yield cost savings for a company. For example, the slightly increased latency of a downgraded server or alternate service may not be as detrimental to user conversion as anticipated. For other apps, it may be best to upgrade to the most performant architecture the company can afford to maximize potential user conversions – a 100ms delay in loading can hurt conversion rates by as much as 7% according to a 2017 Akamai study. A/B testing can be used to determine which changes are less likely to impact a user's experience.
Testing APIs / Third-Party Services
It is often efficient and economical to rely on APIs and third-party services for portions of an application. Building and maintaining complex features and functionalities from scratch can be time-consuming and expensive. APIs and third-party services offer pre-built solutions that can be integrated into an application, saving development time and resources.
When considering third-party services, there may be many different options with distinct features, computational and financial costs, and user interfaces. As an example, there may be a significant difference in conversion rates if you allow users to utilize PayPal for their purchases as opposed to only credit cards. There are trade-offs in terms of the cost of using these services, but the increased conversion rate may outweigh the increased cost. Testing these changes in an A/B test prior to rolling them out for all users allows for a clear understanding of the risks and rewards.
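As a rough sketch of how the results of such a test might be compared, the following applies a two-proportion z-test to conversion counts for a hypothetical credit-card-only variant (A) and a variant that also offers PayPal (B). All numbers are invented for illustration:

```javascript
// Two-proportion z-test: is the difference between two conversion rates
// statistically meaningful, or plausibly just noise?
function twoProportionZ(convA, totalA, convB, totalB) {
  const pA = convA / totalA;
  const pB = convB / totalB;
  const pPool = (convA + convB) / (totalA + totalB); // pooled rate
  const se = Math.sqrt(pPool * (1 - pPool) * (1 / totalA + 1 / totalB));
  return (pB - pA) / se;
}

// Hypothetical results: 200/5000 conversions for A, 260/5000 for B.
const z = twoProportionZ(200, 5000, 260, 5000);
// |z| > 1.96 corresponds to significance at the 5% level (two-tailed).
console.log(z.toFixed(2), Math.abs(z) > 1.96 ? 'significant' : 'not significant');
```

In practice, an experimentation platform would also account for sample-size planning and repeated peeking at results, but the core comparison is this simple.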
Implementing A/B Testing
As A/B testing becomes more popular as a way to improve a website or application, it is important to consider the different ways in which tests can be implemented. Depending on the application, A/B tests can be implemented on the client side, on the server side, or at the CDN level, each of which offers unique advantages and challenges.
Client-Side Implementation
In client-side A/B testing, the original page is served to every user, and JavaScript running in the browser modifies the page for users in the test group after it loads. The user sees the modified version of the page and interacts with it in the usual way.
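A minimal sketch of this mechanism, with the page modeled as a plain object so the logic is runnable outside a browser (in a real client-side test this would rewrite the DOM via selectors; the colors and copy here are hypothetical):

```javascript
// Sketch of a client-side variant swap. The script runs after the
// original markup has already rendered, which is why test-group users
// can briefly see the old version before the change is applied.
function applyVariant(page, assignment) {
  if (assignment === 'B') {
    page.buttonColor = '#1a73e8';        // test variant: new color
    page.buttonText = 'Start free trial'; // test variant: new copy
  }
  return page; // control ('A') users see the page unchanged
}

const page = { buttonColor: '#cccccc', buttonText: 'Sign up' };
console.log(applyVariant(page, 'B'));
```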
A key advantage of client-side implementation is that it’s easy for developers, as there are many existing third-party services (for example, Google Optimize, Optimizely, and VWO) that support this type of implementation. There is no inherent need to build a testing platform from scratch, and many of these tools can be used with little or no developer expertise.
On the downside, A/B tests with client-side rendered applications can cause a jarring user experience, often called a “flicker” or flash of original content: the page must be re-rendered for the test group after the original version has briefly appeared on screen.
Client-side implementation can also have an undesirable impact on sites that utilize React, Angular, and other libraries and frameworks, as the real DOM will no longer match the virtual DOM for users in the test group. When the two fall out of sync, the result can be unexpected behavior, errors, or bugs that do not accurately reflect the performance of the original web application.
Server-Side Implementation
Server-side A/B testing happens on the server, before the page is served to the user. It can be used to test any aspect of a website or application that the server controls, including content, layout, and functionality.
One advantage of this approach is that the server has full access to user data, so you can select specific users for testing based on criteria you determine to be relevant. Client-side A/B testing generally only has access to data available in the user's browser, such as cookies or local storage, whereas the web server can draw on a much wider range of user data, including the user's IP address, session data, and other information not generally available in the browser. This additional data can support more detailed and accurate analysis of user behavior and performance metrics.
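As a sketch of how server-side assignment might use such data, the following enrolls only users matching hypothetical eligibility criteria (the account-age and country fields are invented for illustration) and then splits eligible users deterministically:

```javascript
// Sketch: server-side assignment can use data the browser never sees,
// such as account records or session history. Field names are hypothetical.
function assignVariant(user) {
  // Only enroll users matching criteria the business deems relevant,
  // e.g. established accounts in a single market.
  const eligible = user.accountAgeDays >= 30 && user.country === 'US';
  if (!eligible) return 'control';
  // Stable split on the server using the user's numeric ID.
  return user.id % 2 === 0 ? 'A' : 'B';
}

console.log(assignVariant({ id: 7, accountAgeDays: 90, country: 'US' }));
```

Because the decision is made before the response is rendered, the user receives only the variant they were assigned, with no client-side rewrite and no flicker.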
Server-side A/B testing platforms tend to be custom-built rather than provided by third parties for several reasons:
Customization: Server-side A/B testing often requires customization to fit the specific needs of the application or website. Custom-built platforms can be tailored to the specific requirements of the business and can be designed to integrate with existing systems and infrastructure.
Control: Custom-built A/B testing platforms provide more control over the testing process, including the ability to adjust testing parameters, add or remove tests, and modify the underlying code.
Security: Server-side A/B testing involves sensitive data and code that is executed on the server. Custom-built platforms can be designed with security in mind and can be audited and tested to ensure that they meet the highest security standards.
Scale: Large-scale A/B testing can involve processing massive amounts of data and traffic. Custom-built platforms can be optimized for performance and scalability to handle large volumes of traffic and provide real-time results.
Cost: Third-party A/B testing platforms may come with additional costs, such as subscription fees or transaction fees. Custom-built platforms can provide cost savings over the long term, especially if the business has specific requirements or needs that are not met by third-party solutions.
Since there are fewer third-party options for implementing server-side A/B testing, more developer expertise is required. Peter Koomen, co-founder of Optimizely, summarized this key trade-off of client- versus server-side A/B testing:
"The advantage of testing on the client side is speed and simplicity. You can test a lot of changes quickly without much initial investment. On the other hand, testing on the server side is both more work and generally more powerful."
CDN-Level Implementation
At the CDN level, A/B testing is implemented by deploying the original and test versions of a site as entirely separate branches and letting the CDN's edge servers split traffic between them. Since there are entirely separate branches of code for the existing version and the test version(s), it is easier to fully switch the entire user base to the successful version after the experiment is complete. This approach is ideal for static sites that are already hosted on a CDN, as many CDNs already offer this service. In addition, since users are sent directly to one site or the other, there is no potential “flash” of the original site for users in the test group.
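A rough sketch of edge-level routing, written as a plain function rather than against any particular CDN's worker API (the cookie name, traffic split, and origin URLs are hypothetical):

```javascript
// Sketch of CDN edge routing: pin each visitor to one of two fully
// deployed branches via a cookie, so no client-side rewrite is needed
// and the original page never flashes for test-group users.
function routeRequest(cookies) {
  let branch = cookies['ab-branch'];
  if (branch !== 'a' && branch !== 'b') {
    // New visitor: send 10% of traffic to the test branch.
    branch = Math.random() < 0.1 ? 'b' : 'a';
  }
  const origin = branch === 'b'
    ? 'https://test.example.com'  // hypothetical test deployment
    : 'https://www.example.com';  // hypothetical original deployment
  return { origin, setCookie: `ab-branch=${branch}` };
}

console.log(routeRequest({ 'ab-branch': 'b' }).origin);
```

The cookie keeps returning visitors on the same branch, which is what allows the two versions to be compared cleanly over the life of the experiment.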
One limitation of CDN A/B testing is that it may not provide consistent results for users who are located far away from the edge servers, as latency and network conditions can vary widely depending on the user's location. Additionally, CDN A/B testing may not be suitable for testing certain types of content or functionality, such as e-commerce transactions or user authentication, which may require more advanced testing methods.
In summary, while CDN A/B testing can provide some benefits, such as improved website performance and reduced latency, it is limited in scope and may not be suitable for all types of content or functionality.