It’s becoming obvious that cloud costs can come with a bit of sticker shock. While customers have benefited from pricing competition between the major cloud vendors, namely Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform, there are still plenty of hidden costs and upsells lying in wait for the unsuspecting consumer, making cloud initiatives rather more costly than originally presumed. This is especially true when factoring in relative unknowns like serverless, containers, and Kubernetes services. For the third year in a row, surveys show that optimizing cloud costs remains top of mind for IT managers embarking on major cloud initiatives.
Code optimization can help you save on cloud costs
Along with carefully considering your options with multi-cloud cost calculators, addressing inefficiencies in the code itself can greatly help optimize cloud usage costs. If your apps aren’t properly optimized for the cloud, unforeseen costs can quickly escalate, leading to expensive, over-budget operations.
Let’s dig into an example of how we’ve been using Application Performance Management (APM) here at Riverbed to monitor and optimize our own cloud-hosted SaaS offerings. We spoke to an internal SaaS product team (who will remain anonymous to protect the innocent!) about the ways in which they’ve benefited from adopting an APM-based code audit practice.
SLA: Ensuring 99.99% uptime
This Riverbed SaaS application is a mission-critical multi-tenant system hosted on about 120 AWS servers, with a 99.99% uptime SLA. We currently have more than 10,000 customers, and that number is growing. Together they manage about 150,000 pieces of equipment with the application, which collects thousands of data points every minute from every device. While the system is designed to be extremely cost-efficient, AWS represents a significant portion of our overall operating cost, and keeping cloud costs low gives us a competitive edge.
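To put the 99.99% figure in perspective, the downtime budget it implies can be worked out directly. The arithmetic below is a rough sketch that assumes a 365-day year and a 30-day month, and counts every minute of unavailability against the SLA. The actual SLA terms are not stated here and may measure uptime differently.

```latex
\begin{align*}
\text{allowed downtime per year}  &= (1 - 0.9999) \times 365 \times 24 \times 60\ \text{min} = 52.56\ \text{min} \\
\text{allowed downtime per month} &= (1 - 0.9999) \times 30 \times 24 \times 60\ \text{min} = 4.32\ \text{min}
\end{align*}
```

Under those assumptions, the whole system can be unavailable for less than an hour per year.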
The application has a complex microservices architecture comprising distributed services for authentication, monitoring, polling, processing, etc., written in Java/Spring Boot. A large number of third-party libraries and open-source components are used for caching (Hazelcast), database (Cassandra), messaging (ActiveMQ), and more. The DevOps team has been using Riverbed APM for monitoring, diagnostics, and bookending code changes to support about 8 releases a year. Optimizing every new feature that is developed is a top priority.
According to the engineering team lead, “We were able to do a lot of additional code optimization and see a lot of cool stuff with AppInternals. AppInternals allowed us to find deeply hidden problems that we could not find otherwise.”
How to find and fix inefficiencies
Here are 5 ways our SaaS development team was able to use Riverbed APM to optimize their code-level application behavior and save on cloud costs.
- Employ asynchronous message dispatch: The team noticed a good amount of time was being spent on sending messages with the ActiveMQ message broker. Once it was discovered that messages were blocked waiting to be sent, they were able to adjust the ActiveMQ settings to enable asynchronous dispatch. This greatly improved performance and processing capacity.
- Resolve transaction boundary issues: The team discovered cases where transaction boundaries and Aspect injection points were set inefficiently. These issues were causing increased database resource usage and CPU/memory spikes. Detecting and solving these boundary issues helped them reduce resource usage, lowering CPU usage in the busiest app components by about 30%, and significantly impacting overall cloud usage costs.
- Identify (caught) exceptions: Exception analysis with Riverbed APM was very helpful for the team in finding issues. Without APM, they would not have known about the unusually large number of caught exceptions. As a result, they were able to investigate further and address the reason for the abnormal number of exceptions.
- Cache redundant method calls: A number of code inefficiencies were identified where the exact same call, returning the same result, was made multiple times. Removing these inefficiencies with structural transformations or more effective caching helped them improve performance. Now, every time they see an “innocent” call bubbling to the top, they know that it needs to be investigated.
- Pinpoint issues with third-party libraries: In spite of years of knowledge, usage, and experience with the third-party libraries and services used in the product, the team was able to identify a number of previously unknown issues in these components using Riverbed APM. This visibility was crucial in enabling them to find and fix issues and optimize their application.
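To make the "cache redundant method calls" item above more concrete, here is a minimal sketch of memoizing a repeated call in plain Java. It is not the team's actual code: the `Memoizer` class, the `loadDeviceProfile` method, and the device ID are hypothetical names invented for illustration. The sketch also assumes the wrapped call is pure (same input, same result) and never returns null.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Wraps a pure function so repeated calls with the same argument reuse the first result. */
final class Memoizer<K, V> implements Function<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> delegate;

    Memoizer(Function<K, V> delegate) {
        this.delegate = delegate;
    }

    @Override
    public V apply(K key) {
        // computeIfAbsent invokes the delegate only when the key has no cached value.
        // A null result is not cached, so the delegate should not return null.
        return cache.computeIfAbsent(key, delegate);
    }
}

final class DeviceLookupDemo {
    // Counts how often the expensive lookup really runs.
    static final AtomicInteger expensiveCalls = new AtomicInteger();

    // Hypothetical stand-in for an expensive call such as a database or remote lookup.
    static String loadDeviceProfile(String deviceId) {
        expensiveCalls.incrementAndGet();
        return "profile-for-" + deviceId;
    }

    public static void main(String[] args) {
        Function<String, String> cached = new Memoizer<>(DeviceLookupDemo::loadDeviceProfile);
        // The same "innocent" call is requested 1000 times.
        for (int i = 0; i < 1000; i++) {
            cached.apply("device-42");
        }
        System.out.println("lookups requested: 1000, executed: " + expensiveCalls.get());
    }
}
```

Running `DeviceLookupDemo` prints `lookups requested: 1000, executed: 1`. The cache in this sketch is unbounded and never expires entries, so it only suits results that stay valid for the life of the process. In a Spring Boot service, the same idea is often expressed through the framework's caching support (for example `@Cacheable`) backed by a cache with eviction. Whether the team took that route is not stated here.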
Global public cloud services and infrastructure spending will more than double in the period 2019-2023, from $229 billion in 2019 to nearly $500 billion in 2023, according to the latest report from IDC. SaaS represents half of all current public cloud spending, with IaaS and PaaS growing the fastest.
As cloud continues to occupy a growing share of IT spending, we recommend a thorough code audit during the testing and development phase of the application lifecycle to make sure your code is optimized for the cloud. With a big data approach, Riverbed APM enables you to perform a global application analysis in a way that is not possible with incomplete data sets based on sampled transactions (see Why a Big Data Approach is Key for APM). Clear insight into conversion paths and usage trends for your application can help you pinpoint the end-to-end transaction paths that are prime candidates for optimization. By exposing inefficiencies in the most used features, Riverbed APM can help you ensure new releases don’t introduce unnecessary risk and cost to your cloud application service.