With Aternity Digital Experience Management (DEM), no-sampling transaction tracing maximizes visibility into enterprise applications to pinpoint issues and minimize incident downtime. The IT team at one of our pharmaceutical customers used Aternity to track down the legacy code behind a crippling performance degradation, which compromised a key patient service and jeopardized more than $1M in revenue. The resulting P1 incident was root-caused, tested and hot-fixed within one week.
Incident impact on patient appointments and revenue
Aternity has partnered with leaders in the pharmaceutical industry for years to enhance their ability to serve their customers. With patient care as its north star, the organization described here depends on Aternity to assure the performance of a critical Portal application that allows users to manage their profiles and schedule in-person appointments. Each successful appointment shapes patient experience and impacts operational cost, reputation and revenue. Therefore, the number of successfully scheduled appointments is considered a key operational performance indicator.
Scheduling an appointment via the Portal requires multiple steps:
When some of these steps take minutes instead of seconds to respond, it doesn’t take long for users to abandon their attempt to book an appointment. It makes sense: to most users, this isn’t slowness; it’s unresponsiveness. Operational disruptions that result in poor digital performance can impact customer care and cause catastrophic business results.
With Aternity, from initial assessment and identification, to QA bugfix testing, to production deployment of the fix, this critical incident took five days to resolve. The first day after the fix was applied, there was a surplus in scheduled appointments, and operations then returned to typical levels.
The larger question remains: what caused this $1M incident?
The challenges of legacy code
In business-critical enterprise applications, application code can age rapidly. Design objectives, bugfix priorities and new feature releases continuously shift, driven by the business. A development cycle that keeps up with the demands of a rapidly changing market or user base can, at times, leave defunct code untouched, like a ghost with some serious unfinished business. And as ghosts tend to do, they can come back to haunt your application’s performance (and your revenue!) when you least expect it.
Chasing a shifting bottleneck
The Portal application’s critical response time degradation was flagged and triaged from minute one via Aternity transaction metrics and alerting:
Aternity showed that database performance was causing the poor application response time and the resulting drop in scheduled appointments. After extracting SQL performance detail from Aternity, the company’s DBA found that some database temp tables had grown substantially and needed cleanup. After a heavy overnight database optimization effort, on the next business day…
Performance is even worse! All that DB optimization effort, and the application still faces a critical performance impact. Notably, the bottleneck has shifted – now it’s the application code that is performing poorly.
Hunting down the ghosts
With a clear application code bottleneck (and lost revenue mounting hourly), Aternity shined a light on the code and exposed those hidden ghosts. Namely, it identified the bottleneck application methods in one click:
Capturing all transactions enabled Aternity to show a clear queuing pattern in the response time of the pharmaceutical company’s “Portal – Appointments” requests, and tie it specifically to business-hours load:
When looking at details for each individual method call, the application just waits – no actual work is being performed:
One look at this data by the application’s developer helped identify a series of application methods that use the “synchronized” attribute:
Methods had been synchronized for this application a long time ago (possibly years prior), when data conflicts by users accessing the same database records were a concern. However, some time afterwards, database tables had been re-organized for scalability, and now data conflicts were extremely improbable, making method synchronization an unnecessary legacy feature. Indeed, the developer commented he was confident that synchronization could make users “wait in line” for no reason at all, causing the queuing patterns we see above.
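To make the "wait in line" effect concrete, here is a minimal sketch rather than the customer's actual code. The source does not name the application's language, so Java is an assumption, and `SynchronizedQueueDemo`, `LegacyService`, `HotfixedService` and the 100 ms sleep are all hypothetical stand-ins. The sketch counts how many callers are inside the booking method at the same moment, with and without synchronization.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class SynchronizedQueueDemo {

    /** Hypothetical stand-in for a Portal service; tracks concurrent callers. */
    static abstract class AppointmentService {
        private final AtomicInteger inFlight = new AtomicInteger();
        private final AtomicInteger maxInFlight = new AtomicInteger();

        abstract void book() throws InterruptedException;

        /** Simulated request work: record concurrency, then sleep 100 ms. */
        protected void doWork() throws InterruptedException {
            int now = inFlight.incrementAndGet();
            maxInFlight.accumulateAndGet(now, Math::max);
            try {
                Thread.sleep(100); // assumption: stands in for the real work
            } finally {
                inFlight.decrementAndGet();
            }
        }

        int maxInFlight() {
            return maxInFlight.get();
        }
    }

    /** Legacy shape: one lock per service instance, so callers queue up. */
    static class LegacyService extends AppointmentService {
        @Override
        synchronized void book() throws InterruptedException {
            doWork();
        }
    }

    /** Hotfixed shape: same body, synchronization removed. */
    static class HotfixedService extends AppointmentService {
        @Override
        void book() throws InterruptedException {
            doWork();
        }
    }

    /** Starts one thread per user and returns the peak number of concurrent callers. */
    static int runConcurrently(AppointmentService service, int users)
            throws InterruptedException {
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < users; i++) {
            Thread t = new Thread(() -> {
                try {
                    service.book();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return service.maxInFlight();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("legacy peak concurrency:   "
                + runConcurrently(new LegacyService(), 4));
        System.out.println("hotfixed peak concurrency: "
                + runConcurrently(new HotfixedService(), 4));
    }
}
```

In the synchronized variant only one caller can be inside `book()` at a time, so every other thread sits blocked on the lock without doing any work, which is the kind of pure waiting a method-level trace would show. Removing the keyword is only safe when, as the developer concluded here, the shared state it once protected is no longer at risk of conflicting access.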
After a day of testing the code change (which came down to simply removing all “synchronized” attributes), a hotfix is introduced and…
And what about appointments scheduled? Back to normal!
Empower your development cycle and eliminate legacy code issues
With Aternity visibility, identifying critical application bottlenecks is one click away. And since Aternity includes ALL user transactions (it does not sample), it provides a complete picture of performance to application owners and developers, giving SMEs the keys to solving issues quickly. In this case, being able to identify a clear queuing pattern in the slowest transactions was key to identifying the offending method synchronization and addressing the main incident with the first code change.
Start solving YOUR $1M incidents
If you’re not yet an Aternity customer, you can explore how companies around the world assure the reliability of their services with digital experience management that captures all transactions for all users. You can get started today by registering for a free trial of Aternity running in your environment. You’ll see how your organization compares to the market with benchmarking insights from millions of endpoints monitored via Aternity SaaS. You’ll see how your Service Desk can drive down costs and improve service with AI-driven automated remediation. And you’ll get a view of employee experience for every app running in your environment, even SaaS and Shadow IT.