ReadWriteWeb recently posted an article titled "Major Internet Incidents and Outages of 2010," which describes the ten biggest Internet and website outages of the past year. The list of companies with major meltdowns was alarmingly long and included heavyweights such as Wikipedia, Google, Twitter, Facebook, and Tumblr.
What's even more troubling is that for every reported blackout, there are certainly hundreds or even thousands of brownouts. Blackouts make themselves known immediately and are typically covered by a service level agreement (SLA) guaranteeing network availability. Brownouts, by contrast, are performance degradations that end users can perceive even though the service is still technically up. They are much more elusive, and because a traditional SLA measures availability rather than performance, it does not capture them. To make matters worse, network performance management (NPM) tools, which should enable IT staff to recognize and address network brownouts, have not yet evolved to meet this challenge.
With higher expectations and increased competition, it's critical that companies deliver a great experience to their end users through consistent network performance and quick resolution of any problems that occur. The focus of network performance management must shift to accurately tracking success versus failure, as well as true application-transaction processing time.
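As a rough illustration of tracking success versus failure alongside transaction processing time, the sketch below classifies one measurement window of transactions. The `Transaction` record, the `classify_window` function, and both thresholds are hypothetical names and values chosen for this example; they do not describe any particular product.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Transaction:
    succeeded: bool      # did the application transaction complete?
    duration_ms: float   # processing time observed for the transaction


def classify_window(transactions, max_error_rate=0.01, max_median_ms=500.0):
    """Classify one measurement window as 'ok', 'brownout', 'blackout', or 'no-data'.

    The thresholds are illustrative assumptions, not recommended values.
    """
    if not transactions:
        return "no-data"  # no traffic observed; cannot judge either way

    successes = [t for t in transactions if t.succeeded]
    if not successes:
        return "blackout"  # nothing is getting through

    error_rate = 1 - len(successes) / len(transactions)
    median_ms = median(t.duration_ms for t in successes)

    # The service is up, but users see failures or slow responses.
    if error_rate > max_error_rate or median_ms > max_median_ms:
        return "brownout"
    return "ok"
```

A simple up-versus-down check would report the "brownout" case as healthy, because at least some transactions still succeed.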
Traditional NPM tools fall short in identifying and managing brownouts because they lack the application visibility needed to spot them. These tools also do little to speed up mean time to resolution (MTTR) because they are reactive and don't look across all tiers: application, network, storage, database, and web. Finally, they tend to flood IT staff with alarms that do not correlate to end-user issues, which makes it hard for IT operations staff to prioritize effectively. Ultimately, these inadequacies damage the company's brand and reputation.
A new generation of application-aware network performance management (AANPM) technology is required to help manage brownouts. These solutions must be proactive, integrated, and intelligent, and they must provide comprehensive network and application visibility in real time. They also must provide insight into how each network element affects other elements, even across silos. And they need to help reduce MTTR, which involves baselining performance, understanding the norm, and proactively alerting when performance deviates from it. Furthermore, with dynamic and virtualized environments, it's important that these systems continuously auto-discover new devices and adjust to changing network and system elements, something agents can't do. Network and application performance is tied to revenue generation and to the SLAs associated with it, making these advancements critical to the success of a business.
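To make "baselining performance, understanding the norm, and proactively alerting" concrete, here is a minimal sketch that learns a rolling baseline of a metric such as response time and flags values well above it. The `BaselineAlerter` class, its parameters, and the three-standard-deviation rule are assumptions made for illustration; real AANPM products may use quite different methods.

```python
from collections import deque
from statistics import mean, stdev


class BaselineAlerter:
    """Flag observations that rise far above a rolling baseline (illustrative only)."""

    def __init__(self, window=60, min_samples=10, sigmas=3.0):
        self.history = deque(maxlen=window)  # most recent observations
        self.min_samples = min_samples       # samples needed before alerting
        self.sigmas = sigmas                 # allowed deviation above the norm

    def observe(self, value):
        """Return True if value is abnormally high, then record it."""
        alert = False
        if len(self.history) >= self.min_samples:
            baseline = mean(self.history)    # the learned "norm"
            spread = stdev(self.history)     # typical variation around it
            alert = value > baseline + self.sigmas * spread
        # Every value is recorded, so a sustained shift eventually becomes
        # the new baseline; a production system might handle this differently.
        self.history.append(value)
        return alert


alerter = BaselineAlerter(min_samples=5)
for response_ms in [100, 102, 98, 101, 99, 100]:
    alerter.observe(response_ms)      # normal traffic builds the baseline
is_anomaly = alerter.observe(250)     # a sudden slowdown stands out
```

Because the threshold is derived from observed behavior rather than fixed in advance, the same code adapts to applications whose normal response times differ widely.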
That's why it's no longer enough to monitor the up-versus-down status of thousands of discrete elements; you need a top-down view to monitor the performance of your applications in real time. With this understanding, we built the application-aware ExtraHop Application Delivery Assurance system, which provides trend-based alerting and proactive early warning. The ExtraHop system changes the discussion from "What happened?" to "What's happening?" and helps ensure that your business-critical transactions do not fail.
If you want to take the ExtraHop system for a spin, try out our free web service at NetworkTimeout.com. Then go check out our case studies to learn how our customers are benefiting from the ExtraHop solution every day. We'd love to hear your story in the comments section.
This is a companion discussion topic for the original entry at http://www.extrahop.com/post/blog/extrahop-analysis/application-performance-management-blackouts-brownouts/