When a user says "It's slow. Sometimes."



I recently ran into a situation where a user was running a desktop app that regularly accessed a set of web server-based resources over HTTP, many of them backed by a database. The user was experiencing intermittent slow performance. I thought it might be helpful to share a couple of steps that helped isolate the problem.

First, the user was on a desktop system across the network from where ExtraHop monitoring was taking place. To better measure their experience, we created a /32 “remote network” device for their IP address (this was a fixed desktop IP) so that we could see more detail about how they were interacting with the services in the monitored infrastructure.
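For anyone less used to CIDR notation, here is a minimal sketch of why a /32 singles out exactly one machine. The address is a made-up documentation-range IP, not the user's real one, and this only illustrates the prefix math with Python's standard `ipaddress` module; it is not how the remote network is configured in the ExtraHop itself.

```python
import ipaddress

# Hypothetical fixed desktop IP (documentation range), assumed for illustration only.
desktop_ip = ipaddress.ip_address("203.0.113.45")

# A /32 prefix covers all 32 bits of an IPv4 address, so the "network" is one host.
remote_network = ipaddress.ip_network("203.0.113.45/32")

print(remote_network.num_addresses)                             # size of the network
print(desktop_ip in remote_network)                             # the desktop itself
print(ipaddress.ip_address("203.0.113.46") in remote_network)   # its neighbor
```

Because the prefix matches only that one address, the traffic attributed to the remote network device is this user's traffic and nothing else.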

With that remote network device in place in the ExtraHop, we could now focus on the desktop system, starting with their overall HTTP processing time.

We first confirmed that the user’s shift into a period of slow performance correlated with an upward shift in overall HTTP processing time (Device -> HTTP -> Client -> Response Time Breakdown). It seems obvious, but it’s important to explicitly confirm with the user that a shift like that isn’t explained by some change in their usage pattern.

Now, to compare “before” and “after” the upward shift, we opened a couple of tabs to the same device view and set the time selectors to windows before and after the shift. Then to break down their traffic, we clicked on the triple-bar graph in each tab:

Then we clicked over to the “By URI” view.

Comparing the mean HTTP processing time for each URI between tabs highlights exactly where (which URIs) the extra processing time is coming from.

Yes, I’m a little sheepish about busting out an animated GIF. But only a little bit. :) And the red-green colorizing is not an ExtraHop feature you haven’t discovered, just my added effect for emphasis.
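If you would rather not eyeball two tabs, the same before/after comparison can be sketched in a few lines of Python. This is only an illustration under assumptions: the URIs and timings are invented, the `(uri, processing_time_ms)` sample format is hypothetical, and it presumes you have copied the per-URI numbers out of the two time windows by hand rather than pulled them from any particular ExtraHop API.

```python
from statistics import mean

def mean_by_uri(samples):
    """Group (uri, processing_time_ms) samples and return the mean per URI."""
    grouped = {}
    for uri, ms in samples:
        grouped.setdefault(uri, []).append(ms)
    return {uri: mean(values) for uri, values in grouped.items()}

def compare_windows(before, after):
    """Return (uri, mean_before, mean_after, delta) rows, largest increase first."""
    b = mean_by_uri(before)
    a = mean_by_uri(after)
    rows = [(uri, b[uri], a[uri], a[uri] - b[uri]) for uri in sorted(set(b) & set(a))]
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows

# Invented sample data standing in for the "before" and "after" tabs.
before = [("/api/orders", 40), ("/api/orders", 60),
          ("/static/app.js", 5),
          ("/api/search", 100), ("/api/search", 120)]
after = [("/api/orders", 45), ("/api/orders", 55),
         ("/static/app.js", 5),
         ("/api/search", 400), ("/api/search", 440)]

rows = compare_windows(before, after)
for uri, mean_before, mean_after, delta in rows:
    print(f"{uri:20} {mean_before:8.1f} {mean_after:8.1f} {delta:+8.1f}")
```

Sorting by the change in mean puts the URIs that account for the slowdown at the top, while URIs whose times did not move drop to the bottom and can be ruled out.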

Bottom line: focusing on the specific URIs whose response times got significantly longer, and matching them to application data flows, helps isolate where to spend (and where not to spend) application debugging cycles to get this user’s performance back up to speed.