One of the things I enjoy the most about being a Software Test Engineer is finding inspiration for bug-hunting in unexpected places. Recently, I was reading Stephen Bergman's 1978 book "The House of God" (published under the pen name Samuel Shem), in which a medical intern bucks the trend of continuously whisking patients from department to department, and instead lets the patients rest in one place for several days. The epiphany is that, given some time to rest in place, patients will either heal faster or develop symptoms that are easier to diagnose.
This gave me the idea to try just leaving a program running idle over several days (such as over a weekend) and record information about it at the start and end. By doing this, we can see when resource consumption continuously rises over time, a type of application bug known as a resource leak. Resource leaks are dangerous as they can cause programs to stop working, or work unpredictably over time, and they can also prevent multiple programs from running concurrently.
Here's an example of a memory leak (from Wikipedia):

When a button is pressed:
    Get some memory, which will be used to remember the floor number
    Put the floor number into the memory
    Are we already on the target floor?
        If so, we have nothing to do: finished
        Otherwise:
            Wait until the lift is idle
            Go to the required floor
            Release the memory we used to remember the floor number
The memory leak would occur if the floor number requested is the same floor that the lift is on; the condition for releasing the memory would be skipped. Each time this case occurs, more memory is leaked.
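The pseudocode above can be mirrored in a small Python sketch. Python frees memory automatically, so the sketch stands in for manual allocation with a list, `held_requests`, that has to be cleaned up explicitly. The class, method, and attribute names are illustrative assumptions, not taken from any real lift controller.

```python
class LeakyLift:
    """Toy lift controller that mirrors the pseudocode, including its bug."""

    def __init__(self, current_floor=0):
        self.current_floor = current_floor
        # Stands in for manually allocated memory: every entry must be
        # removed again once the request has been handled.
        self.held_requests = []

    def press_button(self, floor):
        request = {"floor": floor}           # "get some memory"
        self.held_requests.append(request)   # "put the floor number into the memory"
        if floor == self.current_floor:
            return                           # BUG: finishes without releasing the request
        self.current_floor = floor           # "go to the required floor"
        self.held_requests.remove(request)   # "release the memory"


lift = LeakyLift(current_floor=3)
for _ in range(1000):
    lift.press_button(3)   # already on floor 3, so the leaky path runs every time
lift.press_button(5)       # a normal trip releases its own request
leaked = len(lift.held_requests)
print(leaked)              # 1000 requests were never released
```

One way to fix the sketch is to check the floor before acquiring anything, or to release the request in a `try`/`finally` block so that every exit path cleans up.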
During the rapid churn of development, it is common to reset the state of the program before resource leaks can be detected. When resource usage is graphed over time, this produces a sawtooth pattern: each sharp drop marks a restart of the program, and the gradual rise between drops indicates a leak.
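If you already have usage samples from a period with frequent restarts, the sawtooth can still be read programmatically. The following is a minimal sketch, assuming a restart shows up as a drop of more than 25% between consecutive samples; the function names, the threshold, and the sample values are all made up for illustration.

```python
def split_at_restarts(samples, drop_fraction=0.25):
    """Split samples into runs, starting a new run at each sharp drop."""
    if not samples:
        return []
    runs = [[samples[0]]]
    for prev, cur in zip(samples, samples[1:]):
        if cur < prev * (1 - drop_fraction):
            runs.append([cur])       # sharp drop: assume the program restarted
        else:
            runs[-1].append(cur)     # same run: usage held steady or rose
    return runs


def growth_per_run(samples, drop_fraction=0.25):
    """Return how much usage grew between the start and end of each run."""
    return [run[-1] - run[0] for run in split_at_restarts(samples, drop_fraction)]


# Hypothetical memory readings in MB, covering two restarts.
samples_mb = [100, 120, 140, 160, 100, 125, 150, 100, 130]
growth = growth_per_run(samples_mb)
print(growth)   # [60, 50, 30]
```

Growth that reappears in every run, as in these invented numbers, is the pattern that suggests a leak rather than a one-off spike.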
To diagnose resource leaks, don't restart the application too often. Instead, let it rest for a few days. Take detailed readings of memory, CPU, and disk utilization before walking away from the machine, and then compare the numbers after several days at rest. Of course, you may find it very beneficial to keep track of utilization over the course of the weekend, but what you are looking for is confirmation that when you come back, your program is using the same resources as when you left.
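The before-and-after comparison is simple enough to script. The sketch below only covers the comparison step and assumes you have already collected the readings with whatever monitoring tool you use; the metric names, the 5% tolerance, and the numbers themselves are hypothetical.

```python
def compare_readings(before, after, tolerance=0.05):
    """Return the metrics that grew by more than `tolerance` (a fraction)."""
    grown = {}
    for metric, start in before.items():
        end = after[metric]
        if end > start * (1 + tolerance):
            grown[metric] = (start, end)
    return grown


# Hypothetical readings taken before and after an idle weekend.
friday = {"memory_mb": 212.0, "cpu_percent": 1.5, "disk_mb": 840.0}
monday = {"memory_mb": 655.0, "cpu_percent": 1.5, "disk_mb": 842.0}

suspects = compare_readings(friday, monday)
print(suspects)   # {'memory_mb': (212.0, 655.0)}
```

The tolerance is there so that small, expected fluctuations (such as a log file growing by a few megabytes) do not get flagged alongside a genuine leak.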
To make the software more testable, add logging that records when resources are acquired or released. (This does not necessarily have to be user-facing.) There are also many tools available on multiple platforms for monitoring resource consumption from the outside. For Windows, the Windows Sysinternals toolkit is fantastic. For Unix, htop and qps are also excellent.
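As a rough illustration of that kind of logging, here is a minimal sketch built on Python's standard `logging` and `contextlib` modules. The `tracked` helper, the `open_counts` dictionary, and the `"db_connection"` label are assumptions made up for this example; a real application would wrap its own acquire and release calls.

```python
import logging
from contextlib import contextmanager

# Pass filename="resources.log" to basicConfig to write to a log file instead.
logging.basicConfig(level=logging.DEBUG, format="%(asctime)s %(message)s")
log = logging.getLogger("resources")

open_counts = {}   # how many resources of each kind are currently held


@contextmanager
def tracked(kind):
    """Log when a resource of the given kind is acquired and released."""
    open_counts[kind] = open_counts.get(kind, 0) + 1
    log.debug("acquired %s (open: %d)", kind, open_counts[kind])
    try:
        yield
    finally:
        # Runs on every exit path, so the count cannot drift upward.
        open_counts[kind] -= 1
        log.debug("released %s (open: %d)", kind, open_counts[kind])


with tracked("db_connection"):
    with tracked("db_connection"):
        peak = open_counts["db_connection"]
remaining = open_counts["db_connection"]
print(peak, remaining)   # 2 0
```

After a few days at rest, an "open" count in the log that keeps climbing points directly at the kind of resource that is leaking.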
Resource Leaks Outside Your Application Code
Most testers also need to be able to detect resource leaks outside of the application code. How does that work, you ask? If you think about the entire set of components required to deliver an application, what some call the "application delivery chain," then certain issues cause instability and poor performance the same way that a resource leak in code does.
For example, if some servers are configured for IPv6 instead of IPv4, that can cause DNS issues and TCP timeouts, and add seconds of latency from an end user's perspective. Alternatively, poorly written queries can lead to poor database performance.
To identify and fix these "resource leaks" in your application delivery chain, you will want to look at a solution like ExtraHop. When analyzing traffic in a test environment, the ExtraHop platform can uncover inefficiencies or misconfigurations that are not covered by software testing tools but would cause problems in production.
This is a companion discussion topic for the original entry at https://www.extrahop.com/community/blog/2015/quiet-time-the-secret-to-finding-resource-leaks/