Mind maps can be fantastic tools for analyzing performance bottlenecks. Done correctly, they not only help identify bottlenecks faster but also help document analysis tips for the rest of the team and for later use. The mind map below is an example of root cause analysis when page load time is observed to be high.
Most of the time, when the response time of a web request is found to be high, the performance tester starts looking for the fault on the server side. However, the issue does not always lie on the server side or on the database server; it can exist anywhere in the delivery chain. I usually follow the order below when identifying the cause of high response time.
Step 1: Issue with Tool / Scripts
Before checking anywhere else, it is advisable to validate the scripts and tools being used for performance testing.
» Does the reported response time unknowingly include think time?
» Is there redundant logic (e.g. client side functionality) inside the transaction for which response time is reported?
» Which tool is being used for load generation? What logic does it use to calculate response time?
» Is extended logging enabled? Extended logging will skew the response time.
» Has the browser (number of parallel connections) been emulated realistically?
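As a rough illustration of the think time check, the sketch below keeps user pauses outside the measured interval. `TransactionTimer` and `run_transaction` are hypothetical names and not part of any load testing tool; most tools have their own setting for this, so treat it only as a model of what the reported number should contain.

```python
import time


class TransactionTimer:
    """Accumulates time spent in measured calls only (hypothetical helper)."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock          # injectable so the timer can be tested
        self.response_time = 0.0     # seconds spent inside measured requests

    def measure(self, request_fn):
        """Run request_fn and add only its duration to response_time."""
        start = self._clock()
        try:
            return request_fn()
        finally:
            self.response_time += self._clock() - start


def run_transaction(timer, steps, think_time=0.0, sleep=time.sleep):
    """Run each step under the timer; pause for think time outside the timer."""
    for step in steps:
        timer.measure(step)
        sleep(think_time)            # user pause: deliberately NOT measured
    return timer.response_time
```

If the reported response time is close to the sum of request durations plus the configured pauses, think time is probably being counted inside the transaction.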
Step 2: Issue with Infrastructure where load generators are installed
It is important to monitor hardware performance on the machines where load generators are installed. Reported response times may be inaccurate when low-configuration machines are used.
» Are hardware resources (CPU, Memory, I/O) overly stressed?
» Are there issues (Latency, Packet Loss and Bandwidth) with the network?
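A minimal sketch of the hardware check, assuming utilization percentages have already been collected on the load generator by whatever monitor is in use. The function name and the threshold values are my own assumptions, not figures from any tool; tune them for your environment.

```python
from statistics import mean

# Assumed thresholds (percent); tune them for your own load generators.
DEFAULT_LIMITS = {"cpu": 80.0, "memory": 85.0, "disk_io": 80.0}


def overloaded_resources(samples, limits=DEFAULT_LIMITS):
    """Return {resource: average} where mean utilization exceeds its limit.

    samples maps a resource name to utilization percentages collected
    on the load generator during the test run.
    """
    flagged = {}
    for resource, values in samples.items():
        limit = limits.get(resource)
        if limit is None or not values:
            continue                  # no threshold or no data: nothing to judge
        average = mean(values)
        if average > limit:
            flagged[resource] = average
    return flagged
```

If this returns anything for the load generator itself, I would distrust the response times from that run before looking at the servers.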
Step 3: Issue with client side logic and web page resources
Response time can often be improved by optimizing the client side logic first, before tweaking the server side logic.
» Are images being scaled unnecessarily? Avoiding scaling of images in HTML reduces bandwidth usage and round trip time.
» Is a CDN being used for serving static resources? A CDN helps in serving static resources faster.
» Is JS / CSS blocking the loading of other resources? Loading of JS / CSS can be deferred to unblock other resources.
» Is unused code also being downloaded? Apart from hurting performance, it is a security risk as well.
» Are many third party ads being loaded? Is a third party ad causing slow response time?
» Is caching being used properly? Caching helps in serving the requests faster.
» Is a cookie-less domain being used for static resources? It helps in optimizing the bandwidth.
» Are images being loaded in parallel? How many parallel connections are being used?
» Are images optimized for better serving and bandwidth?
» Are CSS sprites being used for loading logos?
» Are resources being compressed before serving?
» Have redundant code and whitespace been removed before serving JS / CSS files?
» Are view states being used properly?
» Is it possible to reduce the number of requests by validating content on the client side rather than on the server side?
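Several of the checks above (caching, compression, cookie-less domains) can be read straight from the response headers of a static resource. The sketch below is one hedged way to do that. `audit_static_response` is a hypothetical helper, and the list of content types worth compressing is an assumed, partial one.

```python
# Content types that usually benefit from compression (an assumed, partial list).
TEXT_TYPES = ("text/", "application/javascript", "application/json", "image/svg+xml")
COMPRESSED = ("gzip", "br", "deflate", "zstd")


def audit_static_response(headers):
    """Return a list of findings for one static resource's response headers."""
    h = {name.lower(): value for name, value in headers.items()}
    findings = []

    # No Cache-Control or Expires header: caching is left to browser heuristics.
    if "cache-control" not in h and "expires" not in h:
        findings.append("no caching headers")

    # Text resources (JS, CSS, HTML) should normally be served compressed.
    content_type = h.get("content-type", "").lower()
    encoding = h.get("content-encoding", "").lower()
    if content_type.startswith(TEXT_TYPES) and encoding not in COMPRESSED:
        findings.append("not compressed")

    # Cookies on static resources add bytes to every request and response.
    if "set-cookie" in h:
        findings.append("sets a cookie")

    return findings
```

Images such as PNG or JPEG are skipped by the compression check on purpose, since they are already compressed formats.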
Step 4: Issue with load balancer
» Is the load balancing algorithm working as expected? Are requests being distributed evenly across active servers?
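One simple way to check distribution is to compare per-server request counts (taken from access logs or load balancer statistics) against an even share. The sketch below assumes an algorithm that is meant to spread requests evenly, such as plain round robin. With weighted or least-connections algorithms an uneven split can be correct. The function name and the 10% tolerance are my assumptions.

```python
def uneven_servers(requests_per_server, tolerance_pct=10.0):
    """Return {server: % deviation from an even share} for servers outside tolerance."""
    if not requests_per_server:
        return {}
    total = sum(requests_per_server.values())
    if total == 0:
        return {}                     # no traffic recorded, nothing to compare

    expected = total / len(requests_per_server)
    flagged = {}
    for server, count in requests_per_server.items():
        deviation = (count - expected) / expected * 100.0
        if abs(deviation) > tolerance_pct:
            flagged[server] = round(deviation, 1)
    return flagged
```

A server that shows a large negative deviation is worth checking for failed health checks or sticky sessions pinning users elsewhere.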
Step 5: Issue with Web Server
» Does the hardware (CPU, Memory, I/O) of the web server need to be improved?
» Is caching being utilized properly on the web server?
» How are threads being utilized? Are enough threads allocated in the thread pool?
» Are there requests waiting for external resources (database, disk, payment gateway etc.)?
» Is the web server configured for the required load, e.g. thread pool size and number of concurrent users allowed?
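For the thread pool questions, Little's law gives a rough first estimate: the average number of requests in progress equals the arrival rate multiplied by the average time each request spends in the server, including time spent waiting on external resources. The sketch below applies it. The function names and the `headroom` factor are my own assumptions, and the result is a starting point for tuning, not a final setting.

```python
import math


def required_threads(requests_per_second, avg_time_in_server_s, headroom=1.0):
    """Estimate busy threads via Little's law: L = arrival rate * time in system."""
    if requests_per_second < 0 or avg_time_in_server_s < 0 or headroom < 1.0:
        raise ValueError("rates and times must be >= 0 and headroom >= 1.0")
    return math.ceil(requests_per_second * avg_time_in_server_s * headroom)


def pool_is_sufficient(pool_size, requests_per_second, avg_time_in_server_s, headroom=1.0):
    """True if the configured pool covers the estimated concurrent requests."""
    return pool_size >= required_threads(requests_per_second, avg_time_in_server_s, headroom)
```

For example, at 200 requests per second with each request holding a thread for 0.25 seconds, about 50 threads are busy on average, so a pool of 40 would leave requests queuing.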
Step 6: Issue with Application Server
» Does the hardware (CPU, Memory, I/O) of the application server need to be improved?
» Is the JVM configured properly for heap size, thread model, GC scheme and JVM version?
» Is the garbage collector working efficiently? Can it be optimized further by sizing the generations and selecting a proper collector?
» Is caching being utilized properly on the application server? Connection cache? File cache?
» Is the thread pool configured properly? Maximum threads? Queue length?
» Are sessions configured properly, e.g. session timeout?
» Are many exception stack traces found in the debug log file?
» Are asynchronous calls being handled optimally?
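To answer the stack trace question quickly, I find it useful to count exceptions by type rather than read the log by eye. The sketch below assumes Java-style traces where the exception class name starts its own line (optionally after "Caused by: "). Real log layouts differ, so the pattern and the `count_exceptions` name are assumptions to adapt.

```python
import re
from collections import Counter

# Matches a fully qualified class ending in Exception or Error at line start,
# e.g. "java.lang.NullPointerException: msg" or "Caused by: java.sql.SQLException".
EXCEPTION_LINE = re.compile(
    r"^(?:Caused by: )?((?:[\w$]+\.)+[\w$]*(?:Exception|Error))(?=[:\s]|$)",
    re.MULTILINE,
)


def count_exceptions(log_text):
    """Return a Counter of exception class names found in log_text."""
    return Counter(match.group(1) for match in EXCEPTION_LINE.finditer(log_text))
```

Calling `count_exceptions(text).most_common(5)` on a log captured during the load test shows which exceptions dominate and are worth investigating first.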
Step 7: Issue with Database Server
» Does the hardware (CPU, Memory, I/O) of the database server need to be improved?
» Are indexes being used properly?
» Are there expensive queries that can be tweaked / improved?
» Are there blocking issues due to locks?
» Are indexes heavily fragmented? Is database in need of defragmentation?
» Is it a replication intensive environment? Is replication causing slowness?
» Have database statistics been updated? When were the statistics last refreshed?
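The index question is usually answered by asking the database for its query plan. The sketch below uses SQLite, which ships with Python, purely for illustration. The `orders` table, the index name and the helper functions are made up, and other databases have their own EXPLAIN syntax and output.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the planner's detail strings for sql (SQLite's EXPLAIN QUERY PLAN)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]   # column 3 holds the human-readable detail


def demo():
    """Show the plan for the same query before and after adding an index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    sql = "SELECT total FROM orders WHERE customer_id = ?"

    before = query_plan(conn, sql, (42,))     # expected: a full table scan
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    after = query_plan(conn, sql, (42,))      # expected: a search using the index

    conn.close()
    return before, after
```

A plan that still reports a scan on a large table after the index exists is a hint that the query is written in a way the index cannot serve, or that statistics are stale.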