.NET Framework 4 - DebugDiag tutorial ~ Java EE Support Patterns

10.29.2011

.NET Framework 4 - DebugDiag tutorial

This case study describes the complete root cause analysis and resolution of a Thread hang problem affecting an ASP.NET 4 Web application.

As a Java Thread Dump analysis expert, my first natural question when approaching this problem was: how can I generate a .NET Thread Dump in order to pinpoint our slowdown condition? The answer came from a Windows process dump file analyzed with Microsoft's Debug Diagnostic Tool 1.2.

This article demonstrates how you can take advantage of the Microsoft Debug Diagnostic Tool 1.2 to investigate and troubleshoot Thread hang problems. Readers familiar with Java EE will notice similarities with Java VM Thread Dump and Heap Dump analysis.

Environment specifications (case study)

-         Web and App server: ASP.NET 4 (.NET Framework 4) & IIS 7
-         OS: Windows Server 2008 Service Pack 2 - 64-bit
-         Platform type: Web application

Monitoring and Troubleshooting tools

** DebugDiag 1.2 can be downloaded for free from Microsoft **

-         Compuware Server Vantage (IIS connection monitoring)
-         Debug Diagnostic Tool 1.2

Problem overview

-         Problem type: Major slowdown and hanging Threads were observed on the .NET Framework 4 servers, affecting the overall Web application response time

This problem was observed especially under high load, along with an increase in the active IIS connection count.

Gathering and validation of facts

Similar to Java EE problems, ASP.NET 4 and IIS performance problems require gathering technical and non-technical facts so that we can derive other facts and/or conclude on the root cause. The facts below were verified before applying a corrective measure:

·        What is the client impact? HIGH; major Web application slowdown
·        Recent change of the affected platform? No
·        Any recent traffic increase to the affected platform? No
·        What is the health of the IIS server? Overall IIS health is OK but an increase in active IIS connections is observed, ultimately causing performance degradation
·        Did a restart of the IIS and application pool resolve the problem? No, this was used as a mitigation strategy only; the problem reoccurs quickly after an IIS and application pool recycle

-         Conclusion #1: The problem appears to be isolated to the .NET Framework 4 application layer and caused by a not yet identified .NET Thread contention
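
The facts above fit together through Little's Law: in a stable system, the average number of in-flight requests equals the arrival rate multiplied by the average response time. With no traffic increase, a growing active connection count therefore points to longer response times inside the application. Below is a minimal sketch in Python; the figures are hypothetical and are not measurements from this case study.

```python
def in_flight_requests(arrival_rate_per_sec: float, avg_response_sec: float) -> float:
    """Little's Law: L = lambda * W (average concurrency in a stable system)."""
    return arrival_rate_per_sec * avg_response_sec


# Hypothetical figures for illustration only (not taken from the case study).
RATE = 50.0  # requests per second, identical in both scenarios

healthy = in_flight_requests(RATE, 0.2)   # 200 ms average response time
degraded = in_flight_requests(RATE, 4.0)  # 4 s average response time while Threads hang

print(f"healthy:  {healthy:.0f} active requests")
print(f"degraded: {degraded:.0f} active requests")
```

The traffic is the same in both scenarios; only the response time changes, and the active request count grows in proportion to it.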

Debug Diagnostic Tool overview

The Debug Diagnostic Tool (DebugDiag) is designed to assist in troubleshooting issues such as hangs, slow performance, memory leaks or fragmentation, and crashes in any user-mode process. The tool includes additional debugging scripts focused on Internet Information Services (IIS) applications, web data access components, COM+ and related Microsoft technologies.

This is the tool that our team used to successfully pinpoint the source of the slowdown and Thread hang condition.

Step #1 – Create a dump file of your affected w3wp process

In order to troubleshoot .NET Thread hang problems, you first need to generate a dump file of the w3wp process of each affected .NET application pool. This can be done by following the steps below:

-         Start Windows Task Manager by right-clicking an empty area of the task bar and then clicking Start Task Manager.
-         Select the Processes tab.
-         Identify the affected w3wp process, right-click its name, and then click Create Dump File. If you are prompted for an administrator password or confirmation, type your password or click Continue.
-         A dump file for the process is created in the following folder: Drive:\Users\UserName\AppData\Local\Temp
-         When you receive a message that states that the dump file was successfully created, click OK.
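
Once the steps above are done, the dump can be located with a short script. This is a minimal sketch in Python that returns the most recently modified `.dmp` file in a folder. It assumes that the dump carries a `.dmp` extension and that Python's `tempfile.gettempdir()` resolves to the same Temp folder as the account that created the dump; the function name `newest_dump` is our own.

```python
import tempfile
from pathlib import Path
from typing import Optional


def newest_dump(folder: Optional[str] = None) -> Optional[Path]:
    """Return the most recently modified *.dmp file in folder, or None if there is none.

    When folder is omitted, the current user's temporary folder is searched
    (assumption: this is where the dump file was written).
    """
    base = Path(folder) if folder else Path(tempfile.gettempdir())
    dumps = [p for p in base.iterdir() if p.is_file() and p.suffix.lower() == ".dmp"]
    # max() with default=None avoids an exception when no dump file exists.
    return max(dumps, key=lambda p: p.stat().st_mtime, default=None)
```

Calling `newest_dump()` with no argument searches the current user's temporary folder; pass an explicit path if the dump was created under another account.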

This dump file contains a lot of information on your application pool and ASP.NET runtime environment including memory footprint, Threads, attached client connections etc.

It is now time to analyze the dump in order to understand your Thread hang condition.

Step #2 – Start the Debug Diagnostic Tool

Start the Debug Diagnostic Tool and load your w3wp dump file.




Step #4 – Launch the Hang Analyzer

Start the Hang Analyzer and wait for the analysis to complete. Please be patient as this process may take several minutes depending on the size of your dump file…




Step #5 – Review the report analysis overview

Once the Hang Analyzer is done, it will automatically open the analysis report in your browser.

This report is quite detailed and will provide you with:

-         An analysis summary (blocked Threads etc.) along with recommendations
-         A .NET Threads summary; including TOP 5 Thread CPU contributors
-         An HttpContext report with Thread Id correlation
-         A detailed .NET Thread Dump (similar to Java VM Thread Dump)
-         A detailed HTTP report (list of all active and stuck HTTP connections)



Step #6 – Review the blocked Thread summary

Now please review your blocked Threads summary.

In our case study, the blocked Threads originated mainly from calls to our remote SOSS service. SOSS is a distributed cache service that our application invokes remotely over a Windows socket (WinSock).


Step #7 – Blocked Threads deep dive analysis

At this point, it is now time to review a few samples of your blocked Threads in order to pinpoint the source of contention. Simply select a Thread id from the summary or scroll down the report to look at the raw .NET Thread Stack Trace data.

This is quite similar to Java Thread Dump analysis, but the Debug Diagnostic Tool includes more detail, such as a description of the hanging call. Always read your Stack Trace (.NET Call stack) from the bottom up to follow the path that led to the hang. The first line gives you the operation the Thread is currently hanging on and should give you a better indication of the culprit of your problem (hanging Web Service provider, hanging database call etc.).

In our case study, the source of Thread contention was identified as the slow response time of our SOSS service (.Sosslib_invoke(UInt32, Void*, SByte*, UInt32, UInt32*)).
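
The reasoning above (take the first line of each blocked Thread's stack, then look for the operation that dominates) can be sketched in a few lines of Python. This is an illustration only: the Thread ids and every frame other than `Sosslib_invoke` are hypothetical, and the data is typed in by hand rather than parsed from a DebugDiag report.

```python
from collections import Counter
from typing import Dict, List


def hanging_operation(stack: List[str]) -> str:
    """The first line of a call stack is the operation the Thread is currently in."""
    return stack[0]


def summarize(threads: Dict[int, List[str]]) -> Counter:
    """Count Threads per hanging operation so the dominant culprit stands out."""
    return Counter(hanging_operation(s) for s in threads.values() if s)


# Only the Sosslib_invoke frame comes from the case study;
# Thread ids and all other frames are hypothetical placeholders.
SOSS = ".Sosslib_invoke(UInt32, Void*, SByte*, UInt32, UInt32*)"
threads = {
    21: [SOSS, "Hypothetical.Cache.Read()", "Hypothetical.Page.Load()"],
    22: [SOSS, "Hypothetical.Cache.Read()", "Hypothetical.Page.Load()"],
    23: ["Hypothetical.Db.Query()", "Hypothetical.Page.Load()"],
}

for operation, count in summarize(threads).most_common():
    print(f"{count} Thread(s) hanging in {operation}")
```

When one operation accounts for most of the blocked Threads, as the SOSS call did for us, that operation is the first place to investigate.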


Conclusion

I hope this simple tutorial has helped you understand and appreciate the powerful Debug Diagnostic Tool and how you can leverage it to analyze .NET Framework 4 Thread hang patterns.

Please don’t hesitate to post any comment or question.

1 comment:

Hi,

Thanks for this tutorial, following the steps you outlined above helped us identify and resolve a deadlock in our software which was causing the handles and threads to max out and cause the system to hang.

Cheers
