Tracking Thread Hangs and Infinite Loops in .NET

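.NET programs are much easier to debug than C/C++ programs, since the headaches of out-of-bounds pointers are gone. Even so, the following problems remain very tricky when your program runs into them:

The process terminates abnormally. For a solution, see "Handling Uncaught Exceptions in .NET".

Memory leaks, or memory that is allocated and never released. For a solution, see "Tracking .NET Application Memory Usage with .NET Memory Profiler: Basic Usage". Monitoring this from your own code is something I will cover in a later article.

A thread hangs for an unknown reason, for example a deadlock.

The program enters an infinite loop.

This article explains how to write code that tracks and reports the last two kinds of failure in real time.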

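First, we need a separate monitor thread to watch the threads that need monitoring.

I wrote a monitoring class, ThreadMonitor. Before monitoring starts, the monitor thread's priority is set to the highest.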

public ThreadMonitor()
{
    _MonitorThread = new Thread(new ThreadStart(MonitorTask));
    _MonitorThread.Priority = ThreadPriority.Highest;
    _MonitorThread.IsBackground = true;
}

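Next, we give this class a few public methods:

The Start method lets the caller start monitoring.

The Register method adds a thread that needs monitoring to the monitor list.

The Heartbeat method is explained later.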

/// <summary>
/// Start monitor
/// </summary>
public void Start()
{
    _MonitorThread.Start();
}



/// <summary>
/// Monitor register
/// </summary>
/// <param name="monitorPara">Monitor parameter</param>
public void Register(MonitorParameter monitorPara)
{
    Debug.Assert(monitorPara != null);
    Debug.Assert(monitorPara.Thread != null);

    // Refuse to register the same thread twice.
    if (GetTCB(monitorPara.Thread) != null)
    {
        throw new System.ArgumentException("Register repeatedly!");
    }

    lock (_RegisterLock)
    {
        _TCBTable.Add(monitorPara.Thread.ManagedThreadId,
            new TCB(monitorPara));
    }
}


public void Heartbeat(Thread t)
{
    TCB tcb = GetTCB(t);
    if (tcb == null)
    {
        throw new System.ArgumentException("This thread was not registered!");
    }

    // A heartbeat resets the hang state and the dead-cycle hit counter.
    tcb.LastHeartbeat = DateTime.Now;
    tcb.HitTimes = 0;
    tcb.Status &= ~ThreadStatus.Hang;
}

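Now let me explain how to detect that a thread has hung.

The monitor exposes a heartbeat call, Heartbeat. Each monitored thread must set up a timer that periodically sends a heartbeat to the monitor. If the monitor does not receive a heartbeat within a certain time, it considers the monitored thread to be abnormally hung. This time is specified by the HangTimeout of the MonitorParameter.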
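The timeout check described here can be sketched roughly as follows. This is only an illustration under assumptions: the class HangCheckSketch and its IsHung method are hypothetical names that do not appear in the original ThreadMonitor source, and the real monitor may perform the comparison differently.

```csharp
using System;

// Hypothetical helper for illustration; not part of the original ThreadMonitor.
public static class HangCheckSketch
{
    // Returns true when more than hangTimeoutMs milliseconds have passed
    // since the last heartbeat was recorded.
    public static bool IsHung(DateTime lastHeartbeat, DateTime now, int hangTimeoutMs)
    {
        return (now - lastHeartbeat).TotalMilliseconds > hangTimeoutMs;
    }
}
```

A monitor loop could call something like IsHung(tcb.LastHeartbeat, DateTime.Now, hangTimeout) once per pass for every registered thread and raise its event when the result is true.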

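Detecting a hang is not enough. To be of practical use, the report must also say where the thread is currently stuck. So how do we get a thread's current call position? The .NET Framework provides a way to obtain the current stack trace of a thread. See the code below.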

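private string GetThreadStackTrace(Thread t)
{
    bool needFileInfo = NeedFileInfo;
    StackTrace stack;

    // StackTrace(Thread, bool) requires the target thread to be suspended.
    t.Suspend();
    try
    {
        stack = new StackTrace(t, needFileInfo);
    }
    finally
    {
        // Resume as soon as possible, even if capturing the trace throws.
        t.Resume();
    }

    return stack.ToString();
}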

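Note that StackTrace(t, needFileInfo) may only be called after thread t has been suspended; otherwise an exception is thrown. Calling Thread.Suspend is fairly dangerous, however, because the caller cannot know what thread t was doing when it was suspended. It may have been waiting for or holding some resource, and forcibly suspending it at that point can easily deadlock the program. (Thread.Suspend and Thread.Resume have been marked obsolete since .NET Framework 2.0 for this reason.) Fortunately, the StackTrace(t, needFileInfo) call itself does not contend for resources with other threads, the calling thread in particular, but we must call t.Resume immediately after that statement completes to end the suspension of thread t.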

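Having covered the monitoring of abnormal thread hangs, let us turn to monitoring infinite loops.

Before settling on the current approach, I tried to use the GetThreadTimes API function to obtain the actual CPU time of each monitored thread and compute its CPU usage from that. Unfortunately the attempt failed: calling GetThreadTimes from a thread other than the target thread did not return the target thread's CPU time. (It seems to work for unmanaged threads, but it did not work for the .NET managed threads I tried, and I have not yet figured out why.) In addition, the GetThreadTimes statistics are not accurate enough; see "Further Changes to Lao Zhao's Simple Performance Counter - About

So, with no better option, I adopted a less-than-ideal approach.

Periodically sample the current process's TotalProcessorTime to compute the CPU usage of the current process. If this usage stays above 100 / (number of CPUs) * 90% for a period of time, the process is considered to contain an infinite loop. The length of this test period is specified by the DeadCycleTimeout property of MonitorParameter.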
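As a rough illustration of this calculation, the sketch below turns two TotalProcessorTime samples into a percentage of the whole machine's CPU capacity and compares it with the 100 / (number of CPUs) * 90% threshold. The class and method names are assumptions made for this example and are not taken from the original source.

```csharp
using System;

// Hypothetical helper for illustration; not part of the original ThreadMonitor.
public static class CpuUsageSketch
{
    // CPU usage of the process between two samples, as a percentage of the
    // total capacity of all CPUs (100 means every CPU was fully busy).
    public static double ProcessCpuPercent(TimeSpan cpuBefore, TimeSpan cpuAfter,
        TimeSpan wallElapsed, int cpuCount)
    {
        if (wallElapsed <= TimeSpan.Zero || cpuCount <= 0)
        {
            return 0.0;
        }

        double usedMs = (cpuAfter - cpuBefore).TotalMilliseconds;
        return usedMs / (wallElapsed.TotalMilliseconds * cpuCount) * 100.0;
    }

    // Threshold from the text: 90% of the share one fully busy CPU contributes.
    public static double DeadCycleThreshold(int cpuCount)
    {
        return 100.0 / cpuCount * 0.9;
    }
}
```

The samples would typically come from Process.GetCurrentProcess().TotalProcessorTime, with Environment.ProcessorCount as the CPU count. With four CPUs, one thread spinning for a full second yields about 25%, which is above the 22.5% threshold.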

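This raises a problem: we only know that the program is in an infinite loop, not which thread is looping. How do we find the thread that is really stuck in the loop?

My approach is to check each thread's current state once per second and count a hit whenever the state is running. Once an infinite loop has been confirmed, we look at the number of hits within one check period; if it is high enough, that thread is considered to be the one looping. This still has a problem, though: when the main thread is waiting for Windows messages, or a console program's thread is waiting for console input, the thread's state is reported as Running the whole time even though it is actually blocked, and I have not found a good way to tell that a thread is currently blocked. What to do? I came up with a crude workaround: when both conditions above are met, also check whether any heartbeat arrived during that period. If there was no heartbeat, the thread is in an infinite loop. A heartbeat does not prove the absence of an infinite loop, however, so in that case every suspicious thread is reported and the decision is left to a human.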
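The per-second sampling could look something like the sketch below. It is an assumption-laden illustration, not the original implementation: the names HitCounterSketch, CountsAsHit, IsSuspect and the minRatio parameter are invented for this example. Because System.Threading.ThreadState is a flags enumeration in which Running has the value 0, the sketch treats a thread as running when none of the "not running" flags are set, so a running background thread still counts.

```csharp
using System.Threading;

// Hypothetical helper for illustration; not part of the original ThreadMonitor.
public static class HitCounterSketch
{
    // Flags that indicate the thread is not executing code right now.
    private const ThreadState NotRunningMask =
        ThreadState.Unstarted | ThreadState.Stopped | ThreadState.Aborted |
        ThreadState.Suspended | ThreadState.WaitSleepJoin;

    // One sample counts as a hit when no "not running" flag is set.
    public static bool CountsAsHit(ThreadState state)
    {
        return (state & NotRunningMask) == 0;
    }

    // A thread is suspicious when enough of the samples in the check period
    // were hits. minRatio (for example 0.9) is an assumed tuning parameter.
    public static bool IsSuspect(int hitTimes, int samples, double minRatio)
    {
        return samples > 0 && (double)hitTimes / samples >= minRatio;
    }
}
```

Since the Heartbeat method shown earlier resets HitTimes to 0, a thread that keeps sending heartbeats never accumulates many hits, which corresponds to the "no heartbeat during the period" condition described above.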

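I wrote a sample program with a WinForms main thread and a counter thread. The counter thread counts once per second and updates the UI. When the monitor thread detects an abnormal hang or an infinite loop, it writes a monitoring report to Report.log in the current directory.

After clicking Hang, the main thread sleeps for 20 seconds. Because the counter thread has to update the UI, it is blocked as well.

After detecting that both threads have hung, the monitor thread reports the following: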

ThreadMonitorEvent
Thread Name:Main thread
Thread Status:Hang
Thread Stack: at System.Threading.Thread.SleepInternal(Int32 millisecondsTimeout)
at System.Threading.Thread.Sleep(Int32 millisecondsTimeout)
at DotNetDebug.Form1.buttonHang_Click(Object sender, EventArgs e)
at System.Windows.Forms.Control.OnClick(EventArgs e)
at System.Windows.Forms.Button.OnClick(EventArgs e)
at System.Windows.Forms.Button.OnMouseUp(MouseEventArgs mevent)
at System.Windows.Forms.Control.WmMouseUp(Message& m, MouseButtons button, Int32 clicks)
at System.Windows.Forms.Control.WndProc(Message& m)
at System.Windows.Forms.ButtonBase.WndProc(Message& m)
at System.Windows.Forms.Button.WndProc(Message& m)
at System.Windows.Forms.Control.ControlNativeWindow.OnMessage(Message& m)
at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message& m)
at System.Windows.Forms.NativeWindow.DebuggableCallback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)
at System.Windows.Forms.UnsafeNativeMethods.DispatchMessageW(MSG& msg)
at System.Windows.Forms.Application.ComponentManager.System.Windows.Forms.UnsafeNativeMethods.IMsoComponentManager.FPushMessageLoop(Int32 dwComponentID, Int32 reason, Int32 pvLoopData)
at System.Windows.Forms.Application.ThreadContext.RunMessageLoopInner(Int32 reason, ApplicationContext context)
at System.Windows.Forms.Application.ThreadContext.RunMessageLoop(Int32 reason, ApplicationContext context)
at System.Windows.Forms.Application.Run(Form mainForm)
at DotNetDebug.Program.Main()
at System.AppDomain._nExecuteAssembly(Assembly assembly, String[] args)
at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence assemblySecurity, String[] args)
at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadHelper.ThreadStart()


2:38:40 PM
ThreadMonitorEvent
Thread Name:Counter thread
Thread Status:Hang
Thread Stack: at System.Threading.WaitHandle.WaitOneNative(SafeWaitHandle waitHandle, UInt32 millisecondsTimeout, Boolean hasThreadAffinity, Boolean exitContext)
at System.Threading.WaitHandle.WaitOne(Int64 timeout, Boolean exitContext)
at System.Threading.WaitHandle.WaitOne(Int32 millisecondsTimeout, Boolean exitContext)
at System.Windows.Forms.Control.WaitForWaitHandle(WaitHandle waitHandle)
at System.Windows.Forms.Control.MarshaledInvoke(Control caller, Delegate method, Object[] args, Boolean synchronous)
at System.Windows.Forms.Control.Invoke(Delegate method, Object[] args)
at System.Windows.Forms.Control.Invoke(Delegate method)
at DotNetDebug.Form1.Counter()
at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadHelper.ThreadStart()

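After clicking the DeadCycle button, the counter thread goes into an infinite loop, while the main thread does not.

After detecting the counter thread's infinite loop, the monitor thread reports the following: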

2:37:51 PM
ThreadMonitorEvent
Thread Name:Counter thread
Thread Status:Hang
Thread Stack: at DotNetDebug.Form1.DoDeadCycle()
at DotNetDebug.Form1.Counter()
at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadHelper.ThreadStart()


2:37:52 PM
ThreadMonitorEvent
Thread Name:Counter thread
Thread Status:Hang, DeadCycle
Thread Stack: at DotNetDebug.Form1.DoDeadCycle()
at DotNetDebug.Form1.Counter()
at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadHelper.ThreadStart()

The test code is listed below. Download location of the complete source: complete source

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Text;
using System.Windows.Forms;
using System.Threading;
using Sys.Diagnostics;

namespace DotNetDebug
{
    public partial class Form1 : Form
    {
        Thread _CounterThread;
        ThreadMonitor _ThreadMonitor = new ThreadMonitor();

        // Written by the UI thread and read by the counter thread, hence volatile.
        volatile bool _DeadCycle = false;

        delegate void CounterDelegate();

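        // Spins forever once _DeadCycle has been set, simulating an infinite loop.
        private void DoDeadCycle()
        {
            while (_DeadCycle)
            {
            }
        }

        // Counter thread body: update the label, send a heartbeat, sleep one second.
        private void Counter()
        {
            int count = 0;
            while (true)
            {
                DoDeadCycle();
                labelCounter.Invoke(
                    new CounterDelegate(delegate() { labelCounter.Text = (count++).ToString(); }));
                _ThreadMonitor.Heartbeat(Thread.CurrentThread);

                Thread.Sleep(1000);
            }
        }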


        public Form1()
        {
            InitializeComponent();
        }

        // Appends one report entry to report.log whenever the monitor raises an event.
        void OnThreadMonitorEvent(object sender, ThreadMonitor.ThreadMonitorEvent args)
        {
            StringBuilder sb = new StringBuilder();

            sb.AppendLine(DateTime.Now.ToLongTimeString());
            sb.AppendLine("ThreadMonitorEvent");
            sb.AppendLine("Thread Name:" + args.Name);
            sb.AppendLine("Thread Status:" + args.Status.ToString());
            sb.AppendLine("Thread Stack:" + args.StackTrace);

            using (System.IO.FileStream fs =
                new System.IO.FileStream("report.log", System.IO.FileMode.Append,
                    System.IO.FileAccess.Write))
            {
                using (System.IO.StreamWriter sw = new System.IO.StreamWriter(fs))
                {
                    sw.WriteLine(sb.ToString());
                }
            }
        }



        private void Form1_Load(object sender, EventArgs e)
        {
            _ThreadMonitor.ThradMonitorEventHandler +=
                new EventHandler<ThreadMonitor.ThreadMonitorEvent>(OnThreadMonitorEvent);

            _CounterThread = new Thread(new ThreadStart(Counter));
            _CounterThread.IsBackground = true;

            // Register the main (UI) thread.
            _ThreadMonitor.Register(
                new ThreadMonitor.MonitorParameter(
                    Thread.CurrentThread,
                    "Main thread", 10000, 5000,
                    ThreadMonitor.MonitorFlag.MonitorHang |
                    ThreadMonitor.MonitorFlag.MonitorDeadCycle));

            // Register the counter thread.
            _ThreadMonitor.Register(
                new ThreadMonitor.MonitorParameter(
                    _CounterThread,
                    "Counter thread",
                    ThreadMonitor.MonitorFlag.MonitorHang |
                    ThreadMonitor.MonitorFlag.MonitorDeadCycle));

            _CounterThread.Start();

            // The main thread sends its heartbeat from a one-second timer.
            timerHeartbeat.Interval = 1000;
            timerHeartbeat.Enabled = true;

            _ThreadMonitor.Start();
        }


        private void timerHeartBeat_Tick(object sender, EventArgs e)
        {
            _ThreadMonitor.Heartbeat(Thread.CurrentThread);
        }

        private void ButtonDeadCycle_Click(object sender, EventArgs e)
        {
            _DeadCycle = true;
        }


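        private void buttonHang_Click(object sender, EventArgs e)
        {
            // The original listing is cut off at this point. According to the
            // description above, the handler puts the main thread to sleep for
            // 20 seconds, so the body below is a reconstruction.
            Thread.Sleep(20000);
        }
    }
}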