Fix Problem with CUDA causing Graphics Driver to Reset



Hello there,
This post is all about fixing a problem where Windows resets the graphics driver in the middle of a long-running CUDA program.

If you have a very computationally expensive CUDA kernel that takes 5+ seconds to run, Windows may give you the message "Display driver stopped responding and has recovered".
There is a way to solve this problem, and it is NOT by disconnecting all monitors from the graphics card running the CUDA program, as has been suggested on a number of forums (I have tried this and it did not help).

The solution is to disable the Timeout Detection and Recovery (TDR) feature of WDDM, the Windows display driver model, as shown on this site and described below.

First, open the registry editor: type "regedit" into Run, or search for it and click the only listed item.

You will now have the registry editor window open, with the tree on the left not yet expanded.

Navigate to this key: HKEY_LOCAL_MACHINE->System->CurrentControlSet->Control->GraphicsDrivers

There you need to edit the "TdrLevel" value (in my case I had to create it; if you create it, it is a "New->DWORD"). Make sure that it has a value of 0, which disables timeout detection.
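
If you would rather script this step than click through regedit, the same change can be sketched in Python with the standard-library winreg module. This is only a minimal sketch, not something from the original instructions: the function name set_tdr_level is made up for illustration, it assumes you run it from an elevated (administrator) prompt on Windows, and the value range 0 to 3 is taken from Microsoft's TDR registry key documentation.

```python
import sys

# Registry key from the steps above, relative to HKEY_LOCAL_MACHINE.
KEY_PATH = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def set_tdr_level(level=0):
    """Create or overwrite the TdrLevel DWORD (hypothetical helper).

    level=0 disables timeout detection. Values 1 to 3 are the other
    documented TdrLevel settings; anything else is rejected.
    """
    if level not in (0, 1, 2, 3):
        raise ValueError("TdrLevel must be 0, 1, 2 or 3")
    if sys.platform != "win32":
        raise OSError("The registry is only available on Windows")

    import winreg  # Windows-only module, so import it late

    # CreateKeyEx opens the key, creating it if it does not exist.
    with winreg.CreateKeyEx(
        winreg.HKEY_LOCAL_MACHINE, KEY_PATH, 0, winreg.KEY_SET_VALUE
    ) as key:
        winreg.SetValueEx(key, "TdrLevel", 0, winreg.REG_DWORD, level)
```

Calling set_tdr_level(0) from an administrator Python session should have the same effect as the manual edit. A reboot is still needed afterwards.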

Reboot the machine (turn it off and on again).
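
After the reboot you may want to confirm that the value is really there. The sketch below reads it back with Python's standard-library winreg module. The helper names read_tdr_level and describe_tdr_level are made up for illustration, and the descriptions of the four levels are paraphrased from Microsoft's TDR registry key documentation.

```python
import sys

# Same key as in the regedit steps, relative to HKEY_LOCAL_MACHINE.
KEY_PATH = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"

# Documented TdrLevel settings (descriptions paraphrased).
TDR_LEVELS = {
    0: "timeout detection disabled",
    1: "bug check on detected timeout",
    2: "recover to VGA",
    3: "recover on timeout (Windows default)",
}


def read_tdr_level():
    """Return the current TdrLevel as an int, or None if it is not set."""
    if sys.platform != "win32":
        raise OSError("The registry is only available on Windows")

    import winreg  # Windows-only module, so import it late

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
            value, _value_type = winreg.QueryValueEx(key, "TdrLevel")
            return int(value)
    except FileNotFoundError:
        # Either the key or the TdrLevel value does not exist.
        return None


def describe_tdr_level(value):
    """Turn a TdrLevel value (or None) into a readable description."""
    if value is None:
        return "TdrLevel not set, so Windows uses its default"
    return TDR_LEVELS.get(value, "unknown TdrLevel value %r" % (value,))
```

Running print(describe_tdr_level(read_tdr_level())) on the Windows machine should report that timeout detection is disabled if the edit took effect.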

Bob's your uncle. You're done! Run your CUDA application and it should be fine if it was indeed a timeout issue. Post a comment if you have any problems with these instructions.
