Question PC Crashes. Is there a better method to identify the problem?
Specs
GPU: EVGA RTX 3080 XC3 Ultra Gaming (2020)
CPU: Ryzen 7 5800x (2021)
Cooler: Arctic Liquid Freezer II 280 AIO (2020)
MOBO: ASUS ROG X570 Crosshair VIII Hero WiFi (2021)
RAM: 2x8GB G.Skill TridentZ RGB DDR4-3200 14 CAS (2020)
PSU: ASUS ROG Strix 750w 80 Plus Gold (2022)
Windows Drive: 2TB Crucial MX500 SSD (2019)
HDD: Seagate Barracuda 4TB 5400RPM (2021)
Monitor 1: Acer Predator XB271HU bmiprz 27" 1440p 144hz (2019)
Monitor 2: LG 27UL500-W 27" 4K 60hz (2019)
OS: Windows 10
Previously had windows installed on Sabrent 1TB rocket NVMe m.2 drive. (2020)
Previously used a Seasonic Focus+ 750W (2019)
Giving a full rundown of the issues with this damn thing the last few years. In November 2022 I started getting hard crashes where the PC would restart itself. No bluescreen that I could see and there were no Minidumps being created. I ended up buying a PSU replacement and that seemed to fix everything for a time. Then in March of this past year I started having issues with crashing again. This time at least involving a BSOD. Still no Minidump files. Saw people saying that it could be a failing m.2 drive so I experimented and did a clean install of windows onto the M.2 that I had been using as my boot drive for a long time. Crashes continued. Removed the m.2 and did a clean install onto my SSD and the crashes stopped. So I bought a new m.2 but I've been too lazy to actually do another clean windows install and plug in the new m.2.
Now, in the last couple of weeks I've been having issues with my PC crashing while gaming again. Screens go black, sound keeps going for a few seconds before cutting out, and the PC restarts. It's at least generating minidumps now, which you can see below. I've only run 2 through WinDbg, but they both said pretty much the same thing: Video TDR Failure. When I played Skyrim on it, it was mostly fine. It crashed a couple of times when I had it running in fullscreen borderless window at 4K. But it ran fullscreen 4K just fine, as well as fullscreen or borderless window on my 1440p monitor. Hell Let Loose wouldn't even make it to the main menu before crashing. Alien Isolation would crash within 5 minutes or so of playing. 3DMark Time Spy Extreme would cause a crash as soon as I hit run test.
I tried re-seating the GPU and everything worked flawlessly after that. I was able to play several hours of Alien and finish out that game. I got through a whole match on Hell Let Loose. Ran a full 20 loops on Timespy Extreme without issue. Bought Kingdom Come Deliverance and played for an hour or so without issue. I paused the game and stepped away for a while and came back a little later. The monitors had gone to sleep so I jiggled the mouse and put my headphones on. There was still sound coming through from the game. The monitors didn't seem to wake up and then after a few seconds the sound cut out in the headset and my PC restarted itself. I launched the game again and made it about 5 or 10 minutes before it black screened and restarted again. Tried to do a full shut down and start up instead of just a restart to see if that would do anything. Then the game wouldn't even make it to the main menu before crashing.
I'm gonna try and do a clean install of windows tonight and see what that does. If it's still causing issues I've got a friend who offered to bring over his 2070 Super to test out. I'm at a loss though. Is there really no way to figure out if this is a software or hardware issue (as well as which piece of hardware is at fault) beyond just swapping out parts to see if one fails? Not to mention based off the previous scenario of gaming just fine for days and then regressing back to having crashing issues - makes it impossible to know if borrowing my friend's GPU for a few hours will tell me anything.
Is it normal to have this many issues over just 3 years with a PC? This feels insane.
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
VIDEO_TDR_FAILURE (116)
Attempt to reset the display driver and recover from timeout failed.
Arguments:
Arg1: ffff800f6b750010, Optional pointer to internal TDR recovery context (TDR_RECOVERY_CONTEXT).
Arg2: fffff80398a7f690, The pointer into responsible device driver module (e.g. owner tag).
Arg3: ffffffffc000009a, Optional error code (NTSTATUS) of the last failed operation.
Arg4: 0000000000000004, Optional internal context dependent data.
Debugging Details:
------------------
Unable to load image \SystemRoot\System32\DriverStore\FileRepository\nv_dispi.inf_amd64_1e8724cced6e93d4\nvlddmkm.sys, Win32 error 0n2
*** WARNING: Unable to verify timestamp for nvlddmkm.sys
KEY_VALUES_STRING: 1
Key : Analysis.CPU.mSec
Value: 1250
Key : Analysis.Elapsed.mSec
Value: 2506
Key : Analysis.IO.Other.Mb
Value: 0
Key : Analysis.IO.Read.Mb
Value: 1
Key : Analysis.IO.Write.Mb
Value: 3
Key : Analysis.Init.CPU.mSec
Value: 265
Key : Analysis.Init.Elapsed.mSec
Value: 18742
Key : Analysis.Memory.CommitPeak.Mb
Value: 160
Key : Analysis.Version.DbgEng
Value: 10.0.27725.1000
Key : Analysis.Version.Description
Value: 10.2408.27.01 amd64fre
Key : Analysis.Version.Ext
Value: 1.2408.27.1
Key : Bugcheck.Code.LegacyAPI
Value: 0x116
Key : Bugcheck.Code.TargetModel
Value: 0x116
Key : Failure.Bucket
Value: 0x116_IMAGE_nvlddmkm.sys
Key : Failure.Hash
Value: {c89bfe8c-ed39-f658-ef27-f2898997fdbd}
Key : WER.OS.Branch
Value: vb_release
Key : WER.OS.Version
Value: 10.0.19041.1
BUGCHECK_CODE: 116
BUGCHECK_P1: ffff800f6b750010
BUGCHECK_P2: fffff80398a7f690
BUGCHECK_P3: ffffffffc000009a
BUGCHECK_P4: 4
FILE_IN_CAB: 020325-12687-01.dmp
FAULTING_THREAD: ffff800f643265c0
VIDEO_TDR_CONTEXT: dt dxgkrnl!_TDR_RECOVERY_CONTEXT ffff800f6b750010
Symbol dxgkrnl!_TDR_RECOVERY_CONTEXT not found.
PROCESS_OBJECT: 0000000000000004
BLACKBOXBSD: 1 (!blackboxbsd)
BLACKBOXNTFS: 1 (!blackboxntfs)
BLACKBOXPNP: 1 (!blackboxpnp)
BLACKBOXWINLOGON: 1
CUSTOMER_CRASH_COUNT: 1
PROCESS_NAME: System
STACK_TEXT:
fffff98a`9d6479d8 fffff803`7e6668de : 00000000`00000116 ffff800f`6b750010 fffff803`98a7f690 ffffffff`c000009a : nt!KeBugCheckEx
fffff98a`9d6479e0 fffff803`7e616fa4 : fffff803`98a7f690 ffff800f`68b02720 00000000`00002000 ffff800f`68b027e0 : dxgkrnl!TdrBugcheckOnTimeout+0xfe
fffff98a`9d647a20 fffff803`7e60fadc : ffff800f`68adb000 00000000`01000000 00000000`00000004 00000000`00000004 : dxgkrnl!ADAPTER_RENDER::Reset+0x174
fffff98a`9d647a50 fffff803`7e666005 : 00000000`00000100 ffff800f`68adba70 00000000`63a4e700 fffff803`70ab499c : dxgkrnl!DXGADAPTER::Reset+0x4dc
fffff98a`9d647ad0 fffff803`7e666177 : fffff803`71525440 ffff800f`6a131d70 00000000`00000000 00000000`00000100 : dxgkrnl!TdrResetFromTimeout+0x15
fffff98a`9d647b00 fffff803`70a171c5 : ffff800f`643265c0 fffff803`7e666150 ffff800f`5a69e980 ffff800f`00000000 : dxgkrnl!TdrResetFromTimeoutWorkItem+0x27
fffff98a`9d647b30 fffff803`70b5a165 : ffff800f`643265c0 00000000`00000080 ffff800f`5a6be200 000fe067`b4bbbdff : nt!ExpWorkerThread+0x105
fffff98a`9d647bd0 fffff803`70c078f8 : fffff803`6ba51180 ffff800f`643265c0 fffff803`70b5a110 04d172e8`8f1a54c8 : nt!PspSystemThreadStartup+0x55
fffff98a`9d647c20 00000000`00000000 : fffff98a`9d648000 fffff98a`9d641000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x28
SYMBOL_NAME: nvlddmkm+184f690
MODULE_NAME: nvlddmkm
IMAGE_NAME: nvlddmkm.sys
STACK_COMMAND: .process /r /p 0xffff800f5a6be200; .thread 0xffff800f643265c0 ; kb
FAILURE_BUCKET_ID: 0x116_IMAGE_nvlddmkm.sys
OS_VERSION: 10.0.19041.1
BUILDLAB_STR: vb_release
OSPLATFORM_TYPE: x64
OSNAME: Windows 10
FAILURE_ID_HASH: {c89bfe8c-ed39-f658-ef27-f2898997fdbd}
Followup: MachineOwner
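When several minidumps need comparing, the key fields of the analysis above can be pulled out with a short script. The following is only a sketch, not part of WinDbg: `parse_bugcheck` and `NTSTATUS_NAMES` are hypothetical names, the sample text is copied from the dump above, and the lookup table holds just the one status value seen here (0xC000009A, which is STATUS_INSUFFICIENT_RESOURCES).

```python
import re

# Tiny lookup table; only the status seen in this dump is listed (assumption:
# extend it by hand for other dumps).
NTSTATUS_NAMES = {
    0xC000009A: "STATUS_INSUFFICIENT_RESOURCES",
}


def parse_bugcheck(text):
    """Extract bugcheck code, P1..P4 and failure bucket from analysis text."""
    info = {}
    m = re.search(r"^BUGCHECK_CODE:\s*([0-9a-fA-F]+)", text, re.MULTILINE)
    if m:
        info["code"] = int(m.group(1), 16)
    for n in range(1, 5):
        m = re.search(rf"^BUGCHECK_P{n}:\s*([0-9a-fA-F]+)", text, re.MULTILINE)
        if m:
            info[f"p{n}"] = int(m.group(1), 16)
    m = re.search(r"^FAILURE_BUCKET_ID:\s*(\S+)", text, re.MULTILINE)
    if m:
        info["bucket"] = m.group(1)
    # For bugcheck 0x116, P3 is described as the NTSTATUS of the last failed
    # operation. It is printed sign-extended to 64 bits, so keep the low 32.
    if "p3" in info:
        status = info["p3"] & 0xFFFFFFFF
        info["ntstatus"] = status
        info["ntstatus_name"] = NTSTATUS_NAMES.get(status, "unknown")
    return info


# Lines copied from the analysis output above.
SAMPLE = """\
BUGCHECK_CODE: 116
BUGCHECK_P1: ffff800f6b750010
BUGCHECK_P2: fffff80398a7f690
BUGCHECK_P3: ffffffffc000009a
BUGCHECK_P4: 4
FAILURE_BUCKET_ID: 0x116_IMAGE_nvlddmkm.sys
"""

result = parse_bugcheck(SAMPLE)
print(hex(result["code"]), result["bucket"], result["ntstatus_name"])
```

Running the same function over each dump's analysis text shows quickly whether every crash lands in the same failure bucket or whether the pattern changes between crashes.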
Welcome to the forums, newcomer!
The usual advice for RTX 3080 owners was to pick a reliable PSU of at least 850 W, or even 1 kW, to handle the GPU's sudden load spikes.
What BIOS version is your motherboard running?
For a clean OS installation, recreate the bootable USB installer, disconnect all drives except the one you want to install to, run the installer offline, and then install the required drivers with elevated rights (right-click the installer and select Run as Administrator).
For now, you could use DDU in Safe Mode to remove all GPU drivers (Intel, AMD, Nvidia), then manually install the newest driver from Nvidia's support site, also with elevated rights.
Sorry, worth noting that I've already run DDU a few times. I can always try it again. My BIOS is the latest version that was available for my board in November 2022; I updated it as a test before swapping the PSU.
Does it crash only in games and GPU tests? What happens when you run Cinebench or Prime95?
I ran Prime95 a few years back, before the new PSU fixed the problems. I didn't encounter any issues then, but I haven't run it recently. It seems unlikely that the CPU is the problem, since it was working fine before.
It wouldn't cost anything to run it again just to verify stability. The bugcheck analysis shows that the nvlddmkm.sys driver image could not be loaded. Possible causes include a CPU issue, a corrupted file, or a GPU problem. A simple test, running Prime95 for about 10 minutes, would show whether the instability persists. Would you like to know more about troubleshooting steps?
Dealing with unexpected crashes is annoying! It would be great to have a quicker method to identify the problem rather than spending time on logs and hardware checks.
I ran Prime95 for over ten minutes without any problems. I borrowed a friend's 2070 Super, but his card was too large for my case. After reinstalling Windows (onto the new M.2 drive I bought almost a year ago but hadn't installed until now), the crashes continued, but now no minidump is being generated, which wasn't the case before.
I successfully completed the 3DMark stress tests for Steel Nomad and Time Spy Extreme, running 20 loops of each without issues.
I suspect an error occurred during the Windows installation to the M.2 drive. Earlier, I had two rounds of crashes: one fixed by replacing the PSU, the other by removing the M.2 drive and installing Windows on my SSD. After that, I bought a new M.2 but didn't install it until now. Since I was reinstalling Windows anyway, I thought this would be a good chance to install the new M.2 drive too.
I'm trying to determine whether the problem lies with the M.2 slot itself or whether it's related to my GPU, operating system, motherboard, or power supply. If I have to, I'll take it to a repair shop for a proper diagnosis. Short of replacing everything at once, I'm running out of options.
A power supply is one of those parts that tends to wear out over time and may not always deliver its rated output. From what I understand, Nvidia recommended a 750 watt power supply for that card. So I think yours should still work, though you might be near its limits.
For instance, when I purchased my 7900xtx last year, even though the recommendation was an 800 watt unit and I had an 850 watt PSU, I upgraded to a 1200 watt model. Initially, I planned for a 1000 watt unit, but Newegg offered a tier A rated 1200 watt Thermaltake for a similar price to the others I considered. Personally, I tend to oversize the power supply a little. Because components naturally degrade and can struggle to maintain the same output when pushed to their limits, it's wise to leave some headroom. A higher wattage unit can also accommodate future upgrades you might want to make.
It might be useful to check the warranty of the PSU you had and request an RMA if needed, so you can either keep a spare or return the old one for another purchase.
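The headroom reasoning in this thread can be put into rough numbers. This is only a back-of-the-envelope sketch, and most inputs are assumptions rather than measurements: 320 W is Nvidia's reference board power for the RTX 3080 (the EVGA card's actual limit may differ), 142 W is AMD's socket power limit for 105 W TDP AM4 chips such as the 5800X, while the 75 W for everything else and the 1.5x transient factor are placeholders picked purely for illustration. `psu_headroom` is a hypothetical helper name.

```python
def psu_headroom(psu_watts, gpu_watts, cpu_watts, other_watts, transient_factor):
    """Return (sustained, transient, sustained_margin, transient_margin) in watts.

    transient_factor scales only the GPU draw to model short load spikes.
    """
    sustained = gpu_watts + cpu_watts + other_watts
    transient = gpu_watts * transient_factor + cpu_watts + other_watts
    return sustained, transient, psu_watts - sustained, psu_watts - transient


# Assumed inputs; replace them with measured values where possible.
GPU_W = 320          # Nvidia reference board power for the RTX 3080
CPU_W = 142          # AMD socket power limit for 105 W TDP AM4 parts
OTHER_W = 75         # placeholder for drives, fans, pump, RGB, USB
SPIKE_FACTOR = 1.5   # placeholder multiplier for brief GPU spikes

for psu in (750, 850, 1000):
    sustained, transient, margin, spike_margin = psu_headroom(
        psu, GPU_W, CPU_W, OTHER_W, SPIKE_FACTOR
    )
    print(
        f"{psu} W PSU: sustained {sustained} W (margin {margin} W), "
        f"spike {transient:.0f} W (margin {spike_margin:.0f} W)"
    )
```

With these placeholder inputs a 750 W unit covers the sustained load comfortably but has little margin left during a spike, which is the situation the 850 W and 1 kW suggestions are meant to avoid. Real spike behaviour depends on the specific card and PSU, so treat the output as a way to frame the question, not as a verdict.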