Question about DPC_WATCHDOG_VIOLATION BSODs and nvlddmkm.sys crashes (Event ID 153, 14, 0)

Blaster12121

11-03-2025, 09:49 AM #1

Hello! I have pretty much been going through this hell of an issue for almost a month now. I got a new build around a month ago. For the first 2-3 weeks, everything was working flawlessly: gaming at ultra settings, no stutter, no lag, and most importantly, no crashes. Then one day I got a DPC_WATCHDOG_VIOLATION BSOD around 5 minutes into a game. Fast forward to today, and I can't run any game stably for more than 10 minutes. The same thing happens when I try to export a video. In Event Viewer I get errors about nvlddmkm, especially Event ID 0, 14 and 153, which I will go into in more detail below.
The system specs are as listed below:
CPU: Intel Core i9-14900k
GPU: ZOTAC RTX 4090 AMP Extreme Airo
RAM: 32 GB DDR5-4800, Kingston KF552C40-32 + 32 GB DDR5-4800, Kingston KF552C40-32 (64 GB RAM in total)
SSD: CT2000P5PSSD8, 1863.02 GB
All-in-One Cooler for the CPU
PSU: bequiet! Pure Power 12M 1000W
Motherboard: MSI PRO B760M-P
OS: Windows 11, 24H2
Monitor: Dell S2419HGF, 1920x1080, 144Hz
I have noticed the same issue being reported many times, especially for the 4090. Here is what I have done so far:
Enabled & disabled XMP in BIOS (issue persists);
Disabled & enabled Hyperthreading and the CPU Turbo settings (issue persists, no difference);
Used DDU to uninstall the driver in Safe Mode and installed an older version [537.58] (this made a few games more stable, but other games I tested would still crash with Event ID 0);
Changed permissions for the nvlddmkm.sys file to Full Control for Users (issue remains);
Turned on Debug Mode in Nvidia Control Panel;
Switched to "Prefer Maximum Performance" in Nvidia Control Panel (no difference);
Disabled Hardware Accelerated GPU Scheduling in Windows settings (no more BSODs, but the crashes remain);
Used MSI Afterburner to lower the GPU core & memory clocks by around 52 MHz (no difference);
Changed PCIe Gen Mode in BIOS to both 4.0 and 3.0 (no difference);
Uninstalled programs like G-Hub and Wallpaper Engine, switched HAGS off for the other programs that supported it (no difference);
Disabled Integrated GPU in Device Manager (issue still persists);
Uninstalled NVIDIA HD Audio in Device Manager (yet again no difference);
Disabled High Precision Event Timer (others said it was the only workaround, no difference whatsoever);
Ran OCCT tests for every single component, even at extreme. The VRAM test was the one that crashed into the BSOD. Could this indicate a hardware issue?
I also used DDU and installed the latest drivers, but that did not change anything. I've also seen reports of 566.36 being the most stable driver for the 4090, but that did not change anything either. As for the errors in Event Viewer, I get these 3-4 specific errors from source nvlddmkm:
The description for Event ID 153 from source nvlddmkm cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
If the event originated on another computer, the display information had to be saved with the event.
The following information was included with the event:
\Device\Video3
Error occurred on GPUID: 100
The message resource is present but the message was not found in the message table.
The description for Event ID 14 from source nvlddmkm cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
If the event originated on another computer, the display information had to be saved with the event.
The following information was included with the event:
\Device\Video3
badfbadf(badfbadf) 00000000 00000000
The message resource is present but the message was not found in the message table
The description for Event ID 0 from source nvlddmkm cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
If the event originated on another computer, the display information had to be saved with the event.
The following information was included with the event:
\Device\Video3
Error occurred on GPUID: 100
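For anyone who wants to count how often each of these events shows up, here is a rough Python sketch. It assumes the events were dumped to text on Windows with `wevtutil qe System ... /f:text` and that each record contains an `Event ID: N` line. The sample text and the helper name `tally_event_ids` are made up for illustration and are not copied from my logs.

```python
import re
from collections import Counter

# Command that could be used to dump the events on Windows (not run here).
# Assumption: the provider is registered under the name "nvlddmkm".
WEVTUTIL_CMD = (
    'wevtutil qe System /q:"*[System[Provider[@Name=\'nvlddmkm\']]]" '
    "/f:text /rd:true /c:200"
)

# Matches lines such as "  Event ID: 153" in the text dump.
EVENT_ID_RE = re.compile(r"^\s*Event ID:\s*(\d+)\s*$", re.MULTILINE)


def tally_event_ids(dump_text: str) -> Counter:
    """Count occurrences of each Event ID in a text dump of event records."""
    return Counter(int(m) for m in EVENT_ID_RE.findall(dump_text))


# Hypothetical sample, shaped like a trimmed text dump.
SAMPLE = """\
Event[0]:
  Source: nvlddmkm
  Event ID: 153
Event[1]:
  Source: nvlddmkm
  Event ID: 14
Event[2]:
  Source: nvlddmkm
  Event ID: 153
"""

counts = tally_event_ids(SAMPLE)
for event_id, n in counts.most_common():
    print(f"Event ID {event_id}: {n}")
```

Comparing the tallies before and after a driver change would show whether a given driver version really shifts which event IDs appear.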
As for the BSODs, the bugcheck error in Event Viewer is:
The computer has rebooted from a bugcheck. The bugcheck was: 0x00000133 (0x0000000000000001, 0x0000000000001e00, 0xfffff80784dc43b0, 0x0000000000000000).
Something to notice is that I would get Event ID 153 error on the latest drivers only, but either way, I have been pulling my hair out trying to find any solution available. If anyone has any idea or has been having the same problem, I'd really appreciate any help! Thank you so much for your time!

RememberDis

11-03-2025, 09:49 AM #2

This page describes a comparable build, and the main issue is nearly identical to the one you mention.

irvinIRS

11-03-2025, 09:49 AM #3

You've combined RAM from different packages. Take out the stick that wasn't included in the original package.
If you need 64 GB, purchase a kit with that capacity and think about selling the existing sticks to offset some of the cost.
This board looks quite basic for a 14900K, and problems occurring within ten minutes suggest heat might be the cause.
The i9 needs a motherboard with at least heatsinks over the VRM area; otherwise the CPU won't perform optimally.
Have ambient temperatures increased since then?

LethalStats

11-03-2025, 09:49 AM #4

Both RAM modules are identical and came together as one 64 GB kit (two 32 GB sticks). I plan to remove one stick to check whether that changes anything and will post an update.

Regarding the CPU and motherboard, MSI states that PRO B760M-P is compatible with 14th/13th and 12th generation Intel processors. However, I’m uncertain if it’s the ideal fit for an i9. For a balanced setup, I have an All-in-One liquid cooler for the CPU, and temperatures typically stay within 55-70 degrees under load.

As for ambient conditions, I maintain a cool room environment, which rules that out.

I attempted to execute intensive benchmark tests using OCCT for all components. The CPU and RAM performed well, the PSU was stable, and the GPU showed no issues. The system crashed with a DPC_WATCHDOG_VIOLATION during a VRAM benchmark test within 40 seconds.

It seems the problem likely lies with the GPU hardware, the motherboard, or the connection between the GPU and PSU, though I find this unlikely.

Thank you for your response!

JPP_Miam

11-03-2025, 09:49 AM #5

AH. Considering how the specifications were entered, I assumed the sticks came in a different package. My mistake!

CptShroom

11-03-2025, 09:49 AM #6

Thank you for your message. I understand your point; MSI and other vendors often don't provide honest information. The issues I encountered with the bloatware on MSI and other products really made me reconsider upgrading this motherboard. I reinstalled Windows and used the default MSI driver, which automatically updates but adds a lot of unnecessary software.

Regarding the CPU, I noticed it was limited to 153W in the BIOS settings. When I tried to increase the limit to 253W, the system would crash every time I loaded the desktop. I attempted to set a lower multiplier limit for the P-cores (from 57x to 53x), but the crashes persisted. A recent physics benchmark showed the CPU performance improving rapidly instead of maintaining stability.

Here are some default BIOS settings:
https://drive.google.com/drive/folders/1...sp=sharing

samosaara

11-03-2025, 09:49 AM #7

What temperatures do the GPU core, hot spot, and memory junction reach? Have you tried fully removing and reinstalling the GPU? With such low turbo power limits, the board is likely to throttle and slow the CPU down as well, which might cause minor performance drops or occasional sluggishness, but not necessarily a crash. Unfortunately, I don't have a reference guide, but you could try adjusting the core voltage offset (check the third snapshot for values near -) or search for undervolt tutorials. Set the core voltage offset mode to negative and the core voltage offset to 0.05 V. What kind of crashes are you experiencing: a crash to desktop, a black screen with sound, or a frozen screen?

DarkcuT

11-03-2025, 09:49 AM #8

I conducted an intense stability assessment on the GPU to monitor its temperatures. The results showed core temps ranging from 70°C to 75°C, memory junctions between 69°C and 72°C, while hotspots fluctuated from 80°C to 87°C. I’m uncertain if this points to a motherboard-CPU incompatibility, but I remain hopeful it isn’t a GPU-related problem (though the likelihood seems highest here 😭).
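In case it helps anyone log readings like these over time instead of reading them off a monitoring window, below is a minimal Python sketch built around `nvidia-smi --query-gpu`. The function names and the sample line are hypothetical. As far as I know, this query reports the core temperature but not the hot spot or memory junction values, so those would still have to come from a separate monitoring tool.

```python
import subprocess

# Fields passed to nvidia-smi's --query-gpu option.
FIELDS = ["temperature.gpu", "power.draw", "clocks.gr", "clocks.mem"]


def build_command() -> list[str]:
    """Build the nvidia-smi command line for a single CSV reading."""
    return [
        "nvidia-smi",
        f"--query-gpu={','.join(FIELDS)}",
        "--format=csv,noheader,nounits",
    ]


def parse_line(line: str) -> dict[str, float]:
    """Turn one CSV line of nvidia-smi output into a field -> value dict."""
    values = [v.strip() for v in line.split(",")]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} values, got {len(values)}")
    return {name: float(v) for name, v in zip(FIELDS, values)}


def read_gpu_once() -> dict[str, float]:
    """Run nvidia-smi once and parse the first GPU's readings."""
    out = subprocess.run(
        build_command(), capture_output=True, text=True, check=True
    )
    return parse_line(out.stdout.splitlines()[0])


# Hypothetical output line, for illustration only (not a real measurement).
sample = parse_line("72, 410.55, 2715, 10501")
print(sample)
```

Calling `read_gpu_once()` in a loop with a short sleep while a stress test runs would give a record of the last readings before a crash.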

Before updating Windows 11, I experienced performance drops when running games like Cyberpunk 2077 and Forza Horizon 5, and sessions lasted under five minutes at most. With hardware-accelerated GPU scheduling (HAGS) enabled, I still got DPC_WATCHDOG_VIOLATION blue screens. Turning off HAGS didn't eliminate the BSODs entirely, but it reduced their frequency. With HAGS off, sessions were still short: the game would freeze with audio still playing in the background, the screen would go black after about ten seconds, and I'd be dropped back to the desktop.

In Event Viewer, I identified errors linked to nvlddmkm.sys: Event ID 153 & 14 (with newer drivers) and Event ID 0 (with older drivers). When BSODs occurred, the description noted: “The computer has rebooted from a bugcheck. The bugcheck was: 0x00000133 (0x0000000000000001, 0x0000000000001e00, 0xfffff804bbfc43b0, 0x0000000000000000).”

The code 0x133 corresponds to DPC_WATCHDOG_VIOLATION, which, together with the nvlddmkm errors, suggests a GPU/driver issue. The Event ID 153 and 14 entries contain little beyond “The following information was included with the event: …”, “\Device\Video3”, and “Error occurred on GPUID: 100”.
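To make the bugcheck line easier to read, here is a small Python sketch that splits it into the code and its four parameters. The parameter meanings are my reading of Microsoft's bug check 0x133 reference: parameter 1 selects the variant, and when it is 1, parameter 2 is the watchdog period in ticks. The function name `decode_bugcheck` is just for illustration.

```python
import re

# Pulls the bugcheck code and the parenthesised parameter list out of the
# Event Viewer message.
BUGCHECK_RE = re.compile(r"bugcheck was:\s*(0x[0-9a-fA-F]+)\s*\(([^)]*)\)")

# Meaning of parameter 1 for bugcheck 0x133 (DPC_WATCHDOG_VIOLATION).
DPC_WATCHDOG_PARAM1 = {
    0: "a single DPC or ISR exceeded its time allotment",
    1: "the system cumulatively spent an extended period at "
       "IRQL DISPATCH_LEVEL or above",
}


def decode_bugcheck(message: str) -> dict:
    """Parse the bugcheck code and parameters from an Event Viewer message."""
    match = BUGCHECK_RE.search(message)
    if match is None:
        raise ValueError("no bugcheck code found in message")
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    result = {"code": code, "params": params}
    if code == 0x133:
        result["name"] = "DPC_WATCHDOG_VIOLATION"
        result["meaning"] = DPC_WATCHDOG_PARAM1.get(params[0], "unknown")
    return result


line = (
    "The computer has rebooted from a bugcheck. The bugcheck was: "
    "0x00000133 (0x0000000000000001, 0x0000000000001e00, "
    "0xfffff804bbfc43b0, 0x0000000000000000)."
)
info = decode_bugcheck(line)
print(hex(info["code"]), info["name"], "-", info["meaning"])
# With parameter 1 == 1, parameter 2 is the watchdog period in ticks.
print("watchdog period (ticks):", info["params"][1])
```

Parameter 1 being 1 here points at time accumulated across many DPCs rather than one long-running routine, which by itself does not identify the driver responsible; a memory dump would be needed for that.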

I powered off the PC completely and reseated the GPU in its PCIe slot, but the problems continued. I reinstalled drivers across several versions to test stability; all failed except version 537.58, which showed the fewest issues.

I’m exhausted from the situation—only a month had passed since acquiring this PC, yet it’s struggling so severely. It makes you wonder about the decisions made. 🥲

Thank you for your time; your help means a lot!

ozmonster12

11-03-2025, 09:49 AM #9

Yeah, I'm sorry I haven't been very useful.
Those temperatures are all really good.
But there could be several factors at play, which would make it difficult to fix.
The main issue seems to be memory. After looking at other threads, I'm not sure what's happening, especially with the various individual solutions or unresolved ones.

Srules234

11-03-2025, 09:49 AM #10

I really appreciate your patience, so please don't worry about it. I intend to bring the PC to the technician team for a thorough check, since troubleshooting can be quite frustrating, especially when the issue shifts unexpectedly. 😭
Before I hand it over, I’ll attempt a few more checks. During my last inspection inside the case, I saw the LED controller cables tightly bundled near the PSU, which might be affecting the 12VHPWR connection. I’m now worried about a possible defect or improper fit in the 12-pin connector between the PSU and GPU, though I’m not sure.