Unstable system failures occur while using dual RTX 4090 graphics cards during GPU tasks.

BosnaKingz
Member
166
03-30-2025, 02:49 AM
#1
My system shuts down unexpectedly in roughly 10-20% of attempts when starting certain GPU workloads. I first saw this while running ML model inference, and I can now reproduce it with FurMark. The shutdown is sudden: the system cuts off instantly and needs a full reset. It does not happen during games or streaming, only during ML workloads or FurMark tests. I can reproduce the problem with this command:

./furmark --demo furmark-vk --gpu-index 0 --width 3840 --height 2160 --benchmark --duration-ms 900000 --furmark-vram-test-gb 3.9 --log-gpu-data --gpu-monitor-print --hw-polling-interval 1000

The crash is intermittent: sometimes the run completes, other times the machine freezes completely. The GPUs do not heat up beforehand, and temperature readings stay normal when it happens.
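Because the failure is intermittent and ends in a hard reset, one way to put a number on the crash rate is to record each attempt on disk before it starts. The following is only a sketch under assumptions: the journal file name and the helper names (`append_durable`, `run_once`, `summarize`) are hypothetical, and the command is whatever gets passed in, for example the FurMark command line above.

```python
import os
import subprocess
import time


def append_durable(path, text):
    """Append one line and ask the OS to push it to disk before returning."""
    with open(path, "a") as f:
        f.write(text + "\n")
        f.flush()
        os.fsync(f.fileno())


def run_once(journal, run_id, command):
    """Write a START marker, run the command, then write an END marker."""
    append_durable(journal, f"START {run_id} {time.time():.0f}")
    returncode = subprocess.run(command).returncode
    append_durable(journal, f"END {run_id} {returncode}")
    return returncode


def summarize(journal):
    """Return (runs started, runs that never wrote an END marker)."""
    if not os.path.exists(journal):
        return 0, 0
    started, ended = set(), set()
    with open(journal) as f:
        for line in f:
            parts = line.split()
            if len(parts) >= 2 and parts[0] == "START":
                started.add(parts[1])
            elif len(parts) >= 2 and parts[0] == "END":
                ended.add(parts[1])
    return len(started), len(started - ended)


# Example call (file name is hypothetical; the command is shortened here):
# run_once("furmark_runs.log", str(time.time_ns()),
#          ["./furmark", "--demo", "furmark-vk", "--gpu-index", "0"])
```

After a reboot, `summarize("furmark_runs.log")` gives the number of attempts and how many of them never finished, which should roughly correspond to the crashes. The same wrapper could be pointed at the ML inference job to compare its crash rate with FurMark's.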

System details:
- OS: Ubuntu 22.04 64-bit (Kernel 6.8.0-47-generic)
- Board: MPG X670E CARBON WIFI (MS-7D70)
- BIOS: 1.80 (Aug 10, 2023)
- GPUs: 2x NVIDIA RTX 4090
- RAM: G.Skill Flare X5 96 GB (2×48 GB DDR5-5600 CL40)
- CPU: AMD Ryzen 9 7950X 4.5 GHz 16-core
- Storage: Samsung 990 PRO 2TB NVMe
- Power: SeaSonic PRIME PX 1600W 80+ Platinum
- Drivers: NVIDIA 555.42.06, CUDA 12.5
- Build: https://pcpartpicker.com/list/NK8qY9

Troubleshooting so far:
- Hardware checks: wall voltage stayed steady, memtest86+ passed all tests, and the CPU, RAM, and NVMe stress tests succeeded.
- Power limiting: capping the GPUs at 150 W does not stop the crashes.
- BIOS changes: disabled Resizable BAR and set the PCIe and chipset link generation to Gen4, with no improvement.
- Testing in a different PCIe slot: same crash pattern.
- System logs: no entries captured because crashes happen too fast.
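Since nothing reaches the system logs, one option is to record GPU telemetry to a file and force every sample to disk, so the last readings before the cut-off have a better chance of surviving. This is a rough sketch, not a definitive tool: it assumes `nvidia-smi` is on the PATH, and the function names, the output file name, and the 0.2 s interval are arbitrary illustrative choices.

```python
import os
import subprocess
import time

# Standard nvidia-smi query fields; adjust to whatever is of interest.
QUERY_FIELDS = "timestamp,index,power.draw,temperature.gpu,clocks.sm,pstate"


def read_gpu_sample():
    """Return one CSV line per GPU, as reported by nvidia-smi."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip().splitlines()


def log_samples(path, sampler=read_gpu_sample, interval_s=0.2, max_samples=None):
    """Append samples to path, syncing each batch to disk before sleeping.

    Runs until interrupted when max_samples is None.
    Returns the number of sampling rounds written.
    """
    written = 0
    with open(path, "a") as log:
        while max_samples is None or written < max_samples:
            for line in sampler():
                log.write(line + "\n")
            log.flush()
            os.fsync(log.fileno())  # ask the OS to flush its cache to the drive
            written += 1
            if max_samples is None or written < max_samples:
                time.sleep(interval_s)
    return written


# Example call (hypothetical file name), started before the stress test:
# log_samples("gpu_trace.csv")
```

Running this in a second terminal alongside the stress test leaves a trace whose final lines show power draw, clocks, and performance state shortly before the shutdown. Each `nvidia-smi` call takes some time, so the effective sampling rate will be lower than the interval suggests, and very short transients may still be missed.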

What might help? I've monitored for voltage spikes during crashes, run multiple stress tests, and verified GPU health. Isolating the workload or adjusting power limits further might reveal more clues. Any ideas on next steps?

CrazyPimGames
Junior Member
7
03-30-2025, 02:49 AM
#2
How many RAM sticks are you using, and at what speed? Which processor do you have?

banshee45
Senior Member
726
03-30-2025, 02:49 AM
#3
Thanks for the reminder, and sorry, I forgot that. I edited the post. This is the whole build: https://pcpartpicker.com/list/NK8qY9