Issues with 14900ks in Intel configurations.
Well, I wish I had a 14900ks so I could try it out for you. But the stability problem still sounds like a mix of bad luck with silicon, mainboard issues, and RAM that can't keep up. You didn't mention the mainboard or RAM you're planning to use. Most people probably don't pin dedicated tasks to specific threads, or they rely on professional CPUs like Threadripper Pro or Xeon for those kinds of jobs. The rest are mainly just for gaming.
@Robchil it could be bad luck, but that's two bad 14900ks in a row. I think this does affect gamers, though. When you don't set affinity to force two hyper-threads onto one p-core, the scheduler decides which cores to use. As tasks in the game start up and finish, different sets of cores get loaded. If the main game control is single-threaded, and separate threads in the driver then get started for shader compilation, there is effectively some random chance that two threads end up working on the same p-core whilst the rest of the chip is mostly idle. This will manifest itself as random crashes during shader compilation, maybe one or two every few hours, not all the time, and difficult to detect and diagnose. By setting the thread affinity we are just making the problem more visible: it's not that the problem only affects certain uses, it's that the problem occurs more often.

Setup is:
ASUS PRO W680-ACE IPMI
2x32GB ECC DDR5 5600 (Kingston server RAM)
2x WD850SN 4TB SSD
1x Samsung 980 Pro 1TB
Nvidia RTX A6000 (ampere)
Seasonic Prime Titanium 1000W

I went with Intel because my previous setup with an AMD 5950X had problems sleeping: it would hard crash on sleep, and the PSU had to be switched off for 20 seconds before it would boot up again. So every time I walked away from my desk there was about a 50% chance of it not resuming from suspend to RAM. It's something that seemed to affect multiple AMD chipsets when using Linux, and it undermined my confidence in them. I may have worked out a fix for this now (disable async sleep mode, but this was tested on different AMD hardware after it had a very similar issue).

The 13900ks I had seemed okay. I do remember getting the "out of video ram" error, which on a 48GB card seems unlikely, but I wrote it off as a software error and I haven't re-tested yet. The real problems all started when I upgraded to a 14900ks. So before I send yet another 14900ks back to Intel, I was hoping to see if other people had the same problem.
Given the reports of games like Tekken 8 having problems, I think this is actually the same problem I have identified, in which case the common fix of imposing Intel power limits will reduce the occurrence of the crashes, but not totally eliminate them. I am starting to suspect that if it affected two 14900ks and my previous 13900ks, which are supposed to be the "best" silicon, then it's probably an issue that affects all 13th and 14th gen.
@Robchil I wanted the fastest single-thread performance, plus good multi-thread load. I've never had a problem like this before with Intel, and I've always bought desktop CPUs, back to 486s when there were no Xeons... I don't think there is such a thing as an "amateur" CPU, unless it's one I design myself; there are only desktop CPUs and server CPUs. I don't agree that it's okay for desktop CPUs to generate incorrect data or fail.
To evaluate your 13th or 14th generation CPU, run a stress test using Windows Subsystem for Linux and a Gentoo image. This process will individually test each core, taking roughly an hour on a 14900k processor. On my setup, faulty cores become apparent quickly.

1. Download the specified Gentoo image from the provided link.
2. Launch PowerShell with admin rights, and modify paths to match your Windows account.
3. Adjust WSL settings and install the package manager.
4. Open Task Manager, select the "vmmemWSL" section, disable "All Processors," and test two CPUs at a time (e.g., 0 & 1).
5. Inside the WSL shell, execute the compilation command with three threads.
6. The test will either fail due to a core issue or complete successfully if the problem isn't CPU-related. Experiment with different CPU-core combinations to identify problematic cores.
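As a rough aid for working through the CPU pairs in Task Manager, here is a minimal Python sketch that lists which two logical CPUs to tick for each p-core. It assumes 8 p-cores with hyper-threading, numbered first and with sibling threads adjacent (p-core 1 is CPUs 0 & 1, and so on); that numbering is an assumption, so check it against your own system. The names `sibling_pairs` and `test_plan` are made up for illustration.

```python
# Sketch: list the logical-CPU pair to select for each p-core test run.
# Assumption: 8 p-cores with hyper-threading, enumerated before the e-cores,
# with the two hyper-threads of a p-core on adjacent CPU numbers.

P_CORES = 8  # assumed p-core count


def sibling_pairs(p_cores=P_CORES):
    """Return (p_core, first_cpu, second_cpu) tuples; p-cores are 1-based."""
    return [(n, 2 * (n - 1), 2 * (n - 1) + 1) for n in range(1, p_cores + 1)]


def test_plan(p_cores=P_CORES):
    """Return one human-readable line per test run."""
    return [
        f"p-core {n}: select CPU {a} and CPU {b}"
        for n, a, b in sibling_pairs(p_cores)
    ]


if __name__ == "__main__":
    for line in test_plan():
        print(line)
```

Each printed line corresponds to one run of the compile test with only those two CPUs enabled for "vmmemWSL".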
Previous findings indicate my 14900ks remains stable on single p-cores with auto motherboard configurations, unless ASUS enhancements are turned off. Restricting the p-core ratio limit to x59 for 1 to 8 active cores works well. Setting the limit to x60 causes GCC compilation errors on CPUs 8 and 9, and on 10 and 11, which should be able to handle the boost regardless. This implies the extra performance jump to x62 should be activated only when just one hyper-thread is active.
Besides checking each p-core individually, I'm running a comprehensive test for all cores by rebuilding the entire package set in Gentoo. This involves varying workloads (low, single core, multiple cores, and full core usage) to simulate real conditions. I adjust settings such as affinity to p-cores and thread counts using MAKEOPTS and taskset. Running this under WSL on Windows helps too. With full-core loads, I see the power limits being hit (PL1 and PL2 at 253W each, ICCMAX at 307A) but still experience occasional crashes. To stabilize the build, I've lowered the CPU temperature cap to 90°C and adjusted power limits. I'm considering raising the temperature further but found stability issues at 100°C. For now, my optimal configuration remains mostly ASUS defaults with specific tweaks: disable enhancements, set p-core ratio limits, and define temperature offsets. I'm planning to try two cores with hyper-threads enabled to see if power distribution improves.
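To make the affinity and thread-count settings concrete, here is a small Python sketch that builds a command line for a range of p-cores, in the same form as the taskset/emerge commands used in this thread. It assumes p-core n maps to logical CPUs 2(n-1) and 2(n-1)+1, and that the job count is the number of hyper-threads plus one; both are my assumptions, and `build_command` is a made-up helper name.

```python
# Sketch: build a "MAKEOPTS=... taskset -c ... emerge -1 <package>" command
# for a contiguous range of p-cores.
# Assumptions: p-core n (1-based) owns logical CPUs 2(n-1) and 2(n-1)+1,
# and the job count is "hyper-threads in the range + 1".


def cpu_range(first_p_core, last_p_core):
    """Return the taskset CPU list covering the given p-core range."""
    first_cpu = 2 * (first_p_core - 1)
    last_cpu = 2 * (last_p_core - 1) + 1
    return f"{first_cpu}-{last_cpu}"


def build_command(first_p_core, last_p_core, package="gcc"):
    """Return the shell command as a string; nothing is executed here."""
    threads = 2 * (last_p_core - first_p_core + 1)
    jobs = threads + 1
    cpus = cpu_range(first_p_core, last_p_core)
    return f'MAKEOPTS="-j{jobs}" taskset -c {cpus} emerge -1 {package}'


if __name__ == "__main__":
    print(build_command(1, 1))  # one p-core, both hyper-threads
    print(build_command(5, 8))  # four p-cores
```

The script only prints the command, so you can review it before pasting it into the WSL shell.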
Experienced another issue during core group testing. Using four p-cores, including the preferred ones, the compiler reported an error due to illegal instructions in the output. The problematic task was: MAKEOPTS="-j9" taskset -c 8-15 emerge -1 gcc. It appears power limits aren't resolving the problem when only a few cores are running with hyper-threading enabled. While individual two-thread setups handle up to x59, the situation worsens with multiple cores active. The temperature rise is likely the main factor, as each core runs hotter when surrounded by other active cores. Setting the temperature offset to 15 (85°C limit) seems effective, allowing the build to finish with p-cores 5-8. The arrangement looks like: [1 2] [3 4] [5 6] [7 8]. The hottest groups of four with minimal power usage would be: 1, 3*, 4 & 5; 2, 3, 4* & 6; 3, 5*, 6 & 7; 4, 5, 6* & 8. The cores marked with (*) are the expected hottest ones. Four p-cores capped at 85°C consume around 130W, so a 253W limit won't significantly restrict performance.
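The groups listed above can be reproduced with a short Python sketch. It assumes the arrangement is a grid of 4 rows by 2 columns (each bracket being one row), and that the expected hottest core is one with the most directly adjacent neighbours; that is my reading of the layout, not a measured thermal model. The function names are made up for illustration.

```python
# Sketch: find the p-cores with the most adjacent neighbours in a 4x2 grid.
# Assumption: the layout [1 2] [3 4] [5 6] [7 8] is four rows of two cores,
# and only cores directly above, below, or beside count as neighbours.

ROWS, COLS = 4, 2


def neighbours(core):
    """Return the sorted list of p-cores adjacent to `core` (1-based)."""
    row, col = divmod(core - 1, COLS)
    found = []
    for d_row, d_col in ((-1, 0), (0, -1), (0, 1), (1, 0)):
        r, c = row + d_row, col + d_col
        if 0 <= r < ROWS and 0 <= c < COLS:
            found.append(r * COLS + c + 1)
    return sorted(found)


def hottest_groups():
    """Map each core with the maximum neighbour count to its test group."""
    counts = {core: len(neighbours(core)) for core in range(1, ROWS * COLS + 1)}
    most = max(counts.values())
    return {
        core: sorted(neighbours(core) + [core])
        for core, count in counts.items()
        if count == most
    }


if __name__ == "__main__":
    for centre, group in hottest_groups().items():
        print(f"centre p-core {centre}: test group {group}")
```

The centre core of each group is the one marked with (*) in the list above.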