Why do most CPUs only have 2 threads per core?

WhatsThePack
Member

215

01-27-2025, 04:30 AM

#11

How well something is optimized isn't always clear-cut, especially when it comes to instruction sets. x86 has a larger pool of instructions, some of which do more work each but take longer to execute. PowerPC and its successor, Power ISA, use fewer, simpler instructions that each execute more quickly. Whether doing more per instruction or running simpler instructions faster wins out depends on the specific task. Today's compilers usually get x86 and RISC architectures to perform comparably, so the instruction set makes little difference outside of special cases.

_BadoTommeh_
Member

50

01-27-2025, 12:27 PM

#12

Previously, these were coprocessor cards that connected to the system over PCI-E, whereas the newer ones function as standard CPUs and fit into the LGA3647 socket.

KiLLOfTheEnd
Junior Member

7

01-27-2025, 06:42 PM

#13

I think the answer becomes clear once you understand how SMT works. Contrary to popular belief, an SMT core doesn't behave like two separate cores. A core contains many execution units. For simplicity, picture a core with one "addition" unit and one "subtraction" unit. If both threads need the same unit at the same time, SMT doesn't help: they have to take turns, and the extra complexity and power draw can even make overall performance worse. If one thread needs addition while the other needs subtraction, though, an SMT core can run both at once. The gain shrinks whenever the threads can't keep all the execution units busy. Intel likely stopped at two threads per core because two are usually enough to keep the units well utilized. Going beyond that increases die size, raises heat output, and puts more pressure on the cache, which can hurt performance further. Intel probably picks whatever is cost-effective for the target workload rather than optimizing for every case, which is why Xeon Phi went with four-way SMT: it was built for highly parallel tasks, not everyday computing. That choice avoided unnecessary expense where the extra threads wouldn't pay off.
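
To make the add/subtract picture concrete, here is a rough toy model in Python. It is only a sketch under heavy assumptions: the function name `cycles_to_finish`, the one-unit-of-each-kind core, and the one-operation-per-cycle in-order issue rule are all made up for illustration and are not how any real Intel core schedules work.

```python
from collections import deque


def cycles_to_finish(threads, units=("add", "sub")):
    """Count cycles for a toy SMT core to drain every thread's work.

    Assumptions (illustrative only): one execution unit per name in
    `units`, each unit accepts one operation per cycle, and each thread
    can issue only the operation at the head of its queue.
    """
    for ops in threads:
        for op in ops:
            if op not in units:
                raise ValueError(f"no execution unit for {op!r}")

    queues = [deque(ops) for ops in threads]
    n = len(queues)
    cycles = 0
    while any(queues):
        free = set(units)  # every unit is available at the start of a cycle
        start = cycles % n  # rotate priority so no thread is starved
        for k in range(n):
            q = queues[(start + k) % n]
            if q and q[0] in free:
                free.remove(q.popleft())  # issue the op, unit is now busy
        cycles += 1
    return cycles


# Two threads that need different units overlap fully.
complementary = cycles_to_finish([["add"] * 4, ["sub"] * 4])
# Two threads that both need the adder have to take turns.
conflicting = cycles_to_finish([["add"] * 4, ["add"] * 4])
print(complementary, conflicting)
```

In this model the complementary pair finishes in the same number of cycles as a single thread would, while the conflicting pair takes as long as running the two threads back to back, which is the "no benefit from SMT" case described above.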

ShadoVNZL
Member

58

01-31-2025, 06:46 AM

#14

Threads aren't physical cores. They're the streams of work that get scheduled onto the cores. In some tasks more threads give a noticeable boost, but for most everyday applications more physical cores get more work done than fewer cores with many threads each. Scientific and research workloads that process huge numbers of independent data points at once, such as protein folding or modeling chemical effects on DNA, can run efficiently on CPUs like Intel's Xeon Phi 7230, with 64 cores and 256 threads (four per core). A desktop does better with more real cores if its workload can use them, or with faster cores for gaming. The best choice depends on the type of task at hand.

HenrikEV
Member

60

02-14-2025, 11:17 PM

#15

Power (POWER8 in particular) and SPARC are the main ISAs with SMT8 implementations (8 threads per core). Supporting more threads usually means adding more execution units to keep them fed, which wastes die space and resources for typical user workloads. @LAwLz shared a clear explanation of why this happens.

eyeballs100
Junior Member

10

02-15-2025, 01:34 AM

#16

It's worth mentioning that the POWER10 chips are likely built primarily to run IBM software. It wouldn't be surprising if IBM's own programs played a key role in shaping POWER10's design. Right now, IBM is concentrating heavily on storage and AI. Wouldn't you expect the main strengths of POWER10 to be an improved I/O architecture and better performance on AI tasks? That sounds more plausible than the idea that Windows is optimized for two threads per core.

PatTheTrooper
Junior Member

2

02-15-2025, 09:44 AM

#17

That approach mirrors how Cell operated too. Picture the PPE as the "core" and the SPEs as the "threads." Now imagine the capability you'd get by putting 8 or more of these "cores" on a single chip. IBM was genuinely ahead of its time. (Maybe I'm mistaken, though!)

sullycraft17
Junior Member

29

02-15-2025, 01:12 PM

#18

The SPEs in Cell function more like SIMD processors without proper branch handling, yet they can work on large amounts of data at once, which makes them essentially a weak GPU. Sony intended to use Cell as the primary graphics processor, but after running into its shortcomings, they turned to NVIDIA for a proper GPU. In the end, the SPEs did help somewhat, though the actual GPU itself wasn’t very strong.
