[Discuss] AMD FX-8120 update
Shirley Márquez Dúlcey
mark at buttery.org
Mon Mar 5 12:51:48 EST 2012
On 3/5/2012 10:02 AM, markw at mohawksoft.com wrote:
>>
>> http://www.richweb.com/cpu_info
>> If the number of cores = the number of siblings for a given physical
>> processor, then hyperthreading is OFF.
>>
>> I didn't think AMD did Hyperthreading...
>
> It doesn't.
>
>>
>> http://en.wikipedia.org/wiki/Bulldozer_%28microarchitecture%29
>> ...by eliminating some of the "redundant" elements that naturally
>> creep into multicore designs, AMD has hoped to take better advantage
>> of its hardware capabilities, while using less power.
>>
>> So does that mean it isn't just L2 cache or FPU that's being shared
>> among cores, but other more significant components of the CPU, which,
>> like Hyperthreading, are more likely to result in thread contention?
>>
>> Be interesting to see some sort of a VM benchmark compared between this
>> CPU and an Intel equivalent.
>
> I am by no means well versed as of yet, but my current understanding is
> that the chip has 8 full cores but they are organized as 4 pairs that
> share a math-coprocessor.
>
> I'm not entirely sure that the Linux kernel fully understands the chip yet.
This page has some info (including a useful diagram) that explains the
Bulldozer architecture:
http://www.tomshardware.com/reviews/fx-8150-zambezi-bulldozer-990fx,3043-3.html
As the diagram shows, four things are shared within each core pair:
fetch, decode, the FPU, and the L2 cache. Each core gets its own
integer processing unit and L1 data cache.
The shared fetch, decode, and FPU mean that there is some performance
compromise when both cores in a pair are being used, though it's much
less than the compromise of two Intel HyperThreads running concurrently.
Against that, the shared L2 cache could be an advantage if the cores are
running correlated tasks, like two threads of the same program. Finally,
if two or more core pairs are idle the processor can increase the clock
speed of the busy ones.
To get optimal performance out of Bulldozer, the OS process scheduler
will need to be aware of the architecture. For maximum performance you
generally want to spread tasks among the four core pairs, and only
double up in a pair if you have more than four things running. If you
start doubling up you probably want to pair threads of the same process
rather than different processes whenever possible.
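To make that placement rule concrete, here is a rough sketch in Python.
It is only an illustration of the idea, not how any real scheduler is
written: the function name place_for_performance, the (task_id,
process_id) tuples, and the default of four core pairs are all
assumptions made for the example.

```python
from collections import defaultdict


def place_for_performance(tasks, n_modules=4):
    """Spread tasks over core pairs, doubling up only when forced to.

    tasks is a list of (task_id, process_id) tuples (a hypothetical
    format for this sketch). Returns {core_pair_index: [task_id, ...]}.
    """
    if len(tasks) > 2 * n_modules:
        raise ValueError("more tasks than cores in this sketch")

    # A core pair must hold two tasks only once every pair is busy.
    pairs_needed = max(0, len(tasks) - n_modules)

    by_process = defaultdict(list)
    for task_id, process_id in tasks:
        by_process[process_id].append(task_id)

    pairs, leftovers = [], []
    # Prefer doubling up threads that belong to the same process.
    for threads in by_process.values():
        while len(threads) >= 2 and len(pairs) < pairs_needed:
            pairs.append([threads.pop(), threads.pop()])
        leftovers.extend(threads)

    # If that was not enough, pair up unrelated tasks as a last resort.
    while len(pairs) < pairs_needed:
        pairs.append([leftovers.pop(), leftovers.pop()])

    groups = pairs + [[task_id] for task_id in leftovers]
    return dict(enumerate(groups))
```

With five tasks, two of which are threads of the same process, this
puts those two threads on one core pair and gives each of the other
three tasks a pair to itself.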
Optimal power management would call for a different strategy, pairing up
tasks whenever possible and keeping as many of the core pairs as
possible idle. I don't know if any OS even has the hooks to change the
scheduling algorithm based on the choice of power management settings.
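The power-oriented strategy can be sketched the same way. Again this
is just an illustration under assumptions: place_for_power and the
(task_id, process_id) tuples are made-up names, and a real OS would
have to weigh many other factors.

```python
from collections import defaultdict


def place_for_power(tasks, n_modules=4):
    """Pack tasks two per core pair so as many pairs as possible stay idle.

    tasks is a list of (task_id, process_id) tuples (a hypothetical
    format for this sketch). Returns {core_pair_index: [task_id, ...]}.
    """
    if len(tasks) > 2 * n_modules:
        raise ValueError("more tasks than cores in this sketch")

    by_process = defaultdict(list)
    for task_id, process_id in tasks:
        by_process[process_id].append(task_id)

    groups, leftovers = [], []
    # Pair threads of the same process first, since they may share L2 data.
    for threads in by_process.values():
        while len(threads) >= 2:
            groups.append([threads.pop(), threads.pop()])
        leftovers.extend(threads)

    # Pack whatever remains two at a time; the last group may hold one task.
    for i in range(0, len(leftovers), 2):
        groups.append(leftovers[i:i + 2])

    return dict(enumerate(groups))
```

For three tasks this occupies two core pairs and leaves the other two
idle, where the performance-oriented approach would occupy three.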