Shared FPU wasn't BD's problem

By: Anon, August 31, 2021 12:28 am
Room: Moderated Discussions
Chester on August 30, 2021 1:03 pm wrote:
> In BD's case, that's probably to hit high clock speeds on a pretty bad node. Integer SIMD
> ops are 1c latency on newer AMD CPUs, but probably 2c in Bulldozer because the units are
> half width. Piledriver could do a couple FPU ops (extrq, insertq) with 1c latency.

They were 2 cycles because of the FPU design. From Athlon through Bulldozer, all AMD FPUs (by FPU I mean the integer SIMD units) had a minimum 2-cycle latency.

> I think Bulldozer's biggest problems were:
> - The 16 KB L1D was too small and write-through
> - Slow L2 has to handle a lot of L1D misses
> - The branch predictor was better than K10's, but not quite as good as Intel's at the time
> - Each module half (thread) just wasn't as beefy as a whole Intel core, which could
> bring a lot more OOO resources into play when one SMT thread is in halt.
> - FP execution units were 128 bits wide (256-bit AVX ops decoded into two micro-ops),
> putting it at a disadvantage vs Sandy Bridge's 256-bit wide units
> Then to wrap it up, every single bit of ST performance matters for the desktop
> market. Sharing the FPU is pretty far down on the list of BD's problems, IMO.

Here we go with this discussion again. BD was a bad CPU, and I think everybody agrees on that. The problem starts when people try to find the "why" and then point at everything that was different in BD as a "bad choice". But BD wasn't bad in every aspect: its per-module multi-threaded performance was on par with an Intel HT core. It was the single-threaded performance (which was very important, especially in the consumer market) that was terrible. So please, guys, stop blaming the shared resources (write-through L2, FPU, decoder, L1I). They were fine at delivering about the same performance as each Intel HT thread. The problem was the non-shared resources, which were way too limited for good single-thread performance.