> I did some analysis a while back that might be useful to share here.
> 8.0 mm2 SKL core
> 0.9 mm2 AVX512
> 2.0 mm2 1MB L2$
> 2.4 mm2 1.375MB L3$
> 0.4 mm2 Snoop filter
> 2.0 mm2 caching and home agent
> 2.2 mm2 FIVR, PLLs
> 0.4 mm2 Mystery block
> 18.0 mm2 Total SKL-SP tile
> So AVX512 is about 5% of the tile area, and the tiles are 72% of the total area of SKL-SP.
> If you removed AVX512 you'd save 28mm2 for the whole chip, which would let you add at most 2 tiles.
> In that vein, it seems like a pretty reasonable trade-off.
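
For anyone who wants to redo the arithmetic in the quoted breakdown, here is a rough sketch. The per-block areas, the 18.0 mm2 tile total and the 72% figure are copied from the quote; the tile count of 28 is my assumption (the post does not state it), and the names are just for illustration. Note that the listed blocks add up to 18.3 rather than 18.0, and that 0.9 mm2 times 28 tiles gives about 25 mm2 rather than 28, so treat all of this as ballpark.

```python
# Per-tile area estimates in mm^2, copied from the quoted breakdown.
TILE_COMPONENTS_MM2 = {
    "SKL core": 8.0,
    "AVX512": 0.9,
    "1MB L2$": 2.0,
    "1.375MB L3$": 2.4,
    "Snoop filter": 0.4,
    "Caching and home agent": 2.0,
    "FIVR, PLLs": 2.2,
    "Mystery block": 0.4,
}
QUOTED_TILE_TOTAL_MM2 = 18.0  # total as stated in the quote
TILE_SHARE_OF_DIE = 0.72      # "tiles are 72% of the total area"
ASSUMED_TILE_COUNT = 28       # assumption: not stated in the quote


def avx512_tradeoff(tile_count=ASSUMED_TILE_COUNT):
    """Return the ballpark figures behind the 'AVX512 vs. more tiles' argument."""
    avx = TILE_COMPONENTS_MM2["AVX512"]
    saved = avx * tile_count               # die area freed by dropping AVX512
    slim_tile = QUOTED_TILE_TOTAL_MM2 - avx  # size of a tile without AVX512
    return {
        "component_sum_mm2": sum(TILE_COMPONENTS_MM2.values()),
        "avx512_share_of_tile": avx / QUOTED_TILE_TOTAL_MM2,
        "area_saved_mm2": saved,
        "extra_tiles_equivalent": saved / slim_tile,
        "implied_die_mm2": tile_count * QUOTED_TILE_TOTAL_MM2 / TILE_SHARE_OF_DIE,
    }


if __name__ == "__main__":
    for name, value in avx512_tradeoff().items():
        print(f"{name}: {value:.2f}")
```

With these inputs the freed area is worth somewhere between one and two AVX512-less tiles, which is in line with the "at most 2" in the quote.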

I won't call any processor design choice unreasonable, because we don't know what the constraints are. Even apparently stupid marketing segmentation is done with the very reasonable goal of increasing profit!

That said, how do you separate the core (presumably including AVX256) from AVX512? Is AVX512 part of the 8 mm2 core, or additional? Do you just roughly chop the vector units and registers in half?

Putting 512-bit data paths from the L1d to the vector units is at least one thing in the core which is not a simple bolt-on, and would have significant effects outside the vector units themselves. Zen 2 today, with AVX256 and 1/2 the L1 bandwidth, appears to get significantly better IPC per core on SPECfp than Skylake-SP. The Zen 1 core, on a similar or worse process and of similar vintage, seems to get comparable IPC (although lower performance due to MHz, of course).

SPECfp != all floating point workloads, and you can obviously find codes where AVX512 and presumably the doubled L1 bandwidth do help. Still, I suspect many would prefer a few more cores.
