By: Ivan Godard (ivan.delete@this.millcomputing.com), April 20, 2015 5:27 pm
Room: Moderated Discussions
Ronald Maas (rmaas.delete@this.wiwo.nl) on April 19, 2015 9:53 am wrote:
>
> I agree with Ivan Godard's observation that even the most advanced traditional processor cores spend
> only a fraction of their transistors and energy on actually useful work: calculations, data moves, etc.
>
> But I think with a different approach he would have a far better chance of success:
>
> 1) As you mentioned in your post, there is no compiler for the Mill. A while ago I asked him about it in this
> forum, and he answered that his team lacked the bandwidth to spend much effort building a compiler (or adapting GCC/LLVM).
> If he were to build a software model of the Mill and at the same time build the compiler and profiling tools needed,
> he would be able to test effectively which ideas work and which don't. There is a huge amount of existing
> open source code available that can be used as input to improve the design where needed.
You seem to have mistaken limited resources for lack of interest :-) We are putting all we can into the tool chain. The software model of the Mill has been running in sim for six years now, although the Mill of six years ago is not the Mill of today.
However, I differ with your belief that profiling tools and large test runs are essential to architecture design. Yes, they are important for tuning - say, for any gain less than a factor of two. However, gains larger than that are visible by inspection and insight. Of course, we have no idea whether the gain of a given feature will be 3.17X or 3.82X - but we can see that it will be somewhere between 3X and 4X. It's the same slide-rule, back-of-the-envelope, order-of-magnitude guesstimation that they don't teach engineering students today :-)
> 2) I believe having live profiling data is essential to better extract parallelism in existing code.
> So maybe a two-pronged approach is needed: a first-level compiler that translates the high-level language
> to some intermediate machine representation. Then (like nVidia Denver) a second-level compiler running
> on the processor itself dynamically translates the intermediate representation to the native code
> that is actually executed by the hardware. There may be many other ideas worth pursuing, but without an
> iterative approach, everything Mill related will always be a shot in the dark.
That's roughly the way our existing tool chain works. However, profiling info is largely unnecessary: sometimes because of the way the machine works it simply doesn't need it (for example, in detecting address aliasing), and sometimes because the machine does the dynamic profiling in hardware (as in the Mill's self-tuning control-flow predictor).
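To make that two-level split concrete, here is a minimal C++ sketch of a target-independent IR being specialized for two hypothetical family members. Everything in it - the IrOp and TargetDesc names, the opcode numbers, the bundle packing - is invented for illustration; it is not the Mill tool chain's actual representation, nor Denver's. It only shows the shape of the idea: ship a generic IR, and let a small per-model translator do the last step for whatever target it finds itself on.

    // Hypothetical sketch only: names, opcodes, and packing are made up.
    #include <cstdio>
    #include <map>
    #include <string>
    #include <vector>

    // Target-independent IR produced by the first-level compiler.
    struct IrOp {
        std::string name;   // e.g. "load", "add", "store"
        int operand;        // immediate or slot number, if any
    };

    // Per-model description that the second-level translator knows.
    struct TargetDesc {
        std::map<std::string, int> opcodes;  // IR op name -> native opcode
        int issueWidth;                      // ops packed per bundle
    };

    // Second-level step: turn generic IR into "native" bundles for one member.
    std::vector<std::vector<int>> specialize(const std::vector<IrOp>& ir,
                                             const TargetDesc& target) {
        std::vector<std::vector<int>> bundles(1);
        for (const IrOp& op : ir) {
            if ((int)bundles.back().size() == target.issueWidth)
                bundles.emplace_back();                 // start a new bundle
            bundles.back().push_back(target.opcodes.at(op.name));
        }
        return bundles;
    }

    int main() {
        // IR shipped with the program; the hardware never executes it directly.
        std::vector<IrOp> ir = {{"load", 0}, {"load", 1}, {"add", 0}, {"store", 2}};

        // Two invented family members with different widths and encodings.
        TargetDesc narrow{{{"load", 1}, {"add", 2}, {"store", 3}}, 2};
        TargetDesc wide{{{"load", 9}, {"add", 7}, {"store", 5}}, 4};

        for (const TargetDesc* t : {&narrow, &wide}) {
            std::printf("bundles:");
            for (const auto& bundle : specialize(ir, *t)) {
                std::printf(" [");
                for (int opcode : bundle) std::printf(" %d", opcode);
                std::printf(" ]");
            }
            std::printf("\n");
        }
        return 0;
    }

The point of the sketch is only that the final translation can be as cheap as a table lookup plus packing, done once per family member, while all the expensive analysis stays in the first-level compiler.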
> 3) Ivan Godard would be much better off putting all his work and ideas in the public domain, trying to establish a
> community that can help him achieve his goals without the obligation to pay any salaries. For example,
> RISC-V started about 3 years ago and they already have most of the basic building blocks in place. We are not living
> in the 1970s anymore, where a small team could successfully launch a processor like the 6502. You really need 1000s of
> people and big pockets to be able to successfully launch a new ISA and be able to make some
UC Berkeley pays Prof. Patterson's salary :-)
As for my goals: putting it all in the public domain might get me glory or tenure, but I'm not an academic and I already have all the glory I'll ever want. The Mill is a commercial product. Yes, hundreds (not thousands) of people, and deep pockets ($120M estimated), are needed. The early days of design are best done with a small team - but why assume we intend to stay that way?
> 4) Ivan Godard mentioned he wants to target a broad range of performance levels, all the way from low-end
> embedded to the high end. There is no way anyone can compete successfully on the low end with companies
> like Allwinner and MediaTek, which are able to make a healthy profit selling quad-core 64-bit SoCs for 5 dollars
> a piece. Better to concentrate on the high end, where achieving high IPC is going to be appreciated.
The classic entry point for a disruptor (which we are) is the low end of the market.