By: Patrick Chase (patrickjchase.delete@this.gmail.com), October 1, 2015 11:39 pm
Room: Moderated Discussions
I.S.T. (objectorbit.delete@this.yahoo.com) on October 1, 2015 6:10 pm wrote:
> Patrick Chase (patrickjchase.delete@this.gmail.com) on October 1, 2015 5:38 pm wrote:
> > In a previous job I implemented workloads that were about as "compute intensive" as they
> > come on a 4-wide machine that could do 1 L/S per clock. It has 64 architectural regs
> > (renaming doesn't help you here), which is about as good as it gets in terms of
> > avoiding excess memory ops due to spills/fills. Even so almost every nontrivial
> > workload ended up constrained by L/S bandwidth, and I ended up spending a lot of time
> > implementing in-register blocking schemes etc.
> >
> > Gabriele has similar horror stories from working with the same architecture.
>
> What arch was this?
ST2xx. It's a VLIW, which is why I used the term "architecture" when describing things that are normally considered to be part of the uarch.
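For anyone unfamiliar with the "in-register blocking" mentioned in the quoted post, here is a rough model of the idea. It is not ST2xx code: it is a Python sketch that just counts loads, and the function names and the 2x2 tile size are my own illustrative choices. Stores and loop overhead are ignored. The point is that a naive matrix multiply needs 2 loads per multiply-accumulate, so with a single L/S slot per clock it can sustain at most 0.5 MAC per clock no matter how wide the machine is. Holding a 2x2 tile of accumulators in registers reuses each loaded value twice and halves that ratio.

```python
def matmul_naive(A, B):
    """Plain triple loop: 2 loads for every multiply-accumulate."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    loads = 0
    for i in range(n):
        for j in range(p):
            acc = 0
            for k in range(m):
                a = A[i][k]          # load
                b = B[k][j]          # load
                loads += 2
                acc += a * b         # 1 MAC
            C[i][j] = acc
    return C, loads


def matmul_blocked_2x2(A, B):
    """2x2 register tile: 4 loads feed 4 multiply-accumulates.

    Assumes the row count of A and the column count of B are even.
    """
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    loads = 0
    for i in range(0, n, 2):
        for j in range(0, p, 2):
            # Four accumulators that would live in registers.
            c00 = c01 = c10 = c11 = 0
            for k in range(m):
                a0 = A[i][k]         # load
                a1 = A[i + 1][k]     # load
                b0 = B[k][j]         # load
                b1 = B[k][j + 1]     # load
                loads += 4
                c00 += a0 * b0       # 4 MACs, each operand used twice
                c01 += a0 * b1
                c10 += a1 * b0
                c11 += a1 * b1
            C[i][j], C[i][j + 1] = c00, c01
            C[i + 1][j], C[i + 1][j + 1] = c10, c11
    return C, loads
```

Larger tiles push the loads-per-MAC ratio down further, and the number of architectural registers is what caps the tile size, which is why having 64 of them matters here.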