By: aaron spink (aaronspink.delete@this.notearthlink.net), July 30, 2012 3:03 am
Room: Moderated Discussions
none (none.delete@this.none.com) on July 29, 2012 9:05 am wrote:
> It depends on
> what you call "simple". daxpy requires 2 LD / 1 ST for 1 FMA. So one C2050
> being 515 G FMA/s according to Wikipedia, I'd say it's memory bandwidth limited
> on daxpy.
>
That would be the most naive Linpack implementation ever. With proper data structures you are at much less than 1 byte/flop.
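
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes that "proper data structures" means a blocked (tiled) matrix multiply with square b x b tiles held in fast memory, that each tile is moved once per tile update, and that an FMA counts as 2 flops. The function names and tile sizes are made up for illustration and are not from any particular Linpack code.

```python
DOUBLE = 8  # bytes per double-precision value


def daxpy_bytes_per_flop():
    """Memory traffic per flop for y[i] = a * x[i] + y[i]."""
    # Per element: load x[i], load y[i], store y[i] -> 3 doubles moved.
    bytes_moved = 3 * DOUBLE
    flops = 2  # one FMA counted as a multiply plus an add
    return bytes_moved / flops


def blocked_gemm_bytes_per_flop(b):
    """Memory traffic per flop for one tile update C += A @ B, tiles b x b.

    Assumes the three tiles fit in fast memory, so each tile is loaded
    once and the C tile is stored once: 4 * b * b doubles of traffic.
    """
    bytes_moved = 4 * b * b * DOUBLE
    flops = 2 * b ** 3  # b^3 multiply-adds
    return bytes_moved / flops


if __name__ == "__main__":
    print("daxpy:", daxpy_bytes_per_flop(), "B/flop")
    for b in (16, 64, 256):  # hypothetical tile sizes
        print("blocked, b =", b, ":", blocked_gemm_bytes_per_flop(b), "B/flop")
```

Under these assumptions the blocked ratio works out to 16/b bytes per flop, so it falls below 1 byte/flop for any tile larger than 16 x 16, while daxpy stays fixed at 12 bytes/flop no matter how large the vectors are.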



