Engineering Roundtable I: Newisys’ HORUS Chipset

HORUS Configuration

Page 22 of the Hot Chips HORUS slides states, “[U]sing HORUS and IB cables…” – what are these IB cables? InfiniBand?

Rajesh Kota:
Yes. HORUS uses SerDes physicals that are IB compliant and we also use the IB cables for the interconnection between boxes. But we don’t run IB protocols; we have our own version of extended coherent HT protocol that we run on top of it.

David Wang:
Regarding my “belt and suspenders” comment earlier, I was thinking that since HORUS supports cache coherency of local nodes through it, perhaps you could have a low-cost HORUS that glues together lower-cost 2xx series Opterons and enables cheap, larger-scale MP boxes that are still “cache coherent”, although the cost of cache coherency would be proportionally greater. Is this correct?

Rajesh Kota:
See the right side of page 5 of my Hot Chips presentation. It is possible to configure HORUS to create a larger SMP system using 2xx series Opterons. The blade configuration shown there is an example of such an implementation.

David Wang:
Would it be difficult to extend the design to larger scale ccMP boxes? 128/256P?

Rajesh Kota:
Yes. The difficulty lies mainly in feasibility and implementation: keeping the latency down and providing sufficient bandwidth. It would require more links, faster links, and a bigger RDC and DIR, because as you add more quads the directory holds less data (relatively speaking).
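
The scaling point about the directory can be illustrated with a rough back-of-the-envelope sketch. It assumes a deliberately simplified model that is not taken from the HORUS design: each remote quad can cache up to a fixed amount of another quad's memory (64MB is used here, matching the RDC size mentioned later in this discussion), and the home quad's directory has a fixed number of entries, one per cache line. The entry count, the 64-byte line size, and the function name `directory_coverage` are all hypothetical.

```python
MB = 1024 * 1024


def directory_coverage(num_quads, dir_entries, line_bytes, cache_bytes_per_quad):
    """Fraction of remotely cacheable data that a fixed-size directory can track.

    Simplified, hypothetical model: every remote quad may hold up to
    cache_bytes_per_quad of this quad's memory, and each directory entry
    tracks exactly one cache line.
    """
    if num_quads < 2:
        raise ValueError("need at least two quads for remote caching")
    remote_cache_bytes = (num_quads - 1) * cache_bytes_per_quad
    tracked_bytes = dir_entries * line_bytes
    return min(1.0, tracked_bytes / remote_cache_bytes)


if __name__ == "__main__":
    # Hypothetical directory: 1M entries of 64-byte lines (64MB tracked).
    for quads in (2, 4, 8, 16, 32):
        cov = directory_coverage(quads, dir_entries=2**20, line_bytes=64,
                                 cache_bytes_per_quad=64 * MB)
        print(f"{quads:2d} quads: directory covers {cov:.1%} of remote cache capacity")
```

Under these assumptions the coverage falls roughly as 1/(N-1) with the number of quads N, which is one way to read the remark that the directory holds relatively less data as quads are added.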

David Kanter:
Do you think it will be eventually feasible to put the RDC on-die with HORUS? Any guesses as to what process node this might be at?

Rajesh Kota:
You cannot put 64MB of data on a 130nm process. I don’t think anybody supports 1T memory cells for TSMC’s 130nm logic process, except possibly MoSys. I believe IBM has 1T memory cells, but I don’t know how much you can stick on die. For us, simply getting the tags on-die is a big challenge. It takes up the majority of our die space. Unfortunately, I can’t comment on future process nodes. But I believe Intel’s new version of Itanium supports ~27MB of cache on 90nm.

David Wang:
General comment: Montecito is up to 24MB L3 at the 90nm node… Maybe at the 45nm node putting the RDC on-die would be feasible? But keeping it off-die would probably be “better.” That is, hitting an on-die RDC may be 50ns, and an off-die RDC might be 70ns. Paying the die cost for lowering the latency in that way may not be economical.
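
To put the 50ns versus 70ns comparison in context, here is a minimal sketch of the average latency of a remote access as a weighted mix of RDC hits and misses. The 50ns and 70ns hit latencies are the illustrative figures from the comment above; the 200ns miss latency, the hit rates, and the function name `avg_remote_latency_ns` are assumptions made purely for illustration, not measured HORUS numbers.

```python
def avg_remote_latency_ns(hit_rate, rdc_hit_ns, miss_ns):
    """Average remote-access latency: hits served by the RDC, misses go to the remote quad."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * rdc_hit_ns + (1.0 - hit_rate) * miss_ns


if __name__ == "__main__":
    MISS_NS = 200.0  # hypothetical latency of an RDC miss
    for hit_rate in (0.25, 0.5, 0.75):
        on_die = avg_remote_latency_ns(hit_rate, 50.0, MISS_NS)
        off_die = avg_remote_latency_ns(hit_rate, 70.0, MISS_NS)
        print(f"hit rate {hit_rate:.2f}: on-die {on_die:.1f}ns, "
              f"off-die {off_die:.1f}ns, saving {off_die - on_die:.1f}ns")
```

In this model the saving from an on-die RDC is simply the hit rate times the 20ns difference, so it never exceeds 20ns per access; whether that justifies the extra die area is the economic question raised above.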

Does Newisys have a plan to market the HORUS chip itself?

Rajesh Kota:
I think right now our business approach has been to sell complete system solutions to OEMs. I don’t know what approach marketing will take with the HORUS chip.
