Article: Parallelism at HotPar 2010
By: Michael S (already5chosen.delete@this.yahoo.com), August 8, 2010 5:14 am
Room: Moderated Discussions
Michael S (already5chosen@yahoo.com) on 8/8/10 wrote:
---------------------------
>Carlie Coats (coats@baronams.com) on 8/8/10 wrote:
>---------------------------
>>
>>Or are you saying that it's the network-cards themselves?
>>
>
>Exactly
>
Another possible explanation is an MPI communication stack based on programmed I/O (in the transmit direction only; receive is better left to DMA) plus status polling, rather than on DMA in both directions plus interrupts.
On Nehalem, thanks to SMT, wisely scheduled PIO/polling could be rather cheap. On non-SMT processors like C2D/C2Q or K8/K10, PIO/polling is always expensive.
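
To make the polling-versus-interrupt distinction concrete, here is a rough Python sketch. It is only a simulation under stated assumptions: `FakeNic`, `deliver`, `receive_by_polling` and `receive_by_interrupt` are hypothetical names invented for illustration, not part of any real MPI stack or driver, and a `threading.Event` merely stands in for an interrupt. The point is that the polling receiver burns loop iterations on a hardware thread for the whole wait, while the interrupt-style receiver blocks and leaves that thread free.

```python
import threading
import time


class FakeNic:
    """Hypothetical stand-in for a network card (not a real driver API)."""

    def __init__(self):
        self.status = 0                # plays the role of a polled status register
        self.irq = threading.Event()   # plays the role of an interrupt line
        self.rx_buffer = None          # plays the role of a DMA receive buffer

    def deliver(self, payload, delay_s):
        """Simulate the card completing a receive after delay_s seconds."""
        def run():
            time.sleep(delay_s)
            self.rx_buffer = payload   # data lands in memory first
            self.status = 1            # then the status word flips
            self.irq.set()             # and the "interrupt" fires
        worker = threading.Thread(target=run)
        worker.start()
        return worker


def receive_by_polling(nic):
    """Spin on the status word; returns the payload and the spin count."""
    spins = 0
    while nic.status == 0:
        spins += 1                     # every iteration is CPU time spent waiting
    return nic.rx_buffer, spins


def receive_by_interrupt(nic):
    """Block until the simulated interrupt fires; no CPU is burned meanwhile."""
    nic.irq.wait()
    return nic.rx_buffer


if __name__ == "__main__":
    nic = FakeNic()
    worker = nic.deliver(b"halo exchange", 0.05)
    data, spins = receive_by_polling(nic)
    worker.join()
    print("polling got", data, "after", spins, "spins")

    nic = FakeNic()
    worker = nic.deliver(b"halo exchange", 0.05)
    print("interrupt-style got", receive_by_interrupt(nic))
    worker.join()
```

The spin count is what a non-SMT core pays in full. With SMT, a sibling hardware thread can use most of the core's execution resources while the poller spins, which is why careful scheduling could make polling cheap there.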