By: Jouni Osmala (josmala.delete@this.cc.hut.fi), January 13, 2015 4:38 am
Room: Moderated Discussions
coppice (coppice.delete@this.dis.org) on January 12, 2015 10:01 pm wrote:
> Jouni Osmala (josmala.delete@this.cc.hut.fi) on January 11, 2015 4:50 am wrote:
> > The Book definition of Many-core is several tens to
> > hundreds of cores. And xeon-phi fits the description of several tens of cores. The core doesn't
> > really need to be designed specifically for many-core to fit definition of many-core.
> >
> > While people often seem to think many-core and OoO are mutually exclusive, I consider that
> > less aggressive OoO with reasonable ISA maybe optimal for many-core. Each core needs reasonable
> > amount of cache to not overload the communication network for data traffic.
> > We both know if some-one would make a 100 core processor today for server you
> > can fit OoO logic to power budget, as long as it isn't too aggressive.
>
> Book with a capital B? You mean the Bible defines manycore?
No. It just means that English lessons in Finland, where it is taught as the first foreign language, didn't really cover that kind of detail.
http://books.google.fi/books?id=pSxa_anfiG0C&pg=PA3&dq=several+tens&redir_esc=y#v=onepage&q=several%20tens&f=false
> There is no well accepted definition of manycore. Personally, I think its really odd to call
> xeon-phi many core. If its core were designed to put many on a chip it would have been much
> simpler. For better or worse it was designed very much as a middle ground, distinct from the
> big fat multicore world, and the world of putting as many simple cores on a chip as possible.
> Its arithmetic is wider than that in Intel's multicore devices. That is definitely not a strategy
> for getting large numbers of cores on a die. Its taking things in another direction.
Wide vectors are good for throughput, and it is a throughput-oriented design. There is a point beyond which simpler cores just mean too many caches, and total domination of the die area by communication and local storage. It's a balancing act between single-threaded performance and the number of cores: while the current latency-oriented designs from Intel have gone too far from the optimum in one direction, going too far in the other direction will also give diminishing returns. Demanding that anything called many-core eliminate every performance feature would be absurd, since communication would then simply dominate, and nothing called many-core would exist, because that image is nothing other than a pure straw man. Let's consider OoO logic as an example of a feature people assume many-core shouldn't have. If a modern desktop OoO scheduler can have a thousand comparators working, you can still build an OoO scheduler with tens of comparators. Both issue instructions out of order, and the smaller one gets some of the benefits at a cost more than an order of magnitude lower.
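To put rough numbers on the comparator argument, here is a minimal sketch. It assumes a simplified CAM-style wakeup array where every source operand in every scheduler entry compares its tag against every result bus; that model, the function name and all the parameter values are my own illustrative assumptions, not figures for any real core.

```python
def wakeup_comparators(window_entries, sources_per_entry, result_buses):
    """Tag comparators in a simplified CAM-style wakeup array.

    Assumed model: each source operand of each scheduler entry is
    compared against the destination tag on each result bus.
    """
    return window_entries * sources_per_entry * result_buses


# Hypothetical "desktop-class" scheduler: large window, wide result broadcast.
big = wakeup_comparators(window_entries=64, sources_per_entry=2, result_buses=8)

# Hypothetical "modest OoO" scheduler for a many-core design.
small = wakeup_comparators(window_entries=8, sources_per_entry=2, result_buses=2)

print(big, small, big // small)
```

With these made-up parameters the big scheduler needs about a thousand comparators and the small one a few tens, so the comparator count differs by more than an order of magnitude while both can still issue out of order.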