By: Maynard Handley (name99.delete@this.name99.org), October 21, 2017 5:51 pm

Room: Moderated Discussions

Anon (no.delete@this.thanks.com) on October 21, 2017 4:20 pm wrote:

> Maynard Handley (name99.delete@this.name99.org) on October 20, 2017 1:34 am wrote:

> > I mentioned some weeks ago that Wolfram was going to ship a Mathematica Player for iPad,

> > and that it would be an interesting performance comparison against x86 of a "serious"

> > app. Well the Player has been released and I've spent a few hours playing with it.

> >

>

> Sorry, not clipping the rest for any other reason that readability, all quite interesting.

>

> However, I just wonder.

> What makes you think that you will ever measure anything other than 'how much does Wolfram feel

> like spending on different platforms'? I think your early measurements bear this out strongly.

>

> These systems are so high level that there is no 'porting over' between systems, as almost certainly they

> use 3rd party libraries for matrix work, etc and the quality, and even of those will vary hugely.

I think you have a drastically diminished idea of just how large and impressive Mathematica is, what it does, and how it does it, and the rest of what you say stems from that.

It is not correct to say that "These systems are so high level that ..."

What is high level is the code I write to run on Mathematica, but Mathematica itself exploits as much low-level knowledge of the machine as possible. The analogy is that my code is like Java, Mathematica is like the JVM. A JVM is not high level code; it is code that requires intimate knowledge of the CPU on which it runs.

Mathematica is vastly larger, and vastly more sophisticated, than something like Matlab or Octave. This is not the place to educate you in that, but remember a few points:

- it has been around (though obviously started off a lot smaller) since 1988

- it aspires to cover all of mathematics (though, again, obviously that can only be approached, not achieved)

(You can get a feel for the size here:

http://blog.wolfram.com/2016/08/08/today-we-launch-version-11/)

This means that your fundamental data structures are not something as limited as vectors and matrices. Even if you consider only that class of objects, you're dealing with arbitrary rank tensors, whose elements can each be anything from machine-precision doubles to integers to arbitrary precision floats/integers/rationals to variables to symbolic expressions. And these mixed objects can be appropriately added, contracted and suchlike.

This means in turn that there are no trivial single points in the system where you just slot in "high performance matrix multiplication" or whatever. Rather, what has happened is that, like I said, there are EXTREMELY generic algorithms (written by Wolfram) throughout the system and, over time, each of these is specialized in an ongoing fashion. Sometimes this specialization is CPU-generic (for example specializing a fully general matrix addition routine down to a routine for adding together two matrices of machine-precision doubles); other times it moves to the more CPU-specific (this would include, for example, bignum handling).
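As a rough illustration of that "generic first, specialize later" pattern (emphatically not Wolfram's actual code), here is a minimal Python sketch. The names `add_generic`, `add_packed_doubles`, `is_machine_doubles` and `add_matrices` are all hypothetical, and the "specialized" path is only a stand-in for where a tuned kernel would go; in pure Python it is not actually faster.

```python
from array import array
from fractions import Fraction


def add_generic(a, b):
    """Fully general element-wise addition: works for any elements that support +."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def is_machine_doubles(m):
    """True only if every element is a plain machine-precision float."""
    return all(type(x) is float for row in m for x in row)


def add_packed_doubles(a, b):
    """Specialized path (hypothetical): each row is packed as contiguous C doubles.

    A real system would hand these packed buffers to a tuned kernel here.
    """
    out = []
    for row_a, row_b in zip(a, b):
        pa, pb = array("d", row_a), array("d", row_b)
        out.append(list(array("d", (x + y for x, y in zip(pa, pb)))))
    return out


def add_matrices(a, b):
    """Dispatch: take the specialized path only when both inputs qualify."""
    if is_machine_doubles(a) and is_machine_doubles(b):
        return add_packed_doubles(a, b)
    return add_generic(a, b)


if __name__ == "__main__":
    # All machine doubles: goes through the specialized path.
    print(add_matrices([[1.0, 2.0]], [[0.5, 0.25]]))
    # Mixed rationals, integers and floats: falls back to the generic path.
    print(add_matrices([[Fraction(1, 3), 2]], [[Fraction(1, 6), 1.5]]))
```

The point of the sketch is that the specialization sits behind a type check inside the general routine, rather than being a single slot where an external library gets plugged in.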

So I don't think your starting point ("Wolfram is ignoring someone or other's high performance BLAS routines") is correct; that doesn't accurately model how Mathematica works as a codebase, or how Wolfram works as a company. There are other, more subtle things going on.

For example we know that some code has been specialized to use parallel routines on the iPad, but only a limited set.

One's natural expectation would be that there'd be a fairly high-level flag in the codebase that you could flip to have this kick in for all routines that have been so specialized, but we're clearly not seeing that. It seems unlikely that this is because all those routines are engaged in intricate RCU algorithms that only work on the precise x86 memory model. More likely, I'm guessing, is something like this: the flag was flipped, various things failed (someone's bug, maybe Wolfram's, maybe Apple's, maybe the compiler's?), so the flag was flipped back, and a few experimental routines had it turned on again to test that it works in those cases, with the attitude that "after we've shipped we'll figure out the generic case".
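To spell out the kind of arrangement being guessed at here, a minimal Python sketch follows. It is purely an assumption about structure, not anything known about Mathematica's codebase: `PARALLEL_ENABLED`, `PARALLEL_ALLOWLIST`, `use_parallel` and the routine names are all hypothetical.

```python
# Hypothetical global switch: turned off for the port after tests failed.
PARALLEL_ENABLED = False

# Hypothetical set of routines re-enabled one by one to check that they work.
PARALLEL_ALLOWLIST = {"matrix_multiply"}


def use_parallel(routine_name, has_parallel_variant):
    """Decide whether a routine should take its parallel code path."""
    if not has_parallel_variant:
        # Nothing to enable: the routine was never specialized for parallelism.
        return False
    # Either the global flag is on, or this routine was explicitly opted back in.
    return PARALLEL_ENABLED or routine_name in PARALLEL_ALLOWLIST


if __name__ == "__main__":
    print(use_parallel("matrix_multiply", True))  # opted back in
    print(use_parallel("fourier", True))          # specialized, but still off
```

Under this arrangement an outside observer would see exactly the pattern described: parallelism in a limited set of routines, and none in others that are known to be parallel on x86.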

Likewise there appears to be absolutely no use of vectorization, which again suggests something like "we tried it, tests failed, we're working with Xcode/LLVM to fix it, but for now it's switched off".

Wolfram appear to be taking this seriously

http://blog.wolfram.com/2017/10/04/notebooks-in-your-pocket-wolfram-player-for-ios-is-now-shipping/

and have also been frustrated at how long it has taken them. My guess is that the issue is very much the sorts of things I have described, rather than your sort of analysis.

> so you end up with one of two situations.

> Either Wolfram does, rather surprisingly, ignore existing high performance numerical

> libraries and roll their own, in which case huge and long effort will be required to

> maximise performance on each platform (and each cpu generation..), so you are just measuring

> their commitment to those platforms in the form of developer investment.

> Or Wolfram use the existing libraries, and you are measuring the relative quality of

> whichever they choose, and how well it maps to the requirements of Mathematica.

>

> this seems to me to fall into the same trap as the inclusion of encryption runs using acceleration

> (if present) in certain 'benchmarks', only even more so. It measures little of actual use.

>

> If your intention is to measure some form of relative cpu performance for anything

> other than Mathematica, I wonder what other applications you feel it would map to?

>

> Wouldnt it be much more sensible to test more controllable kernels if you were actually looking

> for some form of numerical benchmarking? However good luck with even that, it is more slippery

> than an eel, as every single application tends to have very different requirements.

>

> I suspect the most interesting part of this would be some insight

> as to what features Wolfram actually gets around to using..

Topic | Posted By | Date |
---|---|---|
Mathematica on iPad | Maynard Handley | 2017/10/20 01:34 AM |
Mathematica on iPad | dmcq | 2017/10/20 07:26 AM |
Mathematica on iPad | Maynard Handley | 2017/10/20 01:41 PM |
Mathematica on iPad | Maynard Handley | 2017/10/20 08:16 PM |
Does this give better formatting? | Maynard Handley | 2017/10/20 08:20 PM |
Does this give better formatting? | anon | 2017/10/20 09:37 PM |
Does this give better formatting? | Maynard Handley | 2017/10/20 10:29 PM |
Does this give better formatting? | anon | 2017/10/21 12:52 AM |
Does this give better formatting? | Maynard Handley | 2017/10/21 09:48 AM |
Does this give better formatting? | anon | 2017/10/21 10:01 AM |
Mathematica on iPad | Adrian | 2017/10/21 01:49 AM |
Sorry for the typo | Adrian | 2017/10/21 01:51 AM |
Mathematica on iPad | dmcq | 2017/10/21 07:03 AM |
Mathematica on iPad | Maynard Handley | 2017/10/21 09:58 AM |
Mathematica on iPad | Wilco | 2017/10/21 07:16 AM |
Mathematica on iPad | Doug S | 2017/10/21 09:02 AM |
Mathematica on iPad | Megol | 2017/10/22 05:24 AM |
clang __builtin_addcll | Michael S | 2017/10/21 11:05 AM |
Mathematica on iPad | Maynard Handley | 2017/10/21 09:55 AM |
Mathematica on iPad | Anon | 2017/10/21 04:20 PM |
Mathematica on iPad | Maynard Handley | 2017/10/21 05:51 PM |
Mathematica on iPad | Anon | 2017/10/21 09:56 PM |
Mathematica on iPad | Maynard Handley | 2017/10/22 12:23 AM |
A quick search shows that Mathematica is using Intel MKL | Gabriele Svelto | 2017/10/21 11:38 PM |
A quick search shows that Mathematica is using Intel MKL | Anon | 2017/10/22 05:12 PM |
A quick search shows that Mathematica is using Intel MKL | Maynard Handley | 2017/10/22 06:08 PM |
A quick search shows that Mathematica is using Intel MKL | Doug S | 2017/10/22 10:40 PM |
A quick search shows that Mathematica is using Intel MKL | Michael S | 2017/10/23 05:32 AM |
Mathematica on iPad | none | 2017/10/22 06:06 AM |
Mathematica on iPad | dmcq | 2017/10/23 03:43 AM |