By: Maynard Handley (name99.delete@this.name99.org), October 22, 2017 12:23 am

Room: Moderated Discussions

Anon (no.delete@this.email.com) on October 21, 2017 9:56 pm wrote:

> Maynard Handley (name99.delete@this.name99.org) on October 21, 2017 5:51 pm wrote:

> > Anon (no.delete@this.thanks.com) on October 21, 2017 4:20 pm wrote:

> > > Maynard Handley (name99.delete@this.name99.org) on October 20, 2017 1:34 am wrote:

> > > > I mentioned some weeks ago that Wolfram was going to ship a Mathematica Player for iPad,

> > > > and that it would be an interesting performance comparison against x86 of a "serious"

> > > > app. Well the Player has been released and I've spent a few hours playing with it.

> > > >

> > >

> > > Sorry, not clipping the rest for any other reason than readability; all quite interesting.

> > >

> > > However, I just wonder.

> > > What makes you think that you will ever measure anything other than 'how much does Wolfram feel

> > > like spending on different platforms'? I think your early measurements bear this out strongly.

> > >

> > > These systems are so high level that there is no 'porting over' between systems, as almost certainly they

> > > use 3rd party libraries for matrix work, etc., and the quality, and even the choice, of those will vary hugely.

> >

> > I think you have a drastically diminished idea of just how large and impressive Mathematica

> > is, what it does, and how it does it, and the rest of what you say stems from that.

> >

> > It is not correct to say that "These systems are so high level that ..."

> > What is high level is the code I write to run on Mathematica,

> > but Mathematica itself exploits as much low-level

> > knowledge of the machine as possible. The analogy is that

> > my code is like Java, Mathematica is like the JVM.

> > A JVM is not high level code; it is code that requires intimate knowledge of the CPU on which it runs.

>

> That's funny; having used Mathematica since, from memory, 1989 - I am thinking somewhere in the

> 1.x series - I don't think you are even close to the mark, but what would I know, apparently.

I don't understand what you are arguing for, or why you are arguing the way you are.

Half the things you are saying seem to be things I have already agreed with and stated in my earlier posts.

You seem to be more interested in picking a fight than anything else, and I'm not interested in that.

I investigated this to learn things *I* was interested in, and shared the results because I thought others might be interested (in the same way that I am always appreciative when I see others share their data). I have guesses as to why the results turned out the way they did, but nothing more than guesses. They're not unmotivated guesses (I've probably read many more pages than you of Wolfram change notes, Wolfram blog postings, articles on how Mathematica works internally, etc) but they are, and I have said this repeatedly, just guesses.

I don't understand the variety of ways in which you seem to be angry with me for doing this, and for making these guesses, but, like I said, I'm not interested in fighting.

I've given you my analysis; if you want to disagree, well, there we are.

I managed to find out SOME things about what Mathematica is using:

- there is this page

http://www.wolfram.com/LGPL/GMP/

which says they've been using GMP since version 6.

(They describe, in the appendix of The Mathematica book v4, and in some Tech Journal articles, how they rolled their own code before using GMP.)
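As an aside, the kind of workload GMP exists to accelerate is easy to sketch. Here's a minimal, purely illustrative Python timing harness (Python's built-in bignums stand in for GMP here; Mathematica and GMP of course use their own heavily tuned C and assembly) that times arbitrary-precision integer multiplies, the operation gmpbench stresses:

```python
import time

def time_bigmul(bits):
    """Time one n-bit x n-bit integer multiply using Python's built-in bignums."""
    a = (1 << bits) - 1          # n-bit number, all 1-bits
    b = (1 << bits) - 3
    t0 = time.perf_counter()
    c = a * b
    elapsed = time.perf_counter() - t0
    return c, elapsed

for bits in (1_000, 10_000, 100_000, 1_000_000):
    product, secs = time_bigmul(bits)
    # the product of two n-bit numbers is ~2n bits
    print(f"{bits:>9} bits: {secs * 1e6:10.1f} us, result has {product.bit_length()} bits")
```

On a platform where the bignum layer falls back from hand-tuned assembly to generic C, exactly this operation is where the slowdown shows up.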

- there is this page

https://mathematica.stackexchange.com/questions/41949/how-to-determine-blas-lapack-implementation-used-internally-for-numerical-matrix

which says that on Intel they use MKL.

- this page suggests that they (maybe) once used ATLAS

https://reference.wolfram.com/legacy/v5_2/Built-inFunctions/AdvancedDocumentation/LinearAlgebra/LinearAlgebraInMathematica/Appendix/AdvancedDocumentationLinearAlgebra7.0.html

and the page I linked to above implies that's what they still use on the Raspberry Pi (so on all ARM?).

Again this was a switch to BLAS/LAPACK, around version 5 (with the introduction of packed arrays), with their own routines before then.

- this page

http://library.wolfram.com/infocenter/Conferences/4853/DevCon-2003.nb?file_id=4582

suggests that they use the BLAS/LAPACK APIs, and so whatever implementation is available. In particular, it says that on Macs (this was in the PPC days) they use Apple's Accelerate library, that it is vectorized, and that it is multithreaded.

That would SEEM to imply that they'd use the equivalent Accelerate library (which certainly exists and is very performant, and absolutely uses both vectors and multiple cores) on iPad.
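For concreteness, the operation at stake here is just dense matrix multiply (BLAS dgemm). A naive sketch in Python (illustrative only; any vendor BLAS such as MKL or Accelerate beats this by orders of magnitude through vectorization, cache blocking, and threading) shows what the library is optimizing:

```python
# Naive triple-loop GEMM (C = A*B) -- the operation Mathematica hands to
# BLAS dgemm for packed arrays. A tuned BLAS (MKL, Accelerate) beats this
# by orders of magnitude via SIMD, cache blocking, and multiple threads.
def naive_gemm(A, B):
    n, k = len(A), len(B)
    m = len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for p in range(k):
            a_ip = A[i][p]            # hoist the invariant element
            for j in range(m):
                C[i][j] += a_ip * B[p][j]
    return C

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
print(naive_gemm(A, B))   # [[19.0, 22.0], [43.0, 50.0]]
```

Which is exactly why the choice of BLAS library, rather than anything in Mathematica's own code, dominates these particular benchmarks.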

But for whatever reason, they do not seem to be doing so.

Or is it the case that Accelerate's BLAS/LAPACK is being used, but is not delivering the performance I'd expect?

Now are they using GMP? We can't be sure, but again I think not.

Look at

https://gmplib.org/gmpbench.html

in particular compare the best available score/GHz

(Skylake, 1422) with the best ARM64 core in the list (X-Gene 1, 598).

That's roughly a 2.4x difference; throw in the x86 vs A10X frequency ratio and, worst case, you're at around a 4.5x difference. But let's be honest, the A10 is going to be better than that X-Gene --- we don't know by how much, but would you be surprised if it's 2x better?
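Spelling out that arithmetic (the clock frequencies are my assumptions, not gmpbench numbers: roughly 4.2 GHz for a high-end Skylake, roughly 2.3 GHz for the A10X): the per-GHz ratio works out to about 2.4x, and the worst-case total to about 4.5x.

```python
skylake_score_per_ghz = 1422   # gmpbench, best x86 entry (Skylake)
xgene1_score_per_ghz = 598     # gmpbench, best ARM64 entry (X-Gene 1)

per_ghz_ratio = skylake_score_per_ghz / xgene1_score_per_ghz
print(f"per-GHz ratio: {per_ghz_ratio:.2f}x")        # ~2.38x

# Assumed clocks, not from gmpbench: Skylake ~4.2 GHz, A10X ~2.3 GHz
freq_ratio = 4.2 / 2.3
worst_case = per_ghz_ratio * freq_ratio
print(f"worst-case total: {worst_case:.1f}x")        # ~4.3x, call it ~4.5x
```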

The point is --- even if we assume the worst-case scenario, by using GMP code (which --- I looked at the code base --- has a TON of x86-specific assembly and just a few fragments of AArch64 assembly) we'd get about a 4.5x ratio, not the 13x+ ratio I'm seeing.

I appreciate your point that Wolfram can be using external libraries, and that these libraries may not be nearly as optimized for AArch64 as for x86 --- I made the exact same points in some of my initial postings. But I DON'T see that as especially relevant to the particular discrepancies I am seeing here.

The discrepancy between what should be happening if the appropriate libraries (Apple Accelerate, GMP) were used and what we are actually seeing is just too large.

Which is why I offered up the suggestions I did for what is going wrong. If you have a better suggestion, I'm all ears.

