By: Maynard Handley (name99.delete@this.name99.org), October 20, 2017 1:34 am

Room: Moderated Discussions

I mentioned some weeks ago that Wolfram was going to ship a Mathematica Player for iPad, and that it would be an interesting performance comparison against x86 of a "serious" app. Well the Player has been released and I've spent a few hours playing with it.

It's a substantial hassle to do serious benchmarking comparisons between the devices, for two reasons. First, the Player is not an editable environment (you have to wrap everything you want timed inside some sort of Manipulate[], and every change requires a cycle of save on the Mac, delete on the iPad, reload on the iPad). Second, Wolfram have made the decision (for whatever reason? or it's a bug? or it's an iOS power-saving thing?) that any calculation that takes longer than around 5 seconds goes into some sort of neverland where the UI doesn't tell you what's happened, but the calculation is never going to complete.

This in turn means that it's difficult to create a reasonable pool of items to test all at once; you have to step your way through, trying one thing at a time and feeling out how large a problem can grow before it dies.
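To make that feeling-out process concrete, here is a rough sketch in Python (not something that runs in the Player): keep doubling the problem size until a run exceeds the time budget, and remember the last size that fit. The names `largest_size_within_budget`, `run_seconds` and `time_sort` are purely illustrative, and the 5 second default is just the approximate cutoff mentioned above, not a documented limit.

```python
import random
import time
from typing import Callable


def largest_size_within_budget(
    run_seconds: Callable[[int], float],
    budget: float = 5.0,   # assumed cutoff, roughly what the Player seems to allow
    start: int = 1,
    limit: int = 1 << 30,  # safety cap so the search always terminates
) -> int:
    """Double the size from `start` and return the largest size whose run
    stayed within `budget` seconds (0 if even `start` was too slow)."""
    best = 0
    size = start
    while size <= limit:
        if run_seconds(size) > budget:
            break
        best = size
        size *= 2
    return best


def time_sort(n: int) -> float:
    """Hypothetical workload: wall-clock seconds to sort n random floats."""
    data = [random.random() for _ in range(n)]
    t0 = time.perf_counter()
    data.sort()
    return time.perf_counter() - t0
```

Something like `largest_size_within_budget(time_sort, budget=0.05, limit=1 << 20)` gives a quick feel for it on a desktop. On the iPad the same search has to be done by hand, one reload per step, which is exactly the hassle.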

Even so, one can learn some interesting things in just a few hours, with results that can be interpreted as meaning pretty much anything you might like.

This was comparing an A10X at 2.35GHz with 4GiB of RAM (this may be significant if the OS had to engage in a lot of page compression/decompression) against a Mac Pro with a quad-core i7 at, I think, 3.6GHz or so and 16GiB.

I ran through one after another of the benchmarks in the Mathematica Benchmark[] suite, which is not ideal (very much numerics biased, not much testing of symbolics) and found results tended to be of three types.

Some were pretty much equivalent in performance to Intel (slightly slower, but rather better than the GHz ratio). Some of these were the generation of random numbers, and sorting.

Then some things were 5 to 10x worse. These seem to be linear algebra of some sort, and while we know that on Intel these automatically use all cores and full vectors, it's quite likely that the code to do that has not been completely ported yet.

You'd expect a factor of 4/3 from cores, 4/3 from vector FLOPs available, and about 3/2 from GHz (if the cores can all turbo to max...), so a worst case of about 3 from HW, but there seems to be an additional 2 to 4x of software problem.
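For anyone who wants to check that arithmetic, a small Python sketch follows. It simply multiplies the three rough hardware ratios quoted above and divides the observed 5 to 10x slowdown by the result. The function names are made up for illustration, and the inputs are the same back-of-envelope estimates, not measurements.

```python
from fractions import Fraction


def hardware_factor(cores: Fraction, vector: Fraction, clock: Fraction) -> Fraction:
    """Combined slowdown expected from hardware alone (ratios multiply)."""
    return cores * vector * clock


def residual_software_factor(observed: int, hw: Fraction) -> Fraction:
    """Whatever slowdown is left over once the hardware factor is divided out."""
    return observed / hw


# Rough ratios from the text: cores, vector FLOPs, clock speed.
hw = hardware_factor(Fraction(4, 3), Fraction(4, 3), Fraction(3, 2))
low = residual_software_factor(5, hw)
high = residual_software_factor(10, hw)

print(f"hardware factor:   {float(hw):.2f}x")  # 8/3, i.e. "about 3"
print(f"software residual: {float(low):.2f}x to {float(high):.2f}x")
```

The hardware product comes to 8/3, and the leftover is 15/8 to 15/4, which is where the "2 to 4x" comes from.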

When I have time I'll look into this further (perhaps Wolfram hasn't even implemented the dense matrix representation on ARM yet?)

Finally some stuff was just crazy slow, even worse than 10x, and clearly was not linear algebra; things like calculating the gamma function of large numbers, or Pi to a large number of decimal places. I think the issue here is that they're using an absolutely awful multi-precision library and it's obviously costing them.

I'll keep trying various things as I have time, but I'd say if you want to be optimistic, there's a lot to be optimistic about in the results that do well. (I do want to try a lot more symbolics, which I can't imagine are in any way strongly tied to a particular CPU.)

On the other hand it's also true that we're seeing the costs of ARM being such a new platform in the performance space, and the costs of not having decent libraries for things like linear algebra and multi-precision. (Apple certainly has linear algebra libraries, but Mathematica may avoid those because they want absolute control over precision? And I don't think Apple provide multi-precision libraries.)

Presumably this stuff has to get written over the next few years if ARM wants to be serious in HPC (which is what SVE is supposedly all about). But is ARM going to write things like MKL, make it available, and provide serious optimization assistance? And how does that work get split between ARM vs QC vs Samsung (not to mention Apple) all providing different micro-architectures? Obviously this is a point people like Linus raise repeatedly, and it's a fair point to keep raising, right up until the problem is solved.

Unfortunately I have no idea the extent to which Wolfram take this iPad Player seriously, or if it's just someone's toy project. Certainly the existing UI is extremely bare bones; it doesn't feel like massive resources have been poured into it.

So I've no idea when we might hope to see these low-level libraries implemented...

The Player certainly is nice for manipulating graphs and 3D images (and that is, of course, what it was designed for). And certainly for "normal" usages it seems to be as fast as it needs to be. But Mathematica really is a tool that can suck up every cycle you can throw at it, so I hope that people WILL start trying to manipulate complex objects and expressions, and complaining to Wolfram about the severe slowdowns they hit on certain types of calculations.


Topic | Posted By | Date |
---|---|---|
Mathematica on iPad | Maynard Handley | 2017/10/20 01:34 AM |
Mathematica on iPad | dmcq | 2017/10/20 07:26 AM |
Mathematica on iPad | Maynard Handley | 2017/10/20 01:41 PM |
Mathematica on iPad | Maynard Handley | 2017/10/20 08:16 PM |
Does this give better formatting? | Maynard Handley | 2017/10/20 08:20 PM |
Does this give better formatting? | anon | 2017/10/20 09:37 PM |
Does this give better formatting? | Maynard Handley | 2017/10/20 10:29 PM |
Does this give better formatting? | anon | 2017/10/21 12:52 AM |
Does this give better formatting? | Maynard Handley | 2017/10/21 09:48 AM |
Does this give better formatting? | anon | 2017/10/21 10:01 AM |
Mathematica on iPad | Adrian | 2017/10/21 01:49 AM |
Sorry for the typo | Adrian | 2017/10/21 01:51 AM |
Mathematica on iPad | dmcq | 2017/10/21 07:03 AM |
Mathematica on iPad | Maynard Handley | 2017/10/21 09:58 AM |
Mathematica on iPad | Wilco | 2017/10/21 07:16 AM |
Mathematica on iPad | Doug S | 2017/10/21 09:02 AM |
Mathematica on iPad | Megol | 2017/10/22 05:24 AM |
clang __builtin_addcll | Michael S | 2017/10/21 11:05 AM |
Mathematica on iPad | Maynard Handley | 2017/10/21 09:55 AM |
Mathematica on iPad | Anon | 2017/10/21 04:20 PM |
Mathematica on iPad | Maynard Handley | 2017/10/21 05:51 PM |
Mathematica on iPad | Anon | 2017/10/21 09:56 PM |
Mathematica on iPad | Maynard Handley | 2017/10/22 12:23 AM |
A quick search shows that Mathematica is using Intel MKL | Gabriele Svelto | 2017/10/21 11:38 PM |
A quick search shows that Mathematica is using Intel MKL | Anon | 2017/10/22 05:12 PM |
A quick search shows that Mathematica is using Intel MKL | Maynard Handley | 2017/10/22 06:08 PM |
A quick search shows that Mathematica is using Intel MKL | Doug S | 2017/10/22 10:40 PM |
A quick search shows that Mathematica is using Intel MKL | Michael S | 2017/10/23 05:32 AM |
Mathematica on iPad | none | 2017/10/22 06:06 AM |
Mathematica on iPad | dmcq | 2017/10/23 03:43 AM |