By: dmcq (dmcq.delete@this.fano.co.uk), October 15, 2021 1:41 am
Room: Moderated Discussions
Doug S (foo.delete@this.bar.bar) on October 14, 2021 10:23 pm wrote:
> rwessel (rwessel.delete@this.yahoo.com) on October 14, 2021 7:07 pm wrote:
> > While dealing with that on-die can be dealt with by just not putting two incompatible cores on a
> > die, that leaves the intra-cluster and VM migration issues with no obvious solution. You could demand
> > that the cluster (or potential VM migration targets) all have the same specs in this regard, but
> > that seems a bit painful. The instruction definitions appear to define exceptions for when things
> > like that happen, and so fixing it up in software ought to be possible. Still a bit ugly.
>
>
> I'm tempted to say that if you choose to have incompatible
> ARM hardware in a VM cluster you deserve what you get...
>
> At any rate that's something for the hypervisor to address - it should have a way to enforce or at least define
> the "lowest common denominator" for this and any other hardware differences that might arise. This is potentially
> a much larger problem as you can have differences in cache line size, page size, ARMvX.Y version etc.
>
> This isn't limited to ARM. You might run into potential issues if you mixed Intel and AMD hardware
> in a single VM cluster. I'd say you get what you deserve if you try it and experience issues.
I think Arm designed their cache clear operations wrong, or they should have limited their use to where they could control them better. If the operations returned how many bytes were cleared, instead of depending on something else that returned the cache size, for instance, that would have removed the problem. I very much hope they haven't set themselves up for the same sort of problem with the CPY and SET operations. It would be so simple to avoid such problems at this stage without affecting performance. It really depends on what happens when the operations are interrupted. Having three operations is a mess, I think, but I'll try and figure it out.
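
To make the "return how many bytes were cleared" point concrete, here is a rough sketch in plain C. It is only a model, not real Arm instructions: core_block_size, zero_block, zero_block_counted and the two demo functions are all names I made up, and the 64-to-32 byte change halfway through stands in for a thread being migrated to a core with a smaller block size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model, not real Arm instructions: the block size of the
 * core the thread is currently running on. Always a power of two here. */
static size_t core_block_size = 64;

/* Style 1: zero the whole aligned block containing offset `off`.
 * The caller is told nothing about how much was cleared. */
static void zero_block(uint8_t *buf, size_t off) {
    size_t start = off & ~(core_block_size - 1);
    memset(buf + start, 0, core_block_size);
}

/* Style 2: zero from `off` to the end of the current block (at most
 * `len` bytes) and return how many bytes were actually cleared. */
static size_t zero_block_counted(uint8_t *buf, size_t off, size_t len) {
    assert(len > 0);
    size_t n = core_block_size - (off & (core_block_size - 1));
    if (n > len) n = len;
    memset(buf + off, 0, n);
    return n;
}

/* Count bytes that are still nonzero. */
static size_t count_dirty(const uint8_t *buf, size_t len) {
    size_t dirty = 0;
    for (size_t i = 0; i < len; i++) dirty += (buf[i] != 0);
    return dirty;
}

/* Loop that trusts a block size it queried once up front. */
static size_t demo_cached_size(void) {
    uint8_t buf[256];
    memset(buf, 0xFF, sizeof buf);
    core_block_size = 64;
    size_t cached = core_block_size;           /* queried once, then reused */
    for (size_t off = 0; off < sizeof buf; off += cached) {
        if (off == 128) core_block_size = 32;  /* simulated migration */
        zero_block(buf, off);
    }
    return count_dirty(buf, sizeof buf);
}

/* Loop that advances by whatever the operation says it cleared. */
static size_t demo_counted(void) {
    uint8_t buf[256];
    memset(buf, 0xFF, sizeof buf);
    core_block_size = 64;
    size_t off = 0;
    while (off < sizeof buf) {
        if (off == 128) core_block_size = 32;  /* simulated migration */
        off += zero_block_counted(buf, off, sizeof buf - off);
    }
    return count_dirty(buf, sizeof buf);
}
```

With these made-up numbers, demo_cached_size() leaves 64 bytes unzeroed after the simulated migration, while demo_counted() leaves none, because the loop never has to know the block size at all.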