By: -.- (blarg.delete@this.mailinator.com), May 23, 2022 4:28 am
Room: Moderated Discussions
Charlie Burnes (charlie.burnes.delete@this.no-spam.com) on May 22, 2022 6:30 am wrote:
> It’s not possible because the SIMD width is mapped to a rectangle of values and a very expensive computation
> has to be done to compute some constants that correspond to each value in that rectangle. This expensive
> computation only has to be done once for a particular SIMD width but the SIMD width needs to be known
> to do the expensive computation because the SIMD width determines the size of the rectangle of values.
I don't know your problem, but would it be possible to use fixed 128-bit "rectangles" and process multiple rectangles at a time?
If so, this maps neatly to SVE as well as SSE/NEON, and also works nicely with AVX's 128-bit lane concept.
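To make the idea concrete, here is a rough sketch in plain C (no intrinsics, so it says nothing about what your real kernel looks like). It assumes 32-bit float elements, so one 128-bit rectangle is 4 lanes, and it uses a hypothetical init_tile_constants() as a stand-in for your expensive computation. All names here are made up for illustration. The constants are computed once for the 4-lane tile and then simply repeated across whatever vector width the hardware turns out to have.

```c
#include <assert.h>
#include <stddef.h>

enum { TILE_LANES = 4 };  /* one 128-bit "rectangle" of 32-bit floats (assumed element type) */
enum { MAX_LANES = 64 };  /* 2048 bits of floats, the largest SVE vector length */

/* Hypothetical stand-in for the expensive computation: it depends only on
 * the fixed tile size, never on the hardware vector width. */
static void init_tile_constants(float k[TILE_LANES])
{
    for (size_t i = 0; i < TILE_LANES; i++)
        k[i] = (float)(i + 1);
}

/* Repeat the tile constants across a vector of vec_lanes lanes.
 * Returns 0 on success, -1 if vec_lanes is not a usable width. */
static int widen_constants(const float k[TILE_LANES], float *wide, size_t vec_lanes)
{
    if (vec_lanes == 0 || vec_lanes % TILE_LANES != 0 || vec_lanes > MAX_LANES)
        return -1;
    for (size_t i = 0; i < vec_lanes; i++)
        wide[i] = k[i % TILE_LANES];
    return 0;
}

/* Multiply every element by the constant for its position within its tile,
 * handling vec_lanes elements (several tiles) per outer step. */
static void scale_by_tile(float *data, size_t n, const float *wide, size_t vec_lanes)
{
    assert(vec_lanes > 0 && vec_lanes % TILE_LANES == 0);

    size_t i = 0;
    for (; i + vec_lanes <= n; i += vec_lanes)
        for (size_t j = 0; j < vec_lanes; j++)  /* one "vector" operation */
            data[i + j] *= wide[j];

    /* Tail: i is still a multiple of TILE_LANES, so wide[j] lines up. */
    for (size_t j = 0; i + j < n; j++)
        data[i + j] *= wide[j];
}
```

The result is the same whether vec_lanes is 4 (SSE/NEON), 8 (AVX, two 128-bit lanes) or whatever SVE reports at run time, because only widen_constants() ever sees the width and it is a cheap copy. Whether your actual rectangle computation can be tiled this way is of course something only you can tell.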