By: Michael S (already5chosen.delete@this.yahoo.com), March 19, 2008 4:13 pm
Room: Moderated Discussions
Ian Ollmann (iano@apple.com) on 3/19/08 wrote:
---------------------------
>Doug Siebert (foo@bar.bar) on 3/18/08 wrote:
>---------------------------
>>Another thing I was interested to see in Nehalem was the SSE 4.2 support for string
>>instructions. Somehow that doesn't surprise me, if for no other reason than the
>>Google vs. Microsoft War of the Datacenters.
>
>What surprises me is that they chose to release them under the SSE name. None of
>these are vector instructions. (Maybe they are long vectors?) Certainly Intel has
>introduced non-vector ISA additions in SSE before (FISTTP in SSE3) but that was
>one scalar instruction out of many vector instructions. They seem to have no vehicle
>other than SSE to introduce new ISA.
>
Five of the seven SSE 4.2 instructions (four text/string processing ops plus one packed comparison op) operate on 128-bit XMM registers in typical SIMD manner, exactly like the bulk of the older SSEx instructions. So they are rightly called SSE.
The remaining two instructions (CRC32 and POPCNT) operate on GPRs and really have nothing to do with SIMD. The presence of POPCNT is indicated by a separate bit in CPUID, so it is likely that the final documentation will not treat POPCNT as part of SSE 4.2.
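
To make the CPUID point concrete, here is a minimal sketch in C of how software could test the two bits independently. It assumes the bit positions documented for CPUID leaf 1 (ECX bit 20 for SSE 4.2, ECX bit 23 for POPCNT) and uses `__get_cpuid` from the GCC/Clang `<cpuid.h>` header. The helper names `ecx_has_sse42`, `ecx_has_popcnt` and `read_leaf1_ecx` are made up for illustration and do not come from any library.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>              /* GCC/Clang wrapper around the CPUID instruction */
#define HAVE_CPUID_H 1
#endif

/* Feature bits reported in ECX by CPUID leaf 1 (EAX = 1). */
#define CPUID_ECX_SSE42  (UINT32_C(1) << 20)
#define CPUID_ECX_POPCNT (UINT32_C(1) << 23)

/* Hypothetical helpers: decode a raw leaf-1 ECX value. */
static int ecx_has_sse42(uint32_t ecx)
{
    return (ecx & CPUID_ECX_SSE42) != 0;
}

static int ecx_has_popcnt(uint32_t ecx)
{
    return (ecx & CPUID_ECX_POPCNT) != 0;
}

/* Hypothetical helper: read leaf-1 ECX on x86 with GCC/Clang.
 * Returns 0 (no features reported) on other compilers or architectures,
 * or if leaf 1 is not supported. */
static uint32_t read_leaf1_ecx(void)
{
#ifdef HAVE_CPUID_H
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return ecx;
#endif
    return 0;
}
```

Because the two flags are separate, code that only needs POPCNT would check `ecx_has_popcnt(read_leaf1_ecx())` and not infer it from the SSE 4.2 bit, while CRC32 would be gated on the SSE 4.2 bit.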