Arithmetic Operations for Streaming SIMD Extensions

The prototypes for Streaming SIMD Extensions (SSE) intrinsics for arithmetic operations are in the xmmintrin.h header file.

The result of each intrinsic operation is placed in a register. This register is shown for each intrinsic as R0-R3, where R0, R1, R2, and R3 each represent one of the four 32-bit elements of the result register and R0 is the lowest element.
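To make the R0-R3 notation concrete, here is a minimal sketch that builds a register and copies it to memory. It assumes an x86 compiler with SSE enabled; make_1234 and lanes_to_array are hypothetical helper names used only for illustration, not part of xmmintrin.h. Note that _mm_set_ps lists its arguments from the highest element (R3) down to the lowest (R0), while _mm_storeu_ps writes R0 to the lowest address.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>

/* Hypothetical helper: returns a register with R0 = 1, R1 = 2, R2 = 3, R3 = 4.
   _mm_set_ps takes its arguments in the order R3, R2, R1, R0. */
static __m128 make_1234(void)
{
    return _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f);
}

/* Hypothetical helper: copies R0..R3 of v into out[0..3]. */
static void lanes_to_array(__m128 v, float out[4])
{
    assert(out != NULL);
    _mm_storeu_ps(out, v);  /* unaligned store: out[0] = R0, ..., out[3] = R3 */
}
```

After lanes_to_array(make_1234(), out), out[0] holds the R0 element and out[3] holds the R3 element.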

Intrinsic       Operation                 Corresponding SSE Instruction

_mm_add_ss      Addition                  ADDSS
_mm_add_ps      Addition                  ADDPS
_mm_sub_ss      Subtraction               SUBSS
_mm_sub_ps      Subtraction               SUBPS
_mm_mul_ss      Multiplication            MULSS
_mm_mul_ps      Multiplication            MULPS
_mm_div_ss      Division                  DIVSS
_mm_div_ps      Division                  DIVPS
_mm_sqrt_ss     Square Root               SQRTSS
_mm_sqrt_ps     Square Root               SQRTPS
_mm_rcp_ss      Reciprocal                RCPSS
_mm_rcp_ps      Reciprocal                RCPPS
_mm_rsqrt_ss    Reciprocal Square Root    RSQRTSS
_mm_rsqrt_ps    Reciprocal Square Root    RSQRTPS
_mm_min_ss      Minimum                   MINSS
_mm_min_ps      Minimum                   MINPS
_mm_max_ss      Maximum                   MAXSS
_mm_max_ps      Maximum                   MAXPS

 

__m128 _mm_add_ss(__m128 a, __m128 b)

Adds the lower single-precision floating-point (SP FP) values of a and b; the upper 3 SP FP values are passed through from a.

R0        R1   R2   R3
a0 + b0   a1   a2   a3

 

__m128 _mm_add_ps(__m128 a, __m128 b)

Adds the four SP FP values of a and b.

R0        R1        R2        R3
a0 + b0   a1 + b1   a2 + b2   a3 + b3
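The difference between the scalar (_ss) and packed (_ps) forms can be sketched as follows. This is an illustrative example only, assuming an x86 compiler with SSE enabled; add_both_ways is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>

/* Hypothetical helper: stores _mm_add_ss(a, b) in ss and _mm_add_ps(a, b) in ps. */
static void add_both_ways(const float a[4], const float b[4],
                          float ss[4], float ps[4])
{
    assert(a != NULL && b != NULL && ss != NULL && ps != NULL);
    __m128 va = _mm_loadu_ps(a);  /* R0 = a[0], ..., R3 = a[3] */
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(ss, _mm_add_ss(va, vb));  /* {a0 + b0, a1, a2, a3} */
    _mm_storeu_ps(ps, _mm_add_ps(va, vb));  /* {a0 + b0, a1 + b1, a2 + b2, a3 + b3} */
}
```

With a = {1, 2, 3, 4} and b = {10, 20, 30, 40}, ss receives {11, 2, 3, 4} and ps receives {11, 22, 33, 44}.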

 

__m128 _mm_sub_ss(__m128 a, __m128 b)

Subtracts the lower SP FP value of b from the lower SP FP value of a; the upper 3 SP FP values are passed through from a.

R0        R1   R2   R3
a0 - b0   a1   a2   a3

 

__m128 _mm_sub_ps(__m128 a, __m128 b)

Subtracts the four SP FP values of b from the four SP FP values of a.

R0        R1        R2        R3
a0 - b0   a1 - b1   a2 - b2   a3 - b3
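As a sketch of how the packed form is typically applied to arrays, the loop below subtracts two float arrays four elements at a time. It assumes an x86 compiler with SSE enabled and an element count that is a multiple of 4; sub_arrays is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>

/* Hypothetical helper: out[i] = a[i] - b[i] for i in [0, n).
   Assumes n is a multiple of 4; the pointers need not be 16-byte aligned. */
static void sub_arrays(const float *a, const float *b, float *out, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        /* Operand order matters: the result is a - b, not b - a. */
        _mm_storeu_ps(out + i, _mm_sub_ps(va, vb));
    }
}
```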

 

__m128 _mm_mul_ss(__m128 a, __m128 b)

Multiplies the lower SP FP values of a and b; the upper 3 SP FP values are passed through from a.

R0        R1   R2   R3
a0 * b0   a1   a2   a3

 

__m128 _mm_mul_ps(__m128 a, __m128 b)

Multiplies the four SP FP values of a and b.

R0        R1        R2        R3
a0 * b0   a1 * b1   a2 * b2   a3 * b3
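One way to combine the two multiplication forms is to use _mm_mul_ps for full groups of four elements and _mm_mul_ss for any leftover elements. The sketch below scales an array in place under that scheme; it assumes an x86 compiler with SSE enabled, and scale_array is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>

/* Hypothetical helper: multiplies every element of x[0..n) by s, in place. */
static void scale_array(float *x, size_t n, float s)
{
    assert(x != NULL || n == 0);
    __m128 vs = _mm_set1_ps(s);  /* s in all four elements */
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        /* Packed form: four products per iteration. */
        _mm_storeu_ps(x + i, _mm_mul_ps(_mm_loadu_ps(x + i), vs));
    }
    for (; i < n; ++i) {
        /* Scalar form for the tail: only R0 is multiplied and stored. */
        _mm_store_ss(x + i, _mm_mul_ss(_mm_load_ss(x + i), vs));
    }
}
```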

 

__m128 _mm_div_ss(__m128 a, __m128 b)

Divides the lower SP FP value of a by the lower SP FP value of b; the upper 3 SP FP values are passed through from a.

R0        R1   R2   R3
a0 / b0   a1   a2   a3

 

__m128 _mm_div_ps(__m128 a, __m128 b)

Divides the four SP FP values of a by the four SP FP values of b.

R0        R1        R2        R3
a0 / b0   a1 / b1   a2 / b2   a3 / b3
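The following sketch contrasts the two division forms. It illustrates that _mm_div_ss reads only the lowest element of b, so the upper elements of b do not affect the result. It assumes an x86 compiler with SSE enabled; div_low and div_all are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>

/* Hypothetical helper: out = {a0 / b0, a1, a2, a3}; b[1..3] are not used. */
static void div_low(const float a[4], const float b[4], float out[4])
{
    assert(a != NULL && b != NULL && out != NULL);
    _mm_storeu_ps(out, _mm_div_ss(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}

/* Hypothetical helper: out = {a0 / b0, a1 / b1, a2 / b2, a3 / b3}. */
static void div_all(const float a[4], const float b[4], float out[4])
{
    assert(a != NULL && b != NULL && out != NULL);
    _mm_storeu_ps(out, _mm_div_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}
```

With a = {8, 2, 3, 4}, div_low with b = {2, 0, 0, 0} yields {4, 2, 3, 4}, and div_all with b = {2, 4, 0.5, 8} yields {4, 0.5, 6, 0.5}.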

 

__m128 _mm_sqrt_ss(__m128 a)

Computes the square root of the lower SP FP value of a; the upper 3 SP FP values are passed through from a.

R0         R1   R2   R3
sqrt(a0)   a1   a2   a3

 

__m128 _mm_sqrt_ps(__m128 a)

Computes the square roots of the four SP FP values of a.

R0         R1         R2         R3
sqrt(a0)   sqrt(a1)   sqrt(a2)   sqrt(a3)
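As an illustration of the packed square root used together with the packed add and multiply intrinsics, the sketch below computes the lengths of 2D vectors stored as separate x and y arrays. It assumes an x86 compiler with SSE enabled and an element count that is a multiple of 4; lengths_2d is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>

/* Hypothetical helper: out[i] = sqrt(x[i] * x[i] + y[i] * y[i]) for i in [0, n).
   Assumes n is a multiple of 4. */
static void lengths_2d(const float *x, const float *y, float *out, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        __m128 vx = _mm_loadu_ps(x + i);
        __m128 vy = _mm_loadu_ps(y + i);
        __m128 sum = _mm_add_ps(_mm_mul_ps(vx, vx), _mm_mul_ps(vy, vy));
        _mm_storeu_ps(out + i, _mm_sqrt_ps(sum));  /* four square roots at once */
    }
}
```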

 

__m128 _mm_rcp_ss(__m128 a)

Computes an approximation of the reciprocal of the lower SP FP value of a; the upper 3 SP FP values are passed through from a.

R0          R1   R2   R3
recip(a0)   a1   a2   a3

 

__m128 _mm_rcp_ps(__m128 a)

Computes approximations of the reciprocals of the four SP FP values of a.

R0          R1          R2          R3
recip(a0)   recip(a1)   recip(a2)   recip(a3)
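Because the reciprocal intrinsics return approximations (Intel's instruction reference gives a maximum relative error of 1.5 * 2^-12 for RCPPS), callers that need more accuracy often refine the result. The sketch below applies one Newton-Raphson step, x1 = x0 * (2 - a * x0); this refinement is a common technique added here for illustration and is not something the intrinsic does itself. It assumes an x86 compiler with SSE enabled and nonzero, finite inputs; rcp_approx_and_refined is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>

/* Hypothetical helper: writes the raw _mm_rcp_ps result to approx and a
   Newton-Raphson-refined result to refined. Assumes nonzero, finite inputs. */
static void rcp_approx_and_refined(const float a[4],
                                   float approx[4], float refined[4])
{
    assert(a != NULL && approx != NULL && refined != NULL);
    __m128 va = _mm_loadu_ps(a);
    __m128 x0 = _mm_rcp_ps(va);  /* approximate 1 / a */
    /* One Newton-Raphson step: x1 = x0 * (2 - a * x0). */
    __m128 two = _mm_set1_ps(2.0f);
    __m128 x1 = _mm_mul_ps(x0, _mm_sub_ps(two, _mm_mul_ps(va, x0)));
    _mm_storeu_ps(approx, x0);
    _mm_storeu_ps(refined, x1);
}
```

When full single-precision accuracy is required, _mm_div_ps remains the straightforward choice.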

 

__m128 _mm_rsqrt_ss(__m128 a)

Computes an approximation of the reciprocal of the square root of the lower SP FP value of a; the upper 3 SP FP values are passed through from a.

R0                R1   R2   R3
recip(sqrt(a0))   a1   a2   a3

 

__m128 _mm_rsqrt_ps(__m128 a)

Computes approximations of the reciprocals of the square roots of the four SP FP values of a.

R0                R1                R2                R3
recip(sqrt(a0))   recip(sqrt(a1))   recip(sqrt(a2))   recip(sqrt(a3))
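The scalar reciprocal square root can be refined in the same spirit as other approximations. The sketch below wraps _mm_rsqrt_ss in an ordinary float function and applies one Newton-Raphson step, y1 = y0 * (1.5 - 0.5 * x * y0 * y0); the refinement step is an addition for illustration, not part of the intrinsic. It assumes an x86 compiler with SSE enabled and a positive, finite input; fast_rsqrt is a hypothetical helper name.

```c
#include <assert.h>
#include <xmmintrin.h>

/* Hypothetical helper: approximate 1 / sqrt(x) for x > 0, using
   _mm_rsqrt_ss followed by one Newton-Raphson step. */
static float fast_rsqrt(float x)
{
    assert(x > 0.0f);
    __m128 vx = _mm_set_ss(x);     /* R0 = x, R1..R3 = 0 */
    __m128 y0 = _mm_rsqrt_ss(vx);  /* R0 = approximate 1 / sqrt(x) */
    /* t = 0.5 * x * y0 * y0, computed in R0 only. */
    __m128 half_x = _mm_mul_ss(_mm_set_ss(0.5f), vx);
    __m128 t = _mm_mul_ss(_mm_mul_ss(half_x, y0), y0);
    /* y1 = y0 * (1.5 - t) */
    __m128 y1 = _mm_mul_ss(y0, _mm_sub_ss(_mm_set_ss(1.5f), t));
    return _mm_cvtss_f32(y1);      /* extract R0 as a float */
}
```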

 

__m128 _mm_min_ss(__m128 a, __m128 b)

Computes the minimum of the lower SP FP values of a and b; the upper 3 SP FP values are passed through from a.

R0            R1   R2   R3
min(a0, b0)   a1   a2   a3

 

__m128 _mm_min_ps(__m128 a, __m128 b)

Computes the minimum of each pair of corresponding SP FP values of a and b.

R0            R1            R2            R3
min(a0, b0)   min(a1, b1)   min(a2, b2)   min(a3, b3)
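A typical use of the packed minimum is a reduction over an array: keep four running minimums in one register, then combine them at the end. The sketch below assumes an x86 compiler with SSE enabled, an element count that is a nonzero multiple of 4, and inputs that contain no NaN values; array_min is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>

/* Hypothetical helper: returns the smallest element of x[0..n).
   Assumes n >= 4, n is a multiple of 4, and no element is NaN. */
static float array_min(const float *x, size_t n)
{
    assert(n >= 4 && n % 4 == 0);
    __m128 vmin = _mm_loadu_ps(x);  /* four running minimums */
    for (size_t i = 4; i < n; i += 4) {
        vmin = _mm_min_ps(vmin, _mm_loadu_ps(x + i));
    }
    /* Combine R0..R3 with ordinary scalar comparisons. */
    float lanes[4];
    _mm_storeu_ps(lanes, vmin);
    float m = lanes[0];
    for (int k = 1; k < 4; ++k) {
        if (lanes[k] < m) {
            m = lanes[k];
        }
    }
    return m;
}
```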

 

__m128 _mm_max_ss(__m128 a, __m128 b)

Computes the maximum of the lower SP FP values of a and b; the upper 3 SP FP values are passed through from a.

R0            R1   R2   R3
max(a0, b0)   a1   a2   a3

 

__m128 _mm_max_ps(__m128 a, __m128 b)

Computes the maximum of each pair of corresponding SP FP values of a and b.

R0            R1            R2            R3
max(a0, b0)   max(a1, b1)   max(a2, b2)   max(a3, b3)
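The packed minimum and maximum are often combined to clamp values to a range. The sketch below is one way to do that, assuming an x86 compiler with SSE enabled, an element count that is a multiple of 4, and inputs that contain no NaN values; clamp_array is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>

/* Hypothetical helper: clamps every element of x[0..n) to [lo, hi], in place.
   Assumes n is a multiple of 4 and lo <= hi. */
static void clamp_array(float *x, size_t n, float lo, float hi)
{
    assert(lo <= hi);
    assert(n % 4 == 0);
    __m128 vlo = _mm_set1_ps(lo);
    __m128 vhi = _mm_set1_ps(hi);
    for (size_t i = 0; i < n; i += 4) {
        __m128 v = _mm_loadu_ps(x + i);
        v = _mm_max_ps(v, vlo);  /* raise values below lo up to lo */
        v = _mm_min_ps(v, vhi);  /* lower values above hi down to hi */
        _mm_storeu_ps(x + i, v);
    }
}
```

Clamping {-1, 0.5, 2, 0.25} to the range [0, 1] gives {0, 0.5, 1, 0.25}.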