SIMD/Uses/SAD



== Code example ==
The excerpts are part of a public-domain MPEG-1 video encoder and can be found in [https://github.com/maikmerten/mpeg/blob/master/me.c https://github.com/maikmerten/mpeg/blob/master/me.c], which contains all motion estimation routines.
The following C code computes the SAD for 16 consecutive pixels (one row of a 16x16 block). The pointers "bptr" and "cptr" point to pixel values (8-bit unsigned) in the current and a preceding frame. The variable "error", the accumulator for the SAD score, is initialized to zero; "residue" is a signed integer that holds the difference of one pixel pair.
residue=(*(bptr++)-*(cptr++));
if (residue<0) {error-=residue;} else {error+=residue;}
residue=(*(bptr++)-*(cptr++));
if (residue<0) {error-=residue;} else {error+=residue;}
residue=(*(bptr++)-*(cptr++));
if (residue<0) {error-=residue;} else {error+=residue;}
residue=(*(bptr++)-*(cptr++));
if (residue<0) {error-=residue;} else {error+=residue;}
residue=(*(bptr++)-*(cptr++));
if (residue<0) {error-=residue;} else {error+=residue;}
residue=(*(bptr++)-*(cptr++));
if (residue<0) {error-=residue;} else {error+=residue;}
residue=(*(bptr++)-*(cptr++));
if (residue<0) {error-=residue;} else {error+=residue;}
residue=(*(bptr++)-*(cptr++));
if (residue<0) {error-=residue;} else {error+=residue;}
residue=(*(bptr++)-*(cptr++));
if (residue<0) {error-=residue;} else {error+=residue;}
residue=(*(bptr++)-*(cptr++));
if (residue<0) {error-=residue;} else {error+=residue;}
residue=(*(bptr++)-*(cptr++));
if (residue<0) {error-=residue;} else {error+=residue;}
residue=(*(bptr++)-*(cptr++));
if (residue<0) {error-=residue;} else {error+=residue;}
residue=(*(bptr++)-*(cptr++));
if (residue<0) {error-=residue;} else {error+=residue;}
residue=(*(bptr++)-*(cptr++));
if (residue<0) {error-=residue;} else {error+=residue;}
residue=(*(bptr++)-*(cptr++));
if (residue<0) {error-=residue;} else {error+=residue;}
residue=(*(bptr++)-*(cptr++));
if (residue<0) {error-=residue;} else {error+=residue;}
This code processes only one pair of pixel values at a time. It is also branch-heavy, because the absolute value (the equivalent of abs()) is implemented with a conditional statement for every pixel.
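For comparison, the unrolled statements above can be restated as a loop. This is a minimal sketch and is not taken from me.c: the function name sad_row_scalar and the length parameter n are hypothetical, introduced only for illustration. Whether the compiler turns abs() into branch-free code depends on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>  /* abs */

/* Hypothetical helper (not part of me.c): SAD of one row of n 8-bit pixels. */
static int sad_row_scalar(const uint8_t *bptr, const uint8_t *cptr, int n)
{
    assert(bptr != NULL && cptr != NULL && n >= 0);  /* precondition */

    int error = 0;
    for (int i = 0; i < n; i++) {
        /* Both uint8_t operands are promoted to int before the subtraction,
           so the difference can be negative and abs() is well defined. */
        error += abs(bptr[i] - cptr[i]);
    }
    return error;
}
```

Calling sad_row_scalar(bptr, cptr, 16) adds up the same 16 absolute differences as the unrolled version, but it returns the sum instead of accumulating into "error" and does not advance the pointers.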
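The same computation can be expressed far more compactly with SSE2, whose PSADBW instruction (_mm_sad_epu8) computes the absolute differences of 16 pairs of unsigned bytes and sums them into two partial results, one per 64-bit half of the register, without any branches. In this example, [http://en.wikipedia.org/wiki/Intrinsic_function SSE2 intrinsics] are used.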
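a = _mm_loadu_si128((__m128i *) bptr);  /* unaligned load of 16 pixels from bptr */
b = _mm_loadu_si128((__m128i *) cptr);  /* unaligned load of 16 pixels from cptr */
a = _mm_sad_epu8(a, b);                 /* two partial SADs, one per 64-bit half */
b = _mm_srli_si128(a, 8);               /* move the upper partial sum to the lower half */
a = _mm_add_epi32(a, b);                /* add the two partial sums */
error += _mm_cvtsi128_si32(a);          /* extract the low 32 bits and accumulate */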
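The intrinsics above can be wrapped in a function and applied to every row of a 16x16 block. The following is a sketch under stated assumptions, not the code from me.c: the names sad16_row and sad16x16 and the stride parameter (bytes between the starts of consecutive rows) are hypothetical. It relies on the __SSE2__ macro, which GCC and Clang define when SSE2 is enabled, and falls back to scalar code otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>      /* abs */
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Hypothetical helper: SAD of 16 consecutive 8-bit pixels. */
static int sad16_row(const uint8_t *bptr, const uint8_t *cptr)
{
#if defined(__SSE2__)
    __m128i a = _mm_loadu_si128((const __m128i *) bptr);
    __m128i b = _mm_loadu_si128((const __m128i *) cptr);
    a = _mm_sad_epu8(a, b);       /* partial sums in bits 0..15 and 64..79 */
    b = _mm_srli_si128(a, 8);     /* bring the upper partial sum down */
    a = _mm_add_epi32(a, b);      /* total for all 16 pixels */
    return _mm_cvtsi128_si32(a);
#else
    /* Scalar fallback for targets without SSE2. */
    int error = 0;
    for (int i = 0; i < 16; i++)
        error += abs(bptr[i] - cptr[i]);
    return error;
#endif
}

/* Hypothetical helper: SAD of a 16x16 block.
   stride is the distance in bytes between the starts of two rows. */
static int sad16x16(const uint8_t *cur, const uint8_t *ref, ptrdiff_t stride)
{
    assert(cur != NULL && ref != NULL);  /* precondition */

    int error = 0;                       /* at most 256 * 255, fits in int */
    for (int row = 0; row < 16; row++) {
        error += sad16_row(cur, ref);
        cur += stride;
        ref += stride;
    }
    return error;
}
```

Unaligned loads are used because a motion search generally compares the current block against reference positions at arbitrary pixel offsets, so 16-byte alignment cannot be assumed.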