User:Bjacob/ArithmeticTimingDifferences

From MozillaWiki
Revision as of 14:24, 3 June 2013 by Bjacob

This page documents which scalar floating-point arithmetic instructions fail to execute in constant time, that is, whose execution time depends on their operands, on various CPUs. It also covers some multi-instruction constructs such as math functions.

Terminology

x86 architecture

Instruction sets

In the x86 architecture, there are two competing floating-point arithmetic instruction sets: the legacy x87 instruction set, and SSE.

While SSE offers SIMD instructions, it also provides a full replacement for the old x87 scalar (SISD) instructions. This document is only concerned with scalar instructions.

Here, SSE refers generically to all versions of the SSE instruction sets. The original SSE instruction set (hereafter SSE1) was introduced in the Pentium III and only handles single-precision values (floats). The subsequent SSE2 instruction set, introduced in the Pentium 4, added support for double-precision values (doubles). SSE3 and newer instruction sets are outside the scope of this document.

Handling of denormals

x86 CPUs have two distinct flags that can optionally be set to request non-IEEE-compliant handling of denormals. Both are bits in the SSE control register (MXCSR), so they do not affect x87 instructions:

  • FTZ (Flush To Zero) causes any denormal result to be flushed to zero.
  • DAZ (Denormals Are Zero) causes any denormal operand to be handled as if it were zero.

Not all CPUs have these flags. Even CPUs with SSE2 instructions can lack DAZ support: for example, the Pentium M has SSE2 but not DAZ. Some early Pentium 4 models are reported to have the same limitation.
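To make the effect of FTZ concrete, here is a minimal sketch in C, assuming a GCC- or Clang-style compiler. It uses the intrinsics `_mm_getcsr`, `_mm_setcsr` and `_MM_SET_FLUSH_ZERO_MODE` from `<xmmintrin.h>`; the names `HAVE_SSE_FTZ`, `ftz_available`, `half_of_flt_min` and `check_ftz` are made up for this illustration. The build-time test relies on the `__SSE_MATH__` macro, which is an assumption about the compiler, and on builds where scalar math goes through x87 the sketch simply leaves the flags alone.

```c
#include <assert.h>
#include <float.h>

#if defined(__SSE_MATH__) || defined(__x86_64__) || defined(_M_X64)
#include <xmmintrin.h>      /* _mm_getcsr, _mm_setcsr, _MM_SET_FLUSH_ZERO_MODE */
#define HAVE_SSE_FTZ 1
#else
#define HAVE_SSE_FTZ 0      /* x87-only or non-x86 build: nothing to toggle here */
#endif

/* Reports whether this build can toggle FTZ through MXCSR. */
int ftz_available(void)
{
    return HAVE_SSE_FTZ;
}

/* Computes FLT_MIN * 0.5f, whose exact value is denormal, with FTZ off
   (ftz == 0) or on (ftz != 0). The previous MXCSR value is restored
   before returning. */
float half_of_flt_min(int ftz)
{
    /* volatile: forces a run-time multiply and keeps it between the two
       MXCSR writes instead of letting the compiler move it. */
    volatile float smallest_normal = FLT_MIN;
    volatile float half = 0.5f;
    volatile float result;
#if HAVE_SSE_FTZ
    unsigned int saved = _mm_getcsr();
    _MM_SET_FLUSH_ZERO_MODE(ftz ? _MM_FLUSH_ZERO_ON : _MM_FLUSH_ZERO_OFF);
    result = smallest_normal * half;
    _mm_setcsr(saved);
#else
    (void)ftz;
    result = smallest_normal * half;
#endif
    return result;
}

/* Self-check of the behaviour described in the list above. */
void check_ftz(void)
{
    float r = half_of_flt_min(0);
    assert(r > 0.0f && r < FLT_MIN);            /* FTZ off: denormal result */
    if (ftz_available())
        assert(half_of_flt_min(1) == 0.0f);     /* FTZ on: flushed to zero */
}
```

DAZ is controlled the same way through a different MXCSR bit (the `_MM_SET_DENORMALS_ZERO_MODE` macro in `<pmmintrin.h>`). The sketch deliberately does not touch it, because setting that bit on a CPU without DAZ support faults.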

x86-64 architecture

The only way in which x86-64 differs from x86-32 in these matters is that SSE2 is unconditionally available on x86-64. Hereafter, x86 refers generically to x86-32 or x86-64.

ARM architecture

Instruction sets

ARM has one hardware scalar floating-point instruction set: VFP.

NEON is purely a SIMD instruction set, so it doesn't concern us here.

Many ARM CPUs do not have NEON, and some ARM CPUs do not even have VFP. On these, software emulation of floating-point arithmetic is used. For our purposes, this can be considered another separate instruction set.

Handling of denormals

ARM has a single flag, Flush To Zero (FZ), which plays the role of both of x86's flags, FTZ and DAZ.
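As a rough sketch of how the FZ flag can be toggled from C on a 32-bit ARM build with VFP: FZ is bit 24 of the FPSCR register, which is read and written with the `vmrs` and `vmsr` instructions. The helper names `fpscr_with_fz` and `set_flush_to_zero` are made up for this illustration, and the build-time test on `__arm__` and `__ARM_FP` assumes a GCC- or Clang-style compiler. On any other target the function does nothing and returns 0.

```c
#include <assert.h>
#include <stdint.h>

#define FPSCR_FZ (UINT32_C(1) << 24)    /* FZ is bit 24 of the VFP FPSCR */

/* Returns the given FPSCR value with the FZ bit set or cleared. */
uint32_t fpscr_with_fz(uint32_t fpscr, int enable)
{
    return enable ? (fpscr | FPSCR_FZ) : (fpscr & ~FPSCR_FZ);
}

/* Turns flush-to-zero on or off for the calling thread.
   Returns 1 if the flag was written, 0 if this build has no VFP to configure. */
int set_flush_to_zero(int enable)
{
#if defined(__arm__) && defined(__ARM_FP)
    uint32_t fpscr, readback;
    __asm__ volatile("vmrs %0, fpscr" : "=r"(fpscr));      /* read FPSCR */
    fpscr = fpscr_with_fz(fpscr, enable);
    __asm__ volatile("vmsr fpscr, %0" : : "r"(fpscr));     /* write it back */
    __asm__ volatile("vmrs %0, fpscr" : "=r"(readback));
    assert((readback & FPSCR_FZ) == (enable ? FPSCR_FZ : 0));
    return 1;
#else
    (void)enable;   /* not a VFP build: nothing to do */
    return 0;
#endif
}
```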

Methodology

The code performing the measurements is on GitHub.
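For readers without access to that code, the following is only an illustrative sketch of the general idea, not the code from the repository: time a long chain of identical operations on one operand, then repeat with a different kind of operand (for example a denormal) and compare. The names `time_multiplies` and `sample_denormal` are made up for this sketch, and `clock()` is used for simplicity, which gives coarse CPU-time resolution.

```c
#include <assert.h>
#include <float.h>
#include <time.h>

/* A denormal operand: half of the smallest normal float. */
float sample_denormal(void)
{
    return FLT_MIN / 2;
}

/* Runs `iterations` dependent multiplications x = x * 1.0f starting from
   `operand` and returns the CPU time used, in seconds. The final value of x
   is stored in *final_value so the loop cannot be optimized away. */
double time_multiplies(float operand, long iterations, float *final_value)
{
    assert(iterations > 0 && final_value != 0);

    /* volatile: keeps the compiler from folding x * 1.0f into x */
    volatile float one = 1.0f;
    float x = operand;

    clock_t begin = clock();
    for (long i = 0; i < iterations; ++i)
        x = x * one;    /* x never changes, so every multiply sees the same operand */
    clock_t end = clock();

    *final_value = x;
    return (double)(end - begin) / CLOCKS_PER_SEC;
}
```

Calling `time_multiplies(1.5f, n, &out)` and `time_multiplies(sample_denormal(), n, &out)` with the same large `n` and comparing the two durations indicates whether multiplication is operand-dependent on the machine at hand. A single run is noisy, so repeated runs are advisable.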