User:Bjacob/ArithmeticTimingDifferences
This page documents which scalar floating-point arithmetic instructions fail to execute in constant time (that is, whose execution time depends on their operands) on various CPUs. We also study some multi-instruction constructs, such as math functions.
Terminology
x86 architecture
Instruction sets
In the x86 architecture, there are two competing floating-point arithmetic instruction sets: the legacy x87 instruction set, and SSE.
While SSE offers SIMD instructions, it also provides a full replacement for the old x87 scalar (SISD) instructions. We are only concerned with scalar instructions in this document.
Here we use SSE to refer generically to all versions of the SSE instruction set. The original SSE instruction set (hereafter SSE1) was introduced in the Pentium III and only handles floats. The subsequent SSE2 instruction set, introduced in the Pentium 4, added support for doubles. We are not concerned with SSE3 and newer instruction sets in this document.
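To make the scalar/SIMD distinction concrete, here is a minimal sketch of scalar SSE additions written with compiler intrinsics. The function names `add_float_scalar` and `add_double_scalar` are made up for this illustration, and the plain-C fallback for non-x86 targets is an assumption added only so the snippet compiles everywhere. `_mm_add_ss` operates on the lowest float lane only (SSE1), and `_mm_add_sd` on the lowest double lane only (SSE2).

```c
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in the SSE1 header */
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

/* Hypothetical helper: scalar single-precision add (ADDSS, an SSE1 instruction). */
float add_float_scalar(float a, float b) {
#if HAVE_SSE2
    return _mm_cvtss_f32(_mm_add_ss(_mm_set_ss(a), _mm_set_ss(b)));
#else
    return a + b;  /* fallback on targets without SSE2 */
#endif
}

/* Hypothetical helper: scalar double-precision add (ADDSD, an SSE2 instruction). */
double add_double_scalar(double a, double b) {
#if HAVE_SSE2
    return _mm_cvtsd_f64(_mm_add_sd(_mm_set_sd(a), _mm_set_sd(b)));
#else
    return a + b;  /* fallback on targets without SSE2 */
#endif
}
```

In practice one rarely writes these intrinsics by hand: on x86-64, compilers typically emit the same scalar SSE instructions for ordinary `a + b` expressions.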
Handling of denormals
x86 CPUs have two distinct flags, both bits of the SSE control register (MXCSR), that can be set to enable non-IEEE-compliant handling of denormals:
- FTZ (Flush To Zero) causes any denormal result to be flushed to zero.
- DAZ (Denormals Are Zero) causes any denormal operand to be handled as if it were zero.
Not all CPUs have these flags. Even CPUs with SSE2 instructions can lack DAZ support: for example, the Pentium M has SSE2 but not DAZ. Some early Pentium 4 models are reportedly in the same situation.
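The effect of FTZ can be observed with a short sketch. It assumes that scalar float math is compiled to SSE instructions (GCC and Clang define `__SSE_MATH__` in that case), since these flags do not affect x87 code. The helper names `is_denormal`, `set_mxcsr_bits`, and `mul_runtime` are invented for this illustration. The sketch exercises only FTZ: writing the DAZ bit on a CPU that lacks DAZ raises a fault, so real code would have to check for support first.

```c
#include <float.h>
#include <math.h>

#if defined(__SSE_MATH__) || defined(_M_X64)
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr */
#define HAVE_SSE_MATH 1
#else
#define HAVE_SSE_MATH 0
#endif

/* Bit positions of the two flags in the MXCSR register. */
#define MXCSR_DAZ (1u << 6)   /* only valid on CPUs that support DAZ */
#define MXCSR_FTZ (1u << 15)

/* Returns nonzero if x is a denormal (subnormal) number. */
int is_denormal(float x) {
    return fpclassify(x) == FP_SUBNORMAL;
}

/* Hypothetical helper: sets or clears the given MXCSR bits.
   Returns 1 on success, 0 if this build has no SSE scalar math. */
int set_mxcsr_bits(unsigned int mask, int on) {
#if HAVE_SSE_MATH
    unsigned int csr = _mm_getcsr();
    csr = on ? (csr | mask) : (csr & ~mask);
    _mm_setcsr(csr);
    return 1;
#else
    (void)mask;
    (void)on;
    return 0;
#endif
}

/* Multiplies at run time; volatile keeps the compiler from folding
   the product at compile time, where the flags would have no effect. */
float mul_runtime(float a, float b) {
    volatile float va = a, vb = b;
    volatile float r = va * vb;
    return r;
}
```

With FTZ off, `mul_runtime(FLT_MIN, 0.5f)` yields a denormal. After `set_mxcsr_bits(MXCSR_FTZ, 1)`, the same call should yield zero on a build that uses SSE scalar math.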
x86-64 architecture
The only way in which x86-64 differs from x86-32 in these matters is that SSE2 is unconditionally available on x86-64. Hereafter, we let x86 refer generically to x86-32 or x86-64.
ARM architecture
Instruction sets
ARM has one hardware, scalar, floating-point instruction set: VFP.
NEON is purely a SIMD instruction set, so it doesn't concern us here.
Many ARM CPUs do not have NEON, and some ARM CPUs do not even have VFP. On these, software emulation of floating-point arithmetic is used. For our purposes, this can be considered another separate instruction set.
Handling of denormals
ARM has a single flag, Flush To Zero (FZ), which plays the role of both of x86's flags, FTZ and DAZ.
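As a rough sketch, the FZ flag can be toggled by rewriting bit 24 of the VFP status and control register (FPSCR). The helper names `fpscr_with_fz` and `set_arm_fz` are invented here, and the inline assembly assumes a GCC-compatible compiler targeting 32-bit ARM with VFP. On any other target the function does nothing and returns 0.

```c
#include <stdint.h>

/* FZ is bit 24 of the VFP FPSCR register. */
#define ARM_FZ_BIT (UINT32_C(1) << 24)

/* Pure helper: returns the FPSCR value with FZ set or cleared. */
uint32_t fpscr_with_fz(uint32_t fpscr, int on) {
    return on ? (fpscr | ARM_FZ_BIT) : (fpscr & ~ARM_FZ_BIT);
}

/* Hypothetical helper: enables or disables flush-to-zero mode.
   Returns 1 if the flag was written, 0 on targets without VFP. */
int set_arm_fz(int on) {
#if defined(__arm__) && defined(__ARM_FP)
    uint32_t fpscr;
    __asm__ volatile("vmrs %0, fpscr" : "=r"(fpscr));     /* read FPSCR */
    fpscr = fpscr_with_fz(fpscr, on);
    __asm__ volatile("vmsr fpscr, %0" : : "r"(fpscr));    /* write it back */
    return 1;
#else
    (void)on;
    return 0;
#endif
}
```

Keeping the bit manipulation in a separate pure function makes it checkable on any host, even where the register itself is unavailable.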
Methodology
The code performing the measurements is on GitHub.