An example is a [http://people.mozilla.org/~bjacob/ta/results/intel-celeron-1.8GHz-family-15-model-2-stepping-7-toshiba-satellite/ Pentium-4-era Celeron] exhibiting NaN-related slow behavior with SSE2 instructions, specifically on floating-point multiplications where one of the operands is NaN. See the alert about float multiplication and about double multiplication in [http://people.mozilla.org/~bjacob/ta/results/intel-celeron-1.8GHz-family-15-model-2-stepping-7-toshiba-satellite/time-math-x86-32-sse2-DAZ-FTZ.txt this log].
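The kind of measurement described above can be approximated with a small timing loop. The following is only a minimal sketch, not the harness that produced the linked logs: the function name <code>time_float_multiplies</code> and its parameters are hypothetical, it uses the coarse standard <code>clock()</code> timer, and it does not set the DAZ/FTZ modes mentioned in the log file name. Whether the multiplication is compiled to an SSE2 instruction depends on the compiler and flags (for example, GCC on 32-bit x86 needs <code>-msse2 -mfpmath=sse</code>).

```c
#include <math.h>
#include <time.h>

/* Hypothetical helper: times `iterations` single-precision multiplications
 * in which one operand is `operand`, and returns the elapsed CPU time in
 * seconds. The volatile qualifiers force a real load, multiply and store
 * on every iteration so the compiler cannot fold the loop away. */
static double time_float_multiplies(float operand, long iterations)
{
    volatile float x = operand;
    volatile float sink = 0.0f;

    clock_t start = clock();
    for (long i = 0; i < iterations; ++i) {
        sink = x * 1.5f;   /* the operation being timed */
    }
    clock_t end = clock();

    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Comparing <code>time_float_multiplies(1.0f, n)</code> with <code>time_float_multiplies(NAN, n)</code> for a large <code>n</code> would show whether a given CPU treats a NaN operand differently. A single run is noisy, so repeated runs are needed before drawing any conclusion.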
=== Some recent AMD CPUs have abnormally ''fast'' comparisons with NaN values, even with SSE2 instructions ===
This is another reason why the notion that switching to SSE2 solves all NaN issues is not quite true.
An [http://people.mozilla.org/~bjacob/ta/results/amd-fx-8150-family-21-model-1-stepping-2-benwa-bulldozer/ 8-core AMD FX processor] exhibits abnormally ''fast'' single-precision floating-point equality comparisons (ucomiss / ucomisd instructions) when one of the operands is NaN. See the alerts about float equality comparison in [http://people.mozilla.org/~bjacob/ta/results/amd-fx-8150-family-21-model-1-stepping-2-benwa-bulldozer/time-math-x86-64-DAZ-FTZ.txt this log].
The SSE instructions being used there are ucomiss and ucomisd.
This is particularly worrying because:
* This shows that we have to watch out not just for abnormally ''slow'' but also for abnormally ''fast'' operations.
* This shows that Intel and AMD are not making completely parallel progress on these issues.
* Equality comparison is what we would use if we wanted to manually avoid NaN values. If it does not run in constant time, then manually avoiding specific "bad" values may not be feasible.
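To make the last point concrete, here is a minimal sketch of what "manually avoiding NaN values" with an equality comparison could look like. The helper names <code>is_nan_by_compare</code> and <code>replace_nan</code> are hypothetical and are not taken from any existing code base. On SSE2 builds, compilers typically lower the <code>x != x</code> test to a ucomiss instruction, which is exactly the operation whose timing was found to depend on NaN operands.

```c
#include <math.h>

/* Hypothetical helper: returns 1 if x is NaN, 0 otherwise.
 * NaN is the only floating-point value that compares unequal to itself.
 * Note: options such as -ffast-math allow the compiler to assume NaN
 * never occurs and to remove this test entirely. */
static int is_nan_by_compare(float x)
{
    return x != x;
}

/* Hypothetical helper: substitutes `fallback` for NaN inputs so that
 * later arithmetic never sees a NaN operand. */
static float replace_nan(float x, float fallback)
{
    return is_nan_by_compare(x) ? fallback : x;
}
```

A filter like this only helps if the comparison itself takes the same time for NaN and non-NaN inputs. On the AMD FX processor described above that assumption does not hold, so the filter would itself leak timing information.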