Floating-point math, often used to represent decimal numbers in programming, introduces complexities when comparing values due to its inherent inexactness. This article delves into the nuances of comparing floating-point numbers, specifically addressing whether the greater than operator (>) performs comparisons as intended and exploring alternative techniques for robust comparisons.
The Problem with Direct Comparisons
Many decimal values, 0.1 among them, cannot be represented exactly in binary floating-point format. Limited precision and rounding errors can lead to unexpected outcomes when using direct comparisons with operators like ‘>’. Slight variations in the order of operations or in intermediate precision can alter results, making seemingly straightforward comparisons unreliable.
For example:
float f = 0.1f;
float sum = 0;
// Add 0.1f ten times; each addition rounds the running total.
for (int i = 0; i < 10; ++i)
    sum += f;
// Multiply 0.1f by ten; unlike the loop, this rounds only once.
float product = f * 10;
printf("sum = %1.15f, product = %1.15f\n", sum, product);
This code attempts to calculate ‘one’ in two ways: repeated addition and a single multiplication. Mathematically the two results should be identical, but due to accumulated rounding errors they are not: on a typical IEEE 754 implementation this prints sum = 1.000000119209290 and product = 1.000000000000000. This demonstrates why direct comparison using ‘>’ (or ‘==’) can be problematic.
Epsilon Comparisons: Defining “Close Enough”
To address the inexactness of floating-point math, epsilon comparisons introduce a tolerance range. Two numbers are considered equal if their absolute difference is within a defined epsilon (ε) value:
bool isEqual = fabs(f1 - f2) <= epsilon;
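Wrapped up as a self-contained helper, this might look as follows (the function name is illustrative, not from any standard library):

#include <math.h>      // fabs
#include <stdbool.h>

// Absolute epsilon comparison: f1 and f2 are considered equal when
// they differ by no more than a fixed tolerance.
bool AlmostEqualAbsolute(float f1, float f2, float epsilon) {
    return fabs(f1 - f2) <= epsilon;
}

With this, AlmostEqualAbsolute(sum, product, 0.00001f) from the earlier example returns true even though sum == product is false.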
However, choosing an appropriate epsilon is crucial. A fixed epsilon like FLT_EPSILON (the gap between 1.0 and the next representable float above it) might be suitable for values near 1.0, but it is far too small for larger numbers and far too large for smaller ones. For numbers greater than 2.0, the gap between adjacent floats exceeds FLT_EPSILON, so using it as a tolerance becomes equivalent to a direct equality check.
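A quick look at the spacing of floats illustrates this; here is a minimal sketch using the standard nextafterf function:

#include <float.h>    // FLT_EPSILON
#include <math.h>     // nextafterf, INFINITY
#include <stdio.h>

int main(void) {
    // Gap between 1000.0f and the next representable float above it.
    float gap = nextafterf(1000.0f, INFINITY) - 1000.0f;
    // Prints a gap of about 6.1e-05, vastly larger than FLT_EPSILON
    // (~1.2e-07), so fabs(a - b) <= FLT_EPSILON can only hold when a == b.
    printf("gap at 1000 = %g, FLT_EPSILON = %g\n", gap, FLT_EPSILON);
    return 0;
}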
Relative Epsilon Comparisons: Scaling with Magnitude
Relative epsilon comparisons address the limitations of fixed epsilons by scaling the tolerance with the magnitude of the compared numbers:
#include <math.h>      // fabs
#include <stdbool.h>

bool AlmostEqualRelative(float A, float B, float maxRelDiff) {
    float diff = fabs(A - B);
    // Scale the tolerance by the larger of the two magnitudes.
    float largest = (fabs(B) > fabs(A)) ? fabs(B) : fabs(A);
    return diff <= largest * maxRelDiff;
}
This approach compares the difference to a fraction of the larger of the two numbers, providing a consistent comparison across different magnitudes. Setting maxRelDiff to FLT_EPSILON, or a small multiple thereof, often yields reasonable results.
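For instance, revisiting the earlier sum/product example (the tolerance here is illustrative; FLT_EPSILON comes from <float.h>):

// sum (1.000000119...) and product (1.0) from the earlier loop example.
bool closeEnough = AlmostEqualRelative(sum, product, 4 * FLT_EPSILON);  // true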
ULP Comparisons: Units in the Last Place
An alternative technique involves measuring the difference between two floats in Units in the Last Place (ULPs), i.e., how many representable floats lie between them. Because IEEE 754 floats of the same sign, reinterpreted as integers, sort in the same order as the floats themselves, subtracting the integer representations of two same-sign floats yields their distance in ULPs. A smaller ULP difference signifies closer proximity.
bool AlmostEqualUlps(float A, float B, int maxUlpsDiff); // One possible implementation is sketched below
ULP comparisons offer an intuitive measure of floating-point proximity. However, they are sensitive to sign differences and exhibit peculiarities near zero and with special values like infinity and NaN.
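A minimal sketch of such a function, assuming 32-bit IEEE 754 floats and using memcpy to reinterpret the bits (the helper name FloatToInt32 is illustrative):

#include <math.h>      // isnan
#include <stdbool.h>
#include <stdint.h>
#include <string.h>    // memcpy

// Reinterpret a float's bit pattern as a signed 32-bit integer.
static int32_t FloatToInt32(float f) {
    int32_t i;
    memcpy(&i, &f, sizeof(i));
    return i;
}

bool AlmostEqualUlps(float A, float B, int maxUlpsDiff) {
    // NaN compares unequal to everything, including itself.
    if (isnan(A) || isnan(B))
        return false;

    int32_t intA = FloatToInt32(A);
    int32_t intB = FloatToInt32(B);

    // Different signs: equal only if both are zero (+0.0f == -0.0f).
    if ((intA < 0) != (intB < 0))
        return A == B;

    // Same sign: the integer difference is the distance in ULPs.
    int32_t ulpsDiff = intA - intB;
    if (ulpsDiff < 0)
        ulpsDiff = -ulpsDiff;
    return ulpsDiff <= maxUlpsDiff;
}

Note how the explicit sign and NaN checks handle exactly the peculiarities mentioned above.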
Addressing Zero and Catastrophic Cancellation
Comparing values near zero requires special consideration. Relative comparisons break down because even tiny absolute differences are enormous relative to zero. Absolute epsilon comparisons, with carefully chosen tolerances, are necessary in such cases. Catastrophic cancellation, where subtracting nearly equal numbers wipes out most of the significant digits, makes it essential to understand the underlying calculation and its error sources. For example, sin(pi) calculated with floating-point numbers does not return exactly zero, because pi itself cannot be represented exactly, even though mathematically the result should be zero.
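A short demonstration (assuming M_PI is available from <math.h>, as on POSIX systems; the tolerance is illustrative):

#include <math.h>
#include <stdio.h>

int main(void) {
    // M_PI is only the closest double to pi, so sin(M_PI) returns
    // (roughly) the amount by which M_PI falls short of the true pi.
    double result = sin(M_PI);
    printf("sin(M_PI) = %.17g\n", result);  // about 1.2246467991473532e-16

    // Near zero, a relative comparison against 0.0 would reject any
    // nonzero result, so an absolute tolerance is the right tool here.
    double absEpsilon = 1e-12;  // tolerance chosen for this calculation
    printf("effectively zero: %s\n",
           fabs(result) <= absEpsilon ? "yes" : "no");
    return 0;
}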
Conclusion
The greater than operator (>) works exactly as specified on floating-point values; the problem is that the values themselves carry rounding error, so direct comparisons often produce surprising results. Techniques like epsilon comparisons (both absolute and relative) and ULP comparisons provide more robust alternatives by defining acceptable tolerance ranges. However, understanding the specific context, the potential for catastrophic cancellation, and the behavior near zero is crucial for choosing the most appropriate comparison method. Always consider the nature of the calculation and its expected error margins to avoid unexpected behavior and ensure the reliability of your comparisons.