What is the most effective way for float and double comparison?
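What would be the most efficient way to compare two `double` or two `float` values?
Simply doing this is not correct:

```cpp
bool CompareDoubles1 (double A, double B)
{
   return A == B;
}
```

But something like:

```cpp
// EPSILON is assumed to be a small positive constant defined elsewhere.
bool CompareDoubles2 (double A, double B)
{
   double diff = A - B;
   return (diff < EPSILON) && (-diff < EPSILON);
}
```

seems to waste processing.
Does anyone know a smarter float comparer?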
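Be extremely careful when using any of the other suggestions. It all depends on context.
I spent a long time on problems that came from a single assumption about what "equal" means. The underlying issues included:
- An implicit assumption built into an algorithm.
- Using the same epsilon for lines measured in inches and for lines measured in mils (.001 inch).
- Using the same epsilon for both the cosine of angles and the length of lines!
- Using such a comparison function to sort items in a collection. (In this case, using the built-in C++ operator == for doubles produced correct results.)
As I said: it all depends on context.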
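To make the context dependence concrete, here is a minimal sketch, not taken from the answer above. The helper names and the sample values mentioned afterwards are hypothetical. It assumes a plain absolute-epsilon test and shows two ways such a test can misbehave: it is not transitive, and it gives different verdicts for the same lengths expressed in inches and in mils.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper for illustration: a plain absolute-epsilon comparison.
bool nearlyEqualAbs(double a, double b, double epsilon) {
    assert(epsilon > 0.0);
    return std::fabs(a - b) < epsilon;
}

// True when a~b and b~c hold but a~c does not, i.e. the test is not transitive.
bool breaksTransitivity(double a, double b, double c, double epsilon) {
    return nearlyEqualAbs(a, b, epsilon) && nearlyEqualAbs(b, c, epsilon) &&
           !nearlyEqualAbs(a, c, epsilon);
}

// True when two lengths match in inches but no longer match after both are
// converted to mils (multiplied by 1000) and compared with the same epsilon.
bool breaksUnderUnitChange(double aInches, double bInches, double epsilon) {
    return nearlyEqualAbs(aInches, bInches, epsilon) &&
           !nearlyEqualAbs(aInches * 1000.0, bInches * 1000.0, epsilon);
}
```

With an epsilon of 0.001, the triple 0.0, 0.0007, 0.0014 breaks transitivity, and the pair 1.0 and 1.0005 stops matching once both values are scaled to mils.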
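Comparing against an epsilon value is what most people do (even in game programming).
You should change your implementation a little, though:

```cpp
#include <cmath>

// EPSILON is assumed to be a small positive constant defined elsewhere.
bool AreSame(double a, double b)
{
    return fabs(a - b) < EPSILON;
}
```

Edit: Christer has added a lot of great information on this topic in a recent blog post. Enjoy.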
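I found that the Google C++ Testing Framework contains a nice cross-platform, template-based implementation of AlmostEqual2sComplement that works on both doubles and floats. Given that it is released under the BSD license, using it in your own code should be no problem, as long as you retain the license. I extracted the code below from http://code.google.com/p/googletest/source/browse/trunk/include/gtest/internal/gtest-internal.h and added the license on top.
Be sure to #define GTEST_OS_WINDOWS to some value (or change the code where it is used to something that fits your codebase; it is BSD licensed, after all).
Usage example:

```cpp
double left  = 0.1 + 0.2;  // something
double right = 0.3;        // something
const FloatingPoint<double> lhs(left), rhs(right);

if (lhs.AlmostEquals(rhs)) {
  // they're equal!
}
```

Here is the code: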
```cpp
// Copyright 2005, Google Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
//     * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
//     * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
//     * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Authors: [email protected] (Zhanyong Wan), [email protected] (Sean Mcafee)
//
// The Google C++ Testing Framework (Google Test)

#include <stddef.h>  // size_t
#include <limits>    // std::numeric_limits

// This template class serves as a compile-time function from size to
// type.  It maps a size in bytes to a primitive type with that
// size. e.g.
//
//   TypeWithSize<4>::UInt
//
// is typedef-ed to be unsigned int (unsigned integer made up of 4
// bytes).
//
// Such functionality should belong to STL, but I cannot find it
// there.
//
// Google Test uses this class in the implementation of floating-point
// comparison.
//
// For now it only handles UInt (unsigned int) as that's all Google Test
// needs.  Other types can be easily added in the future if need
// arises.
template <size_t size>
class TypeWithSize {
 public:
  // This prevents the user from using TypeWithSize<N> with incorrect
  // values of N.
  typedef void UInt;
};

// The specialization for size 4.
template <>
class TypeWithSize<4> {
 public:
  // unsigned int has size 4 in both gcc and MSVC.
  //
  // As base/basictypes.h doesn't compile on Windows, we cannot use
  // uint32, uint64, and etc here.
  typedef int Int;
  typedef unsigned int UInt;
};

// The specialization for size 8.
template <>
class TypeWithSize<8> {
 public:
#if GTEST_OS_WINDOWS
  typedef __int64 Int;
  typedef unsigned __int64 UInt;
#else
  typedef long long Int;  // NOLINT
  typedef unsigned long long UInt;  // NOLINT
#endif  // GTEST_OS_WINDOWS
};

// This template class represents an IEEE floating-point number
// (either single-precision or double-precision, depending on the
// template parameters).
//
// The purpose of this class is to do more sophisticated number
// comparison.  (Due to round-off error, etc, it's very unlikely that
// two floating-points will be equal exactly.  Hence a naive
// comparison by the == operation often doesn't work.)
//
// Format of IEEE floating-point:
//
//   The most-significant bit being the leftmost, an IEEE
//   floating-point looks like
//
//     sign_bit exponent_bits fraction_bits
//
//   Here, sign_bit is a single bit that designates the sign of the
//   number.
//
//   For float, there are 8 exponent bits and 23 fraction bits.
//
//   For double, there are 11 exponent bits and 52 fraction bits.
//
//   More details can be found at
//   http://en.wikipedia.org/wiki/IEEE_floating-point_standard.
//
// Template parameter:
//
//   RawType: the raw floating-point type (either float or double)
template <typename RawType>
class FloatingPoint {
 public:
  // Defines the unsigned integer type that has the same size as the
  // floating point number.
  typedef typename TypeWithSize<sizeof(RawType)>::UInt Bits;

  // Constants.

  // # of bits in a number.
  static const size_t kBitCount = 8*sizeof(RawType);

  // # of fraction bits in a number.
  static const size_t kFractionBitCount =
    std::numeric_limits<RawType>::digits - 1;

  // # of exponent bits in a number.
  static const size_t kExponentBitCount = kBitCount - 1 - kFractionBitCount;

  // The mask for the sign bit.
  static const Bits kSignBitMask = static_cast<Bits>(1) << (kBitCount - 1);

  // The mask for the fraction bits.
  static const Bits kFractionBitMask =
    ~static_cast<Bits>(0) >> (kExponentBitCount + 1);

  // The mask for the exponent bits.
  static const Bits kExponentBitMask = ~(kSignBitMask | kFractionBitMask);

  // How many ULP's (Units in the Last Place) we want to tolerate when
  // comparing two numbers.  The larger the value, the more error we
  // allow.  A 0 value means that two numbers must be exactly the same
  // to be considered equal.
  //
  // The maximum error of a single floating-point operation is 0.5
  // units in the last place.  On Intel CPU's, all floating-point
  // calculations are done with 80-bit precision, while double has 64
  // bits.  Therefore, 4 should be enough for ordinary use.
  //
  // See the following article for more details on ULP:
  // http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm.
  static const size_t kMaxUlps = 4;

  // Constructs a FloatingPoint from a raw floating-point number.
  //
  // On an Intel CPU, passing a non-normalized NAN (Not a Number)
  // around may change its bits, although the new value is guaranteed
  // to be also a NAN.  Therefore, don't expect this constructor to
  // preserve the bits in x when x is a NAN.
  explicit FloatingPoint(const RawType& x) { u_.value_ = x; }

  // Static methods

  // Reinterprets a bit pattern as a floating-point number.
  //
  // This function is needed to test the AlmostEquals() method.
  static RawType ReinterpretBits(const Bits bits) {
    FloatingPoint fp(0);
    fp.u_.bits_ = bits;
    return fp.u_.value_;
  }

  // Returns the floating-point number that represent positive infinity.
  static RawType Infinity() {
    return ReinterpretBits(kExponentBitMask);
  }

  // Non-static methods

  // Returns the bits that represents this number.
  const Bits &bits() const { return u_.bits_; }

  // Returns the exponent bits of this number.
  Bits exponent_bits() const { return kExponentBitMask & u_.bits_; }

  // Returns the fraction bits of this number.
  Bits fraction_bits() const { return kFractionBitMask & u_.bits_; }

  // Returns the sign bit of this number.
  Bits sign_bit() const { return kSignBitMask & u_.bits_; }

  // Returns true iff this is NAN (not a number).
  bool is_nan() const {
    // It's a NAN if the exponent bits are all ones and the fraction
    // bits are not entirely zeros.
    return (exponent_bits() == kExponentBitMask) && (fraction_bits() != 0);
  }

  // Returns true iff this number is at most kMaxUlps ULP's away from
  // rhs.  In particular, this function:
  //
  //   - returns false if either number is (or both are) NAN.
  //   - treats really large numbers as almost equal to infinity.
  //   - thinks +0.0 and -0.0 are 0 ULP's apart.
  bool AlmostEquals(const FloatingPoint& rhs) const {
    // The IEEE standard says that any comparison operation involving
    // a NAN must return false.
    if (is_nan() || rhs.is_nan()) return false;

    return DistanceBetweenSignAndMagnitudeNumbers(u_.bits_, rhs.u_.bits_)
        <= kMaxUlps;
  }

 private:
  // The data type used to store the actual floating-point number.
  union FloatingPointUnion {
    RawType value_;  // The raw floating-point number.
    Bits bits_;      // The bits that represent the number.
  };

  // Converts an integer from the sign-and-magnitude representation to
  // the biased representation.  More precisely, let N be 2 to the
  // power of (kBitCount - 1), an integer x is represented by the
  // unsigned number x + N.
  //
  // For instance,
  //
  //   -N + 1 (the most negative number representable using
  //          sign-and-magnitude) is represented by 1;
  //   0      is represented by N; and
  //   N - 1  (the biggest number representable using
  //          sign-and-magnitude) is represented by 2N - 1.
  //
  // Read http://en.wikipedia.org/wiki/Signed_number_representations
  // for more details on signed number representations.
  static Bits SignAndMagnitudeToBiased(const Bits &sam) {
    if (kSignBitMask & sam) {
      // sam represents a negative number.
      return ~sam + 1;
    } else {
      // sam represents a positive number.
      return kSignBitMask | sam;
    }
  }

  // Given two numbers in the sign-and-magnitude representation,
  // returns the distance between them as an unsigned number.
  static Bits DistanceBetweenSignAndMagnitudeNumbers(const Bits &sam1,
                                                     const Bits &sam2) {
    const Bits biased1 = SignAndMagnitudeToBiased(sam1);
    const Bits biased2 = SignAndMagnitudeToBiased(sam2);
    return (biased1 >= biased2) ? (biased1 - biased2) : (biased2 - biased1);
  }

  FloatingPointUnion u_;
};
```
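EDIT: This post is 4 years old. It is probably still valid, and the code is nice, but some people have found improvements. It is best to get the latest version straight from the Google Test source code rather than the copy pasted here.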
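Comparing floating-point numbers depends on the context. Since even changing the order of operations can produce different results, it is important to know how "equal" you want the numbers to be.
Comparing Floating Point Numbers by Bruce Dawson is a good place to start when looking at floating-point comparison.
The following definitions are from The Art of Computer Programming by Knuth:

```cpp
#include <cmath>

// Difference is within epsilon of the LARGER magnitude (weaker test).
bool approximatelyEqual(float a, float b, float epsilon)
{
    return fabs(a - b) <= ( (fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * epsilon);
}

// Difference is within epsilon of the SMALLER magnitude (stronger test).
bool essentiallyEqual(float a, float b, float epsilon)
{
    return fabs(a - b) <= ( (fabs(a) > fabs(b) ? fabs(b) : fabs(a)) * epsilon);
}

bool definitelyGreaterThan(float a, float b, float epsilon)
{
    return (a - b) > ( (fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * epsilon);
}

bool definitelyLessThan(float a, float b, float epsilon)
{
    return (b - a) > ( (fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * epsilon);
}
```

Of course, choosing epsilon depends on the context, and determines how equal you want the numbers to be.
Another method of comparing floating-point numbers is to look at the ULP (units in the last place) of the numbers. While it does not deal specifically with comparisons, the paper What Every Computer Scientist Should Know About Floating-Point Arithmetic is a good resource for understanding how floating point works and what the pitfalls are, including what ULP is.
For a more in-depth approach, read Comparing Floating Point Numbers. Here is the code snippet from that link:

```cpp
#include <cassert>
#include <cstdlib>

// Usable AlmostEqual function
// Note: the pointer casts below assume a 32-bit int and rely on type punning,
// which is how the linked article presents it.
bool AlmostEqual2sComplement(float A, float B, int maxUlps)
{
    // Make sure maxUlps is non-negative and small enough that the
    // default NAN won't compare as equal to anything.
    assert(maxUlps > 0 && maxUlps < 4 * 1024 * 1024);
    int aInt = *(int*)&A;
    // Make aInt lexicographically ordered as a twos-complement int
    if (aInt < 0)
        aInt = 0x80000000 - aInt;
    // Make bInt lexicographically ordered as a twos-complement int
    int bInt = *(int*)&B;
    if (bInt < 0)
        bInt = 0x80000000 - bInt;
    int intDiff = abs(aInt - bInt);
    if (intDiff <= maxUlps)
        return true;
    return false;
}
```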
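The ULP idea can also be written without pointer casts. The following is a minimal sketch under stated assumptions, not the code from the linked article: the names `orderedBits` and `almostEqualUlps` are hypothetical, `float` is assumed to be a 32-bit IEEE 754 type, and `std::memcpy` is swapped in for the cast. A 64-bit intermediate is used so that the subtraction cannot overflow.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: maps a float's bit pattern to a signed integer whose
// ordering matches the ordering of the floats themselves.
static std::int64_t orderedBits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);  // copy the bits instead of casting pointers
    // Floats are sign-magnitude: negate the magnitude when the sign bit is set,
    // so that +0.0 and -0.0 both map to 0.
    if (u & 0x80000000u) return -static_cast<std::int64_t>(u & 0x7FFFFFFFu);
    return static_cast<std::int64_t>(u);
}

// True when a and b are at most maxUlps representable floats apart.
bool almostEqualUlps(float a, float b, std::int64_t maxUlps) {
    assert(maxUlps >= 0);
    if (std::isnan(a) || std::isnan(b)) return false;  // NaN never compares equal
    std::int64_t distance = orderedBits(a) - orderedBits(b);
    if (distance < 0) distance = -distance;
    return distance <= maxUlps;
}
```

For example, `almostEqualUlps(1.0f, std::nextafter(1.0f, 2.0f), 1)` holds because the two values are adjacent floats, while 1.0f and 1.0001f are several hundred ULPs apart.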
A portable way to get epsilon in C++ is:

```cpp
#include <limits>
std::numeric_limits<double>::epsilon()
```

The comparison function then becomes:

```cpp
#include <cmath>
#include <limits>

bool AreSame(double a, double b) {
    return std::fabs(a - b) < std::numeric_limits<double>::epsilon();
}
```
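I realize this is an old thread, but this article is one of the most straightforward ones I have found on comparing floating-point numbers. If you want to explore further, it also has more detailed references, and its main site covers a complete range of issues dealing with floating-point numbers: The Floating-Point Guide: Comparison.
A somewhat more practical article is Floating-point tolerances revisited, which notes that there is an absolute tolerance test, which boils down to this in C++:

```cpp
#include <cmath>
#include <limits>

bool absoluteToleranceCompare(double x, double y)
{
    return std::fabs(x - y) <= std::numeric_limits<double>::epsilon() ;
}
```

and a relative tolerance test:

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

bool relativeToleranceCompare(double x, double y)
{
    double maxXY = std::max( std::fabs(x) , std::fabs(y) ) ;
    return std::fabs(x - y) <= std::numeric_limits<double>::epsilon()*maxXY ;
}
```

The article points out the conditions under which each of these tests breaks down on its own; a combined test looks like this:

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

bool combinedToleranceCompare(double x, double y)
{
    double maxXYOne = std::max( { 1.0, std::fabs(x) , std::fabs(y) } ) ;

    return std::fabs(x - y) <= std::numeric_limits<double>::epsilon()*maxXYOne ;
}
```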
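As a rough illustration of why the combined form exists, the sketch below restates the three tests with a caller-supplied tolerance. The names `closeAbs`, `closeRel` and `closeCombined` and the `tol` parameter are assumptions added for this example; they are not from the article.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Absolute test: fine near zero, too strict once magnitudes grow.
bool closeAbs(double x, double y, double tol) {
    assert(tol >= 0.0);
    return std::fabs(x - y) <= tol;
}

// Relative test: scales with magnitude, but the allowed error shrinks to
// nothing as both values approach zero.
bool closeRel(double x, double y, double tol) {
    assert(tol >= 0.0);
    return std::fabs(x - y) <= tol * std::max(std::fabs(x), std::fabs(y));
}

// Combined test: behaves like the absolute test below magnitude 1.0 and like
// the relative test above it.
bool closeCombined(double x, double y, double tol) {
    assert(tol >= 0.0);
    return std::fabs(x - y) <= tol * std::max({1.0, std::fabs(x), std::fabs(y)});
}
```

With `tol = 1e-12`, the pair `1e10` and `1e10 + 1e-5` fails the absolute test but passes the relative and combined ones, while the pair `1e-15` and `0.0` fails the relative test but passes the absolute and combined ones.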
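The code you wrote is bugged:

```cpp
return (diff < EPSILON) && (-diff > EPSILON);
```

The correct code would be:

```cpp
return (diff < EPSILON) && (diff > -EPSILON);
```

(...and yes, this is different)
I wonder whether fabs would make you lose lazy (short-circuit) evaluation in some cases. I would say it depends on the compiler. You might want to try both. If they perform the same on average, take the implementation with fabs.
If you have some information about which of the two floats is more likely to be bigger than the other, you can order the comparisons to take better advantage of the lazy evaluation.
Finally, you might get a better result by inlining this function. It is not likely to improve much, though...
Edit: OJ, thanks for correcting your code. I erased my comment accordingly.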
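`return fabs(a - b) < EPSILON;`
is fine if:
- the order of magnitude of your inputs does not change much
- very small numbers of opposite signs can be treated as equal
Otherwise it will get you into trouble. Double-precision numbers have a resolution of about 16 decimal digits. If the two numbers you are comparing are larger in magnitude than EPSILON*1.0E16, then you might as well be saying:

```cpp
return a==b;
```

I will examine a different approach that assumes you need to worry about the first issue and that the second is fine for your application. A solution would be something like:

```cpp
#include <algorithm>
#include <cmath>

#define VERYSMALL  (1.0E-150)
#define EPSILON    (1.0E-8)
bool AreSame(double a, double b)
{
    double absDiff = fabs(a - b);
    if (absDiff < VERYSMALL)
    {
        return true;
    }

    // Scale the difference by the larger of the two magnitudes.
    double maxAbs  = std::max(fabs(a), fabs(b));
    return (absDiff/maxAbs) < EPSILON;
}
```

This is computationally expensive, but it is sometimes what is called for. This is what we have to do at my company, because we deal with an engineering library and inputs can vary by a few dozen orders of magnitude.
Anyway, the point is this (and it applies to practically every programming problem): evaluate what your needs are, then come up with a solution that addresses them, and do not assume the easy answer will do. If, after your evaluation, you find that the simple `fabs(a - b) < EPSILON` test is sufficient, use it.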
I ended up spending quite some time going through the material in this great thread. I doubt everyone wants to spend that much time, so I will highlight a summary of what I learned and the solution I implemented.
Quick Summary
Utility Function Implementation (C++11)

```cpp
#include <cmath>
#include <limits>

//implements relative method - do not use for comparing with zero
//use this most of the time, tolerance needs to be meaningful in your context
template<typename TReal>
static bool isApproximatelyEqual(TReal a, TReal b, TReal tolerance = std::numeric_limits<TReal>::epsilon())
{
    TReal diff = std::fabs(a - b);
    if (diff <= tolerance)
        return true;

    if (diff < std::fmax(std::fabs(a), std::fabs(b)) * tolerance)
        return true;

    return false;
}

//supply tolerance that is meaningful in your context
//for example, default tolerance may not work if you are comparing double with float
template<typename TReal>
static bool isApproximatelyZero(TReal a, TReal tolerance = std::numeric_limits<TReal>::epsilon())
{
    if (std::fabs(a) <= tolerance)
        return true;
    return false;
}

//use this when you want to be on safe side
//for example, don't start rover unless signal is above 1
template<typename TReal>
static bool isDefinitelyLessThan(TReal a, TReal b, TReal tolerance = std::numeric_limits<TReal>::epsilon())
{
    //mirror of isDefinitelyGreaterThan: the margin b - a must exceed the tolerance
    //(the version as posted tested a - b < tolerance, which is also true when a == b)
    TReal diff = b - a;
    if (diff > tolerance)
        return true;

    if (diff > std::fmax(std::fabs(a), std::fabs(b)) * tolerance)
        return true;

    return false;
}
template<typename TReal>
static bool isDefinitelyGreaterThan(TReal a, TReal b, TReal tolerance = std::numeric_limits<TReal>::epsilon())
{
    TReal diff = a - b;
    if (diff > tolerance)
        return true;

    if (diff > std::fmax(std::fabs(a), std::fabs(b)) * tolerance)
        return true;

    return false;
}

//implements ULP method
//use this when you are only concerned about floating point precision issue
//for example, if you want to see if a is 1.0 by checking if its within
//10 closest representable floating point numbers around 1.0.
template<typename TReal>
static bool isWithinPrecisionInterval(TReal a, TReal b, unsigned int interval_size = 1)
{
    TReal min_a = a - (a - std::nextafter(a, std::numeric_limits<TReal>::lowest())) * interval_size;
    TReal max_a = a + (std::nextafter(a, std::numeric_limits<TReal>::max()) - a) * interval_size;

    return min_a <= b && max_a >= b;
}
```
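As others have pointed out, using a fixed-exponent epsilon (such as 0.0000001) is useless for values far away from the epsilon value. For example, if your two values are 10000.000977 and 10000, then there are NO 32-bit floating-point values between these two numbers: 10000 and 10000.000977 are as close as you can possibly get without being bit-for-bit identical. Here, an epsilon of less than 0.0009 is meaningless; you might as well use the plain equality operator.
Likewise, as the two values approach epsilon in size, the relative error grows to 100%.
Thus, trying to mix a fixed-point number such as 0.00001 with floating-point values (where the exponent is arbitrary) is a pointless exercise. It will only ever work if you can be sure that the operand values lie within a narrow domain (that is, close to some specific exponent), and if you properly select an epsilon value for that specific test. If you pull a number out of the air ("Hey! 0.00001 is small, so that must be good!"), you are doomed to numerical errors. I have spent plenty of time debugging bad numerical code where some poor schmuck tossed in random epsilon values to make yet another test case work.
If you do numerical programming of any kind and believe you need to reach for fixed-point epsilons, READ BRUCE'S ARTICLE ON COMPARING FLOATING-POINT NUMBERS.
Comparing Floating Point Numbers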
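The spacing claim above can be checked directly with `std::nextafter`. This is a small sketch with hypothetical helper names (`gapAbove`, `epsilonIsMeaningfulAt`), assuming `float` is a 32-bit IEEE 754 type:

```cpp
#include <cassert>
#include <cmath>

// Gap between x and the next representable float above it.
float gapAbove(float x) {
    return std::nextafter(x, INFINITY) - x;
}

// A fixed epsilon that is not wider than the local gap cannot match anything
// except an identical value, so it adds nothing over operator==.
bool epsilonIsMeaningfulAt(float x, float epsilon) {
    assert(epsilon > 0.0f);
    return epsilon > gapAbove(x);
}
```

Under that assumption the gap above 10000.0f is 2^-10 = 0.0009765625, which is where the 10000.000977 in the example comes from, so an epsilon of 0.00001 is meaningful near 1.0f but not near 10000.0f.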
Qt implements two functions; maybe you can learn from them:

```cpp
static inline bool qFuzzyCompare(double p1, double p2)
{
    return (qAbs(p1 - p2) <= 0.000000000001 * qMin(qAbs(p1), qAbs(p2)));
}

static inline bool qFuzzyCompare(float p1, float p2)
{
    return (qAbs(p1 - p2) <= 0.00001f * qMin(qAbs(p1), qAbs(p2)));
}
```

You may also need the following functions, since:
Note that comparing values where either p1 or p2 is 0.0 will not work,
nor does comparing values where one of the values is NaN or infinity.
If one of the values is always 0.0, use qFuzzyIsNull instead. If one
of the values is likely to be 0.0, one solution is to add 1.0 to both
values.

```cpp
static inline bool qFuzzyIsNull(double d)
{
    return qAbs(d) <= 0.000000000001;
}

static inline bool qFuzzyIsNull(float f)
{
    return qAbs(f) <= 0.00001f;
}
```
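The "add 1.0 to both values" advice in the note can be tried without Qt. The sketch below uses a hypothetical stand-in, `fuzzyCompareSketch`, written with the standard library in the same shape as the `double` overload quoted above; it is not Qt's own code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical stand-in with the same shape as the double overload above.
bool fuzzyCompareSketch(double p1, double p2) {
    return std::fabs(p1 - p2) <= 1e-12 * std::min(std::fabs(p1), std::fabs(p2));
}

// The workaround from the note: shift both values away from zero first, so
// the scale factor on the right-hand side is about 1.0 instead of 0.0.
bool fuzzyCompareNearZero(double p1, double p2) {
    return fuzzyCompareSketch(1.0 + p1, 1.0 + p2);
}
```

Comparing 0.0 with 1e-15 fails in the first function because the right-hand side collapses to zero, and succeeds in the second. The shift effectively turns the test into an absolute one with a tolerance of about 1e-12, which is only appropriate when the values are known to be small.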
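General-purpose comparison of floating-point numbers is generally meaningless. How to compare really depends on the problem at hand. In many problems, numbers are sufficiently discretized to allow comparing them within a given tolerance. Unfortunately, there are just as many problems where that trick does not really work. For example, consider working with a Heaviside (step) function of the number in question (digital stock options come to mind) when your observations are very close to the barrier. A tolerance-based comparison would not do much good, because it would effectively shift the issue from the original barrier to two new ones. Again, there is no general-purpose solution for such problems, and a particular solution might require going as far as changing the numerical method in order to achieve stability.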
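Unfortunately, even your "wasteful" code is incorrect. EPSILON is the smallest value that can be added to 1.0 and change its value. The value 1.0 is very important: sufficiently larger numbers do not change when EPSILON is added to them. You can, however, scale this value to the numbers you are comparing to tell whether they are different or not. The correct expression for comparing two doubles is:

```cpp
#include <cfloat>
#include <cmath>

if (fabs(a - b) <= DBL_EPSILON * fmax(fabs(a), fabs(b)))
{
    // ...
}
```

This is the minimum. In general, though, you want to account for noise in your calculations and ignore a few of the least significant bits, so a more realistic comparison looks like this:

```cpp
if (fabs(a - b) <= 16 * DBL_EPSILON * fmax(fabs(a), fabs(b)))
{
    // ...
}
```

If comparison performance is very important to you and you know the range of your values, then you should use fixed-point numbers instead.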
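The role of 1.0 in the definition of epsilon can be seen in a few lines. This is a sketch with hypothetical helper names (`additionChanges`, `scaledEqual`); it assumes IEEE 754 doubles, and the `volatile` is only there to force the sum to be rounded to `double` on platforms that compute in wider registers.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// True when adding delta to x produces a different double.
bool additionChanges(double x, double delta) {
    volatile double sum = x + delta;  // force rounding to double precision
    return sum != x;
}

// The scaled comparison from the answer, with the number of tolerated
// epsilon-sized steps passed in as ulpFactor (1 and 16 in the text above).
bool scaledEqual(double a, double b, double ulpFactor) {
    assert(ulpFactor >= 1.0);
    return std::fabs(a - b) <=
           ulpFactor * DBL_EPSILON * std::fmax(std::fabs(a), std::fabs(b));
}
```

Adding `DBL_EPSILON` changes 1.0 but leaves 1000.0 untouched, which is why the unscaled constant only makes sense for values near 1.0. The scaled test still treats 1000.0 and its immediate neighbour as equal.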
Here is my class, based on previously posted answers. It is very similar to Google's code, but I use a bias that pushes all NaN values above 0xFF000000. That allows a faster check for NaN.
This code is meant to demonstrate the concept, not to be a general solution. Google's code already shows how to compute all the platform-specific values, and I did not want to duplicate all of that. I have done limited testing on this code.

```cpp
#include <cstring>

typedef unsigned int U32;
//  Float           Memory          Bias (unsigned)
//  -----           ------          ---------------
//   NaN            0xFFFFFFFF      0xFF800001
//   NaN            0xFF800001      0xFFFFFFFF
//  -Infinity       0xFF800000      0x00000000 ---
//  -3.40282e+038   0xFF7FFFFF      0x00000001    |
//  -1.40130e-045   0x80000001      0x7F7FFFFF    |
//  -0.0            0x80000000      0x7F800000    |--- Valid <= 0xFF000000.
//   0.0            0x00000000      0x7F800000    |    NaN > 0xFF000000
//   1.40130e-045   0x00000001      0x7F800001    |
//   3.40282e+038   0x7F7FFFFF      0xFEFFFFFF    |
//   Infinity       0x7F800000      0xFF000000 ---
//   NaN            0x7F800001      0xFF000001
//   NaN            0x7FFFFFFF      0xFF7FFFFF
//
//   Either value of NaN returns false.
//   -Infinity and +Infinity are not "close".
//   -0 and +0 are equal.
//
class CompareFloat{
public:
    static bool IsClose( float A, float B, U32 unitsDelta = 4 )
    {
        U32 a = GetBiased( A );
        U32 b = GetBiased( B );

        if ( (a > 0xFF000000) || (b > 0xFF000000) )
        {
            return( false );
        }
        // Distance between the biased values, computed without going negative
        // (abs() on an unsigned difference does not give the distance).
        U32 diff = (a > b) ? (a - b) : (b - a);
        return( diff < unitsDelta );
    }

    protected:
    static U32 GetBiased( float f )
    {
        // Copy the bits out of the float (assumes a 32-bit float and U32).
        U32 r;
        std::memcpy( &r, &f, sizeof r );

        if ( r & 0x80000000 )
        {
            return( ~r - 0x007FFFFF );
        }
        return( r + 0x7F800000 );
    }
};
```
Here is proof that using `std::numeric_limits<double>::epsilon()` as a fixed tolerance is not the answer.
Proof of my comment above:

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>
#include <stdio.h>

double ItoD (std::int64_t x) {
    // Return double from 64-bit hexadecimal representation.
    double d;
    std::memcpy(&d, &x, sizeof d);
    return d;
}

void test (std::int64_t ai, std::int64_t bi) {
    double a = ItoD(ai), b = ItoD(bi);
    bool close = std::fabs(a-b) < std::numeric_limits<double>::epsilon();
    printf ("%.16f and %.16f %s close.\n", a, b, close ? "are" : "are not");
}

int main()
{
    test (0x3fe0000000000000LL, 0x3fe0000000000001LL);

    test (0x3ff0000000000000LL, 0x3ff0000000000001LL);
}
```

Running it produces this output:

```
0.5000000000000000 and 0.5000000000000001 are close.
1.0000000000000000 and 1.0000000000000002 are not close.
```

Note that in the second case (one and the value just larger than one), the two input values are as close as they can possibly be, and they still compare as not close. Thus, for values greater than 1.0, you might as well just use an equality test. Fixed epsilons will not save you when comparing floating-point values.
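It depends on how precise you want the comparison to be. If you want to compare for exactly the same number, then just use ==. (You almost never want to do this unless you really do want the exact same number.) On any decent platform you can also do the following:

```cpp
double diff = a - b;
return fabs(diff) < EPSILON;
```

Integer tricks for comparing doubles and floats are nice, but they tend to make it harder for the various CPU pipelines to process the code efficiently. On certain in-order architectures these days, they are definitely not faster, because the stack is used as a temporary storage area for frequently used values. (Load-hit-store, for those who care.)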
In terms of the scale of quantities:
If, in some physical sense, epsilon is a small fraction of the magnitude of the quantity, then I think the following is quite reasonable:

```cpp
#include <limits>
#include <type_traits>

#include <iomanip>
#include <iostream>

#include <cmath>
#include <cstdlib>
#include <cassert>

template< typename A, typename B >
inline
bool close_enough(A const & a, B const & b,
                  typename std::common_type< A, B >::type const & epsilon)
{
    using std::isless;
    assert(isless(0, epsilon)); // epsilon is a part of the whole quantity
    assert(isless(epsilon, 1));
    using std::abs;
    auto const delta = abs(a - b);
    auto const x = abs(a);
    auto const y = abs(b);
    // comparable generally and |a - b| < eps * (|a| + |b|) / 2
    return isless(epsilon * y, x) && isless(epsilon * x, y) && isless((delta + delta) / (x + y), epsilon);
}

int main()
{
    std::cout << std::boolalpha << close_enough(0.9, 1.0, 0.1) << std::endl;
    std::cout << std::boolalpha << close_enough(1.0, 1.1, 0.1) << std::endl;
    std::cout << std::boolalpha << close_enough(1.1,    1.2,    0.01) << std::endl;
    std::cout << std::boolalpha << close_enough(1.0001, 1.0002, 0.01) << std::endl;
    std::cout << std::boolalpha << close_enough(1.0, 0.01, 0.1) << std::endl;
    return EXIT_SUCCESS;
}
```
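I would be very wary of any answer that involves floating-point subtraction (for example, `fabs(a-b)`). Although it is not portable, I think grom's answer does the best job of avoiding these issues.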
I use this code:

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

bool AlmostEqual(double v1, double v2)
{
    return (std::fabs(v1 - v2) < std::fabs(std::min(v1, v2)) * std::numeric_limits<double>::epsilon());
}
```
I wrote this for Java, but maybe you will find it useful. It uses longs instead of doubles, but it takes care of NaNs, subnormals, etc.

```java
public static boolean equal(double a, double b) {
    final long fm = 0xFFFFFFFFFFFFFL;       // fraction mask
    final long sm = 0x8000000000000000L;    // sign mask
    final long cm = 0x8000000000000L;       // most significant decimal bit mask
    long c = Double.doubleToLongBits(a), d = Double.doubleToLongBits(b);
    int ea = (int) (c >> 52 & 2047), eb = (int) (d >> 52 & 2047);
    if (ea == 2047 && (c & fm) != 0 || eb == 2047 && (d & fm) != 0) return false;   // NaN
    if (c == d) return true;                            // identical - fast check
    if (ea == 0 && eb == 0) return true;                // ±0 or subnormals
    if ((c & sm) != (d & sm)) return false;             // different signs
    if (Math.abs(ea - eb) > 1) return false;            // b > 2*a or a > 2*b
    d <<= 12; c <<= 12;
    if (ea < eb) c = c >> 1 | sm;
    else if (ea > eb) d = d >> 1 | sm;
    c -= d;
    return c < 65536 && c > -65536;     // don't use abs(), because:
    // There is a possibility c=0x8000000000000000 which cannot be converted to positive
}
public static boolean zero(double a) { return (Double.doubleToLongBits(a) >> 52 & 2047) < 3; }
```

Keep in mind that after a number of floating-point operations, a value can be very different from what we expect. There is no code that fixes that.
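A classic small case of this drift, written here in C++ rather than Java to match the rest of the thread, is adding 0.1 ten times. The helper name `repeatedSum` is hypothetical, and the behaviour described assumes ordinary IEEE 754 double arithmetic without extended-precision intermediates.

```cpp
#include <cassert>
#include <cmath>

// Adds `term` to a running total `count` times, rounding after every step.
double repeatedSum(double term, int count) {
    assert(count >= 0);
    double total = 0.0;
    for (int i = 0; i < count; ++i) {
        total += term;
    }
    return total;
}
```

Under that assumption `repeatedSum(0.1, 10)` is not equal to 1.0, although it is within a few units in the last place of it. A tolerance-based comparison can absorb this kind of error, but it cannot tell you how much error a longer computation has accumulated.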
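In numerical software there are in fact cases where you want to check whether two floating-point numbers are exactly equal. I posted this on a similar question:
https://stackoverflow.com/a/10973098/1447411
So you cannot say that "CompareDoubles1" is wrong in general.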
```cpp
/// testing whether two doubles are almost equal. We consider two doubles
/// equal if the difference is within the range [0, epsilon).
///
/// epsilon: a positive number (supposed to be small)
///
/// if either x or y is 0, then we are comparing the absolute difference to
/// epsilon.
/// if both x and y are non-zero, then we are comparing the relative difference
/// to epsilon.
bool almost_equal(double x, double y, double epsilon)
{
    double diff = x - y;
    if (x != 0 && y != 0){
        diff = diff/y;
    }

    if (diff < epsilon && -1.0*diff < epsilon){
        return true;
    }
    return false;
}
```

I used this function in my small project and it works, but note the following:
Double-precision error can surprise you. Say epsilon = 1.0e-6; then, according to the code above, 1.0 and 1.000001 should NOT be considered equal, but on my machine the function considers them equal. This is because 1.000001 cannot be represented exactly in binary; it is probably stored as 1.0000009xxx. I tested it with 1.0 and 1.0000011, and this time I got the expected result.
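My way may not be correct, but it is useful:
Convert both floats to strings and then do a string comparison.

```cpp
#include <stdio.h>
#include <tchar.h>   // Windows-specific: TCHAR, _T, _stprintf, _tcscmp

bool IsFlaotEqual(float a, float b, int decimal)
{
    // Build a format string such as "%.3f" for the requested number of decimals.
    TCHAR form[50] = _T("");
    _stprintf(form, _T("%%.%df"), decimal);

    TCHAR a1[30] = _T(""), a2[30] = _T("");
    _stprintf(a1, form, a);
    _stprintf(a2, form, b);

    if( _tcscmp(a1, a2) == 0 )
        return true;

    return false;
}
```

Operator overloading can also be done.
You cannot compare two `double` values with a fixed epsilon.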
A better comparison for doubles would be:

```cpp
#include <cmath>
#include <limits>

bool same(double a, double b)
{
  return std::nextafter(a, std::numeric_limits<double>::lowest()) <= b
    && std::nextafter(a, std::numeric_limits<double>::max()) >= b;
}
```

Or, in a more generic way:

```cpp
#include <cmath>
#include <limits>

template <typename T>
bool compareNumber(const T& a, const T& b) {
    return std::abs(a - b) < std::numeric_limits<T>::epsilon();
}
```
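Why not perform a bitwise XOR? Two floating-point numbers are equal if their corresponding bits are equal. I think the decision to place the exponent bits before the mantissa was made to speed up the comparison of two floats. I also think many answers here are missing the point of epsilon comparison. The epsilon value depends only on the precision to which the floating-point numbers are compared. For example, after doing some arithmetic with floats you get two numbers: 2.56429435543442 and 2.5642943554345. They are not equal, but for the solution only 3 decimal digits matter, so they are equal: 2.564 and 2.564. In this case you choose epsilon equal to 0.001. Epsilon comparison is also possible with a bitwise XOR. Correct me if I am wrong.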
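For reference, a bitwise XOR of two doubles can be sketched as below. The helper names are hypothetical and a 64-bit IEEE 754 `double` is assumed. Note that "identical bits" is not quite the same relation as `==`: +0.0 and -0.0 compare equal but differ in the sign bit, and a NaN has the same bits as itself but never compares equal.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// XOR of the two bit patterns; the result is zero only for identical bits.
std::uint64_t bitDifference(double a, double b) {
    static_assert(sizeof(double) == sizeof(std::uint64_t), "assumes 64-bit double");
    std::uint64_t ua, ub;
    std::memcpy(&ua, &a, sizeof ua);  // copy the bits, no pointer casts
    std::memcpy(&ub, &b, sizeof ub);
    return ua ^ ub;
}

// Exact bit-for-bit identity, which is stricter than operator== for zeros.
bool bitwiseIdentical(double a, double b) {
    return bitDifference(a, b) == 0;
}
```

The XOR result says which bits differ, not how far apart the values are: two adjacent doubles on either side of a power of two differ in many bits at once. That makes it less suitable for tolerance checks than the integer-difference (ULP) approach shown in earlier answers.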