Is my Double-Checked Locking Pattern implementation right?
An example from Meyers's book "Effective Modern C++", Item 16:
In a class caching an expensive-to-compute int, you might try to use a
pair of std::atomic variables instead of a mutex:
```cpp
#include <atomic>

// expensiveComputation1/2 are assumed to be declared elsewhere.
class Widget {
public:
    int magicValue() const {
        if (cacheValid) {
            return cachedValue;
        } else {
            auto val1 = expensiveComputation1();
            auto val2 = expensiveComputation2();
            cachedValue = val1 + val2;
            cacheValid = true;
            return cachedValue;
        }
    }
private:
    mutable std::atomic<bool> cacheValid { false };
    mutable std::atomic<int> cachedValue;
};
```
This will work, but sometimes it will work a lot harder than it
should. Consider: A thread calls Widget::magicValue, sees cacheValid as
false, performs the two expensive computations, and assigns their sum
to cachedValue. At that point, a second thread calls
Widget::magicValue, also sees cacheValid as false, and thus carries
out the same expensive computations that the first thread has just
finished.
He then gives a solution that uses a mutex:
```cpp
#include <mutex>

class Widget {
public:
    int magicValue() const {
        std::lock_guard<std::mutex> guard(m);  // every call takes the lock
        if (cacheValid) {
            return cachedValue;
        } else {
            auto val1 = expensiveComputation1();
            auto val2 = expensiveComputation2();
            cachedValue = val1 + val2;
            cacheValid = true;
            return cachedValue;
        }
    }
private:
    mutable std::mutex m;
    mutable bool cacheValid { false };
    mutable int cachedValue;
};
```
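But I don't think this solution is very efficient, so I considered combining the mutex and the atomics into a double-checked locking pattern, as shown below.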
```cpp
#include <atomic>
#include <mutex>

class Widget {
public:
    int magicValue() const {
        if (!cacheValid) {                          // first check, without the lock
            std::lock_guard<std::mutex> guard(m);
            if (!cacheValid) {                      // second check, under the lock
                auto val1 = expensiveComputation1();
                auto val2 = expensiveComputation2();
                cachedValue = val1 + val2;
                cacheValid = true;
            }
        }
        return cachedValue;
    }
private:
    mutable std::mutex m;
    mutable std::atomic<bool> cacheValid { false };
    mutable std::atomic<int> cachedValue;
};
```
Since I'm new to multithreaded programming, I'd like to know:
- Is my code right?
- Does it perform better?
Edit:
Fixed the code. As HappyCactus pointed out, the second check was wrong: if (!cachedValue) -> if (!cacheValid)
Is my code right?
Yes. Your application of the double-checked locking pattern is correct. But see below for some improvements.
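One way to check this empirically is to count how often the cache is filled when many threads call magicValue() at once. The following is only a minimal sketch: computeCount, expensiveComputation (a stand-in returning a fixed value) and countCacheFills are hypothetical names added for illustration, not part of the question's code.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical counter, used only to observe how often the work runs.
static std::atomic<int> computeCount{0};

// Hypothetical stand-in for the expensive computations.
static int expensiveComputation() {
    computeCount.fetch_add(1, std::memory_order_relaxed);
    return 21;
}

// Same double-checked locking structure as in the question.
class Widget {
public:
    int magicValue() const {
        if (!cacheValid) {                          // first check, no lock
            std::lock_guard<std::mutex> guard(m);
            if (!cacheValid) {                      // second check, under the lock
                cachedValue = expensiveComputation() + expensiveComputation();
                cacheValid = true;
            }
        }
        return cachedValue;
    }
private:
    mutable std::mutex m;
    mutable std::atomic<bool> cacheValid{false};
    mutable std::atomic<int> cachedValue{0};
};

// Calls magicValue() from threadCount threads and returns how many times
// the cache was filled, or -1 if any thread saw a wrong value.
int countCacheFills(int threadCount) {
    computeCount = 0;
    Widget w;
    std::vector<int> results(threadCount, 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < threadCount; ++i) {
        threads.emplace_back([&w, &results, i] { results[i] = w.magicValue(); });
    }
    for (auto& t : threads) {
        t.join();
    }
    for (int r : results) {
        if (r != 42) return -1;
    }
    return computeCount.load() / 2;  // two computations per fill
}
```

With the second check in place, the mutex serializes the threads that reach the slow path, so only the first of them fills the cache, however many threads race on the first check.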
Does it perform better?
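Compared with the fully locked variant (the second one in your post), it mostly performs better, until ...
Compared with the lock-free variant (the first one in your post), your code performs better unless computing the value is faster than waiting on the mutex.
For example, summing 10 values is (usually) faster than waiting on a mutex. In that case, the first variant is preferable. On the other hand, reading from a file 10 times is slower than waiting on a mutex, so your variant is better than the first.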
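To compare the variants on a particular machine, a rough timing harness along these lines may help. It is a sketch under assumptions: compute() is a hypothetical stand-in for the expensive work, LockedWidget and DclWidget are illustrative names for the second variant and the double-checked variant, and the measured times depend heavily on hardware, thread count and compiler flags.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the expensive work.
static int compute() { return 42; }

// Fully locked variant: every call takes the mutex.
class LockedWidget {
public:
    int magicValue() const {
        std::lock_guard<std::mutex> guard(m);
        if (!cacheValid) { cachedValue = compute(); cacheValid = true; }
        return cachedValue;
    }
private:
    mutable std::mutex m;
    mutable bool cacheValid{false};
    mutable int cachedValue{0};
};

// Double-checked variant: the mutex is taken only while the cache is empty.
class DclWidget {
public:
    int magicValue() const {
        if (!cacheValid) {
            std::lock_guard<std::mutex> guard(m);
            if (!cacheValid) { cachedValue = compute(); cacheValid = true; }
        }
        return cachedValue;
    }
private:
    mutable std::mutex m;
    mutable std::atomic<bool> cacheValid{false};
    mutable std::atomic<int> cachedValue{0};
};

// Runs callsPerThread calls on each of threadCount threads and returns the
// elapsed wall-clock time in microseconds, or -1 if a wrong value was seen.
template <class W>
long long timeCallsMicros(int threadCount, int callsPerThread) {
    W w;
    std::atomic<long long> sum{0};
    auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> threads;
    for (int i = 0; i < threadCount; ++i) {
        threads.emplace_back([&w, &sum, callsPerThread] {
            long long local = 0;
            for (int c = 0; c < callsPerThread; ++c) local += w.magicValue();
            sum.fetch_add(local, std::memory_order_relaxed);
        });
    }
    for (auto& t : threads) t.join();
    auto stop = std::chrono::steady_clock::now();
    if (sum.load() != 42LL * threadCount * callsPerThread) return -1;
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling timeCallsMicros<LockedWidget>(4, 1000000) and timeCallsMicros<DclWidget>(4, 1000000) from main and printing both results gives a first impression; a single run is noisy, so repeated runs are advisable before drawing conclusions.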
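In fact, a few simple improvements to your code can make it faster (at least on some machines) and easier to understand.
As the comments in the code below indicate, cachedValue does not need to be atomic. Moreover, as noted in the answer at https://stackoverflow.com/a/30049946/3440745, accesses to the flag can use weaker memory orderings: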
```cpp
#include <atomic>
#include <mutex>

class Widget {
public:
    int magicValue() const {
        // Acquire semantics when reading the flag.
        if (!cacheValid.load(std::memory_order_acquire)) {
            std::lock_guard<std::mutex> guard(m);
            // Reading the flag while holding the mutex doesn't require any memory ordering.
            if (!cacheValid.load(std::memory_order_relaxed)) {
                auto val1 = expensiveComputation1();
                auto val2 = expensiveComputation2();
                cachedValue = val1 + val2;
                // Release semantics when writing the flag.
                cacheValid.store(true, std::memory_order_release);
            }
        }
        return cachedValue;
    }
private:
    mutable std::mutex m;
    mutable std::atomic<bool> cacheValid { false };
    mutable int cachedValue; // Atomic isn't needed here.
};
```
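You can make your solution slightly more efficient by weakening the memory ordering requirements. The default sequentially consistent memory order for atomic operations is not needed here.
The performance difference is probably negligible on x86 but noticeable on ARM, because sequentially consistent memory ordering is expensive on ARM. For details, see Herb Sutter's "'Strong' and 'Weak' Hardware Memory Models".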
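The release/acquire pairing can be illustrated in isolation with a small publish-and-read sketch. The names payload, ready and publishAndRead are hypothetical and chosen only for this illustration; the point is that a plain write made before a release store is visible to a thread whose acquire load observes that store.

```cpp
#include <atomic>
#include <thread>

// Publishes a plain int through an atomic flag and returns what the
// consumer thread read.
int publishAndRead() {
    int payload = 0;                  // ordinary, non-atomic data
    std::atomic<bool> ready{false};   // flag guarding the data
    int seen = 0;

    std::thread producer([&] {
        payload = 42;                                  // plain write
        ready.store(true, std::memory_order_release);  // publish it
    });

    std::thread consumer([&] {
        // Spin until the release store becomes visible.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;  // ordered after the producer's write to payload
    });

    producer.join();
    consumer.join();
    return seen;
}
```

Because there is exactly one writer here, the acquire load that returns true guarantees the consumer reads 42. Sequential consistency would add a single total order over all such atomic operations, which this pattern does not need.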
Suggested changes:
```cpp
#include <atomic>

class Widget {
public:
    int magicValue() const {
        if (cacheValid.load(std::memory_order_acquire)) { // Acquire semantics.
            return cachedValue;
        } else {
            auto val1 = expensiveComputation1();
            auto val2 = expensiveComputation2();
            // Non-atomic write. Caution: with no mutex, two threads that both
            // saw the flag as false can reach this line at once, which is a
            // data race on a plain int.
            cachedValue = val1 + val2;
            // Release semantics.
            // Prevents compiler and CPU store reordering.
            // Makes this and preceding stores by this thread visible to other threads.
            cacheValid.store(true, std::memory_order_release);
            return cachedValue;
        }
    }
private:
    mutable std::atomic<bool> cacheValid { false };
    mutable int cachedValue; // Non-atomic.
};
```
Incorrect:
```cpp
int magicValue() const {
    if (!cacheValid) {
        // this part is unprotected, what if a second thread evaluates
        // the previous test when this first is here? it behaves
        // exactly like in the first example.
        std::lock_guard<std::mutex> guard(m);
        if (!cachedValue) {  // tests the value rather than the cacheValid flag
            auto val1 = expensiveComputation1();
            auto val2 = expensiveComputation2();
            cachedValue = val1 + val2;
            cacheValid = true;
        }
    }
    return cachedValue;
}
```