名稱空間
變體
操作

std::hardware_destructive_interference_size, std::hardware_constructive_interference_size

來自 cppreference.com
< cpp‎ | thread
 
 
併發支援庫
執行緒
(C++11)
(C++20)
hardware_destructive_interference_sizehardware_constructive_interference_size
(C++17)(C++17)
this_thread 名稱空間
(C++11)
(C++11)
(C++11)
協同取消
互斥
(C++11)
通用鎖管理
(C++11)
(C++11)
(C++11)
(C++11)
(C++11)
條件變數
(C++11)
訊號量
門閂和屏障
(C++20)
(C++20)
期值
(C++11)
(C++11)
(C++11)
(C++11)
安全回收
(C++26)
危險指標
原子型別
(C++11)
(C++20)
原子型別的初始化
(C++11)(C++20 中已棄用)
(C++11)(C++20 中已棄用)
記憶體排序
(C++11)(C++26 中已棄用)
原子操作的自由函式
原子標誌的自由函式
 
定義於標頭檔案 <new>
inline constexpr std::size_t
    hardware_destructive_interference_size = /*實現定義*/;
(1) (C++17 起)
inline constexpr std::size_t
    hardware_constructive_interference_size = /*實現定義*/;
(2) (C++17 起)
1) 為避免偽共享,兩個物件之間的最小偏移量。保證至少為 alignof(std::max_align_t)
struct keep_apart
{
    alignas(std::hardware_destructive_interference_size) std::atomic<int> cat;
    alignas(std::hardware_destructive_interference_size) std::atomic<int> dog;
};
2) 為促進真共享,連續記憶體的最大大小。保證至少為 alignof(std::max_align_t)
struct together
{
    std::atomic<int> dog;
    int puppy;
};
 
struct kennel
{
    // Other data members...
 
    alignas(sizeof(together)) together pack;
 
    // Other data members...
};
 
static_assert(sizeof(together) <= std::hardware_constructive_interference_size);

[編輯] 注意

這些常量提供了一種可移植的方式來訪問 L1 資料快取行大小。

特性測試 標準 特性
__cpp_lib_hardware_interference_size 201703L (C++17) constexpr std::hardware_constructive_interference_size

constexpr std::hardware_destructive_interference_size

[編輯] 示例

該程式使用兩個執行緒原子地寫入給定全域性物件的資料成員。第一個物件適合一個快取行,這會導致“硬體干擾”。第二個物件將其資料成員儲存在單獨的快取行上,因此避免了執行緒寫入後可能出現的“快取同步”。

#include <atomic>
#include <chrono>
#include <cstddef>
#include <iomanip>
#include <iostream>
#include <mutex>
#include <new>
#include <thread>
 
#ifdef __cpp_lib_hardware_interference_size
    using std::hardware_constructive_interference_size;
    using std::hardware_destructive_interference_size;
#else
    // 64 bytes on x86-64 │ L1_CACHE_BYTES │ L1_CACHE_SHIFT │ __cacheline_aligned │ ...
    constexpr std::size_t hardware_constructive_interference_size = 64;
    constexpr std::size_t hardware_destructive_interference_size = 64;
#endif
 
std::mutex cout_mutex;
 
constexpr int max_write_iterations{10'000'000}; // the benchmark time tuning
 
struct alignas(hardware_constructive_interference_size)
OneCacheLiner // occupies one cache line
{
    std::atomic_uint64_t x{};
    std::atomic_uint64_t y{};
}
oneCacheLiner;
 
struct TwoCacheLiner // occupies two cache lines
{
    alignas(hardware_destructive_interference_size) std::atomic_uint64_t x{};
    alignas(hardware_destructive_interference_size) std::atomic_uint64_t y{};
}
twoCacheLiner;
 
inline auto now() noexcept { return std::chrono::high_resolution_clock::now(); }
 
template<bool xy>
void oneCacheLinerThread()
{
    const auto start{now()};
 
    for (uint64_t count{}; count != max_write_iterations; ++count)
        if constexpr (xy)
            oneCacheLiner.x.fetch_add(1, std::memory_order_relaxed);
        else
            oneCacheLiner.y.fetch_add(1, std::memory_order_relaxed);
 
    const std::chrono::duration<double, std::milli> elapsed{now() - start};
    std::lock_guard lk{cout_mutex};
    std::cout << "oneCacheLinerThread() spent " << elapsed.count() << " ms\n";
    if constexpr (xy)
        oneCacheLiner.x = elapsed.count();
    else
        oneCacheLiner.y = elapsed.count();
}
 
template<bool xy>
void twoCacheLinerThread()
{
    const auto start{now()};
 
    for (uint64_t count{}; count != max_write_iterations; ++count)
        if constexpr (xy)
            twoCacheLiner.x.fetch_add(1, std::memory_order_relaxed);
        else
            twoCacheLiner.y.fetch_add(1, std::memory_order_relaxed);
 
    const std::chrono::duration<double, std::milli> elapsed{now() - start};
    std::lock_guard lk{cout_mutex};
    std::cout << "twoCacheLinerThread() spent " << elapsed.count() << " ms\n";
    if constexpr (xy)
        twoCacheLiner.x = elapsed.count();
    else
        twoCacheLiner.y = elapsed.count();
}
 
int main()
{
    std::cout << "__cpp_lib_hardware_interference_size "
#   ifdef __cpp_lib_hardware_interference_size
        "= " << __cpp_lib_hardware_interference_size << '\n';
#   else
        "is not defined, use " << hardware_destructive_interference_size
                               << " as fallback\n";
#   endif
 
    std::cout << "hardware_destructive_interference_size == "
              << hardware_destructive_interference_size << '\n'
              << "hardware_constructive_interference_size == "
              << hardware_constructive_interference_size << "\n\n"
              << std::fixed << std::setprecision(2)
              << "sizeof( OneCacheLiner ) == " << sizeof(OneCacheLiner) << '\n'
              << "sizeof( TwoCacheLiner ) == " << sizeof(TwoCacheLiner) << "\n\n";
 
    constexpr int max_runs{4};
 
    int oneCacheLiner_average{0};
    for (auto i{0}; i != max_runs; ++i)
    {
        std::thread th1{oneCacheLinerThread<0>};
        std::thread th2{oneCacheLinerThread<1>};
        th1.join();
        th2.join();
        oneCacheLiner_average += oneCacheLiner.x + oneCacheLiner.y;
    }
    std::cout << "Average T1 time: "
              << (oneCacheLiner_average / max_runs / 2) << " ms\n\n";
 
    int twoCacheLiner_average{0};
    for (auto i{0}; i != max_runs; ++i)
    {
        std::thread th1{twoCacheLinerThread<0>};
        std::thread th2{twoCacheLinerThread<1>};
        th1.join();
        th2.join();
        twoCacheLiner_average += twoCacheLiner.x + twoCacheLiner.y;
    }
    std::cout << "Average T2 time: "
              << (twoCacheLiner_average / max_runs / 2) << " ms\n\n"
              << "Ratio T1/T2:~ "
              << 1.0 * oneCacheLiner_average / twoCacheLiner_average << '\n';
}

可能的輸出

__cpp_lib_hardware_interference_size = 201703
hardware_destructive_interference_size == 64
hardware_constructive_interference_size == 64
 
sizeof( OneCacheLiner ) == 64
sizeof( TwoCacheLiner ) == 128
 
oneCacheLinerThread() spent 517.83 ms
oneCacheLinerThread() spent 533.43 ms
oneCacheLinerThread() spent 527.36 ms
oneCacheLinerThread() spent 555.69 ms
oneCacheLinerThread() spent 574.74 ms
oneCacheLinerThread() spent 591.66 ms
oneCacheLinerThread() spent 555.63 ms
oneCacheLinerThread() spent 555.76 ms
Average T1 time: 550 ms
 
twoCacheLinerThread() spent 89.79 ms
twoCacheLinerThread() spent 89.94 ms
twoCacheLinerThread() spent 89.46 ms
twoCacheLinerThread() spent 90.28 ms
twoCacheLinerThread() spent 89.73 ms
twoCacheLinerThread() spent 91.11 ms
twoCacheLinerThread() spent 89.17 ms
twoCacheLinerThread() spent 90.09 ms
Average T2 time: 89 ms
 
Ratio T1/T2:~ 6.16

[編輯] 參閱

返回實現支援的併發執行緒數
(std::thread 的公共靜態成員函式) [編輯]
返回實現支援的併發執行緒數
(std::jthread 的公共靜態成員函式) [編輯]