Measuring memory-level parallelism (MLP)


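I need to measure memory-level parallelism (MLP), that is, the number of memory requests held simultaneously in the miss status handling registers (MSHRs) or fill buffers at each cache level, while a C/C++ program executes.

I found a link that explains how to use different performance counter events to measure the L1 and L2 MSHRs.

I am looking for C/C++ code.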

performance caching x86 performancecounter
1 Answer

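First, we need to understand the challenges and the potential approaches:

Challenges

  • Limited C/C++ access: C/C++ provides no direct access to the hardware performance counters (HPCs) for cache-related events such as MSHR occupancy.
  • Architecture dependence: HPC events and their interpretation differ widely across processor architectures (e.g. Intel, AMD, ARM).

Potential approaches:

  • Platform-specific assembly or inline assembly (if applicable)
  • OS-specific APIs (if available)
  • Compiler-specific intrinsics (of limited use)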
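
Of the approaches listed above, the OS-specific API is usually the most practical on Linux. The following is a minimal sketch, assuming Linux with the perf_event_open system call available; the helper names (encode_x86_raw, open_raw_counter, read_counter) are hypothetical and not part of any library, and the bit layout used for the raw event applies to x86 only.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Pack an x86 raw event into perf's `config` layout:
// event select in bits 0-7, unit mask in bits 8-15, counter mask in bits 24-31.
inline std::uint64_t encode_x86_raw(std::uint8_t event, std::uint8_t umask,
                                    std::uint8_t cmask = 0) {
  return std::uint64_t{event} | (std::uint64_t{umask} << 8) |
         (std::uint64_t{cmask} << 24);
}

// Open a user-space-only counter for the calling thread on any CPU.
// Returns a file descriptor, or -1 (with errno set) if the kernel refuses,
// e.g. because of the perf_event_paranoid setting or a missing PMU in a VM.
inline int open_raw_counter(std::uint64_t config) {
  perf_event_attr attr;
  std::memset(&attr, 0, sizeof(attr));
  attr.type = PERF_TYPE_RAW;
  attr.size = sizeof(attr);
  attr.config = config;
  attr.disabled = 1;        // start stopped; enable around the region of interest
  attr.exclude_kernel = 1;  // count user-space activity only
  attr.exclude_hv = 1;
  return static_cast<int>(syscall(SYS_perf_event_open, &attr, 0 /* this thread */,
                                  -1 /* any CPU */, -1 /* no group */, 0UL));
}

// Read the current 64-bit count; returns false if the read fails.
inline bool read_counter(int fd, std::uint64_t &value) {
  assert(fd >= 0 && "read_counter needs a descriptor from open_raw_counter");
  return read(fd, &value, sizeof(value)) == static_cast<ssize_t>(sizeof(value));
}
```

A counter opened this way stays stopped until it is enabled with ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) and can be stopped again with PERF_EVENT_IOC_DISABLE (both from <sys/ioctl.h> and <linux/perf_event.h>). Which event and unit-mask values to pass to encode_x86_raw depends on the microarchitecture; for example, 0x48 with unit mask 0x01 is assumed here to select L1D_PEND_MISS.PENDING on several Intel cores, which should be checked against `perf list` or the vendor's event tables before relying on it.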

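Alternative approaches to estimating MLP

Cache line size: Dividing the program's data footprint by the target architecture's cache line size gives the number of cache lines the data spans. This is only a rough proxy for how many misses could be outstanding; it does not measure MSHR occupancy directly.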
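
The footprint calculation above can be sketched as follows. The helper cache_lines_spanned is hypothetical, the 64-byte default is an assumption that matches common x86 parts but should be checked for the actual target, and the data is assumed to start on a cache line boundary.

```cpp
#include <cassert>
#include <cstddef>

// Number of cache lines needed to hold `footprint_bytes` of data, rounded up.
// Hypothetical helper; assumes the data begins on a cache line boundary.
inline std::size_t cache_lines_spanned(std::size_t footprint_bytes,
                                       std::size_t line_bytes = 64) {
  assert(line_bytes > 0 && "cache line size must be non-zero");
  return footprint_bytes / line_bytes +
         (footprint_bytes % line_bytes != 0 ? 1 : 0);
}
```

For an array of 1024 four-byte integers, cache_lines_spanned(1024 * 4) yields 64 lines. That number says how many distinct lines can miss, not how many of those misses are in flight at the same moment.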

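Data access patterns: Analyzing the program's memory access pattern can give insight into the potential MLP. If accesses exhibit good locality, fewer misses are outstanding at once, so MLP is likely to be lower.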
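
How an access pattern limits MLP can be illustrated by contrasting dependent and independent loads. This is a sketch rather than a measurement: the function names are hypothetical, and the comments describe the expected behavior on a typical out-of-order core, which would need to be confirmed with performance counters.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Build a random permutation that forms a single cycle (Sattolo's algorithm),
// so that following next[i] from slot 0 visits every slot exactly once.
inline std::vector<std::size_t> make_single_cycle(std::size_t n,
                                                  unsigned seed = 42) {
  assert(n > 0);
  std::vector<std::size_t> next(n);
  std::iota(next.begin(), next.end(), std::size_t{0});
  std::mt19937 rng(seed);
  for (std::size_t i = n - 1; i > 0; --i) {
    std::uniform_int_distribution<std::size_t> pick(0, i - 1);
    std::swap(next[i], next[pick(rng)]);
  }
  return next;
}

// Dependent loads: each address comes from the previous load, so the misses
// in this chain cannot overlap with one another.
inline std::size_t chase(const std::vector<std::size_t> &next,
                         std::size_t steps) {
  std::size_t i = 0;
  for (std::size_t s = 0; s < steps; ++s) i = next[i];
  return i;
}

// Independent loads: every address is known up front, so the core is free
// to keep several misses in flight at once.
inline std::size_t sum_all(const std::vector<std::size_t> &v) {
  return std::accumulate(v.begin(), v.end(), std::size_t{0});
}
```

Running chase and sum_all over a buffer much larger than the last-level cache, each bracketed by counter reads, gives two reference points: the pointer chase should sit near the low end of the MLP range and the independent traversal higher up.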

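Here is a template (assuming Linux and its perf events interface) that illustrates the concept. Keep in mind its limitations and that it will likely need adapting to your specific environment: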

#include <iostream>
#include <vector>
#include <string>
#include <unistd.h>  // For Linux-specific functions (replace with OS-specific headers if needed)

// Linux header that defines struct perf_event_attr and the PERF_* constants
// (replace with vendor-specific headers on other operating systems)
#include <linux/perf_event.h>

// Function to initialize performance counter attributes (replace with actual initialization)
void init_perf_event_attr(perf_event_attr *attr, const std::string &event_name) {
  // ... (Replace with platform-specific initialization logic based on event_name)
  // This should set the appropriate event code and other attributes for the desired event
  std::cerr << "Warning: Performance counter initialization not implemented for this platform.\n";
}

// Function to create performance counter file descriptors (replace with actual creation)
int create_perf_event_fd(const perf_event_attr *attr) {
  // ... (Replace with platform-specific logic to create a file descriptor for the event)
  // This should use the perf_event_open or equivalent function for your OS
  std::cerr << "Warning: Performance counter file descriptor creation not implemented for this platform.\n";
  return -1;
}

// Function to read performance counter values (replace with actual reading)
long long read_perf_counter(int fd) {
  // ... (Replace with platform-specific logic to read the counter value from the file descriptor)
  // This should use read or equivalent function for your OS
  std::cerr << "Warning: Performance counter value reading not implemented for this platform.\n";
  return -1;
}

// Function to measure cache events using performance counters (template)
int measure_cache_events(const std::vector<std::string> &event_names) {
  std::vector<perf_event_attr> attrs(event_names.size());
  std::vector<int> fds(event_names.size());

  // Initialize performance counter attributes
  for (size_t i = 0; i < event_names.size(); ++i) {
    init_perf_event_attr(&attrs[i], event_names[i]);
  }

  // Create performance counter file descriptors
  for (size_t i = 0; i < event_names.size(); ++i) {
    fds[i] = create_perf_event_fd(&attrs[i]);
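    if (fds[i] < 0) {
      // Close the descriptors opened so far before bailing out
      for (size_t j = 0; j < i; ++j) {
        close(fds[j]);
      }
      return -1;
    }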
  }

  // Simulate some workload to generate cache events
  // (Replace this with your actual program execution logic)
  volatile int data[1024];  // volatile keeps the compiler from removing the stores
  for (int i = 0; i < 1024; ++i) {
    data[i] = i;
  }

  // Read and report performance counter values
  int status = 0;
  for (size_t i = 0; i < event_names.size(); ++i) {
    const long long count = read_perf_counter(fds[i]);
    if (count < 0) {
      status = -1;
      break;
    }
    std::cout << "Event " << event_names[i] << ": " << count << '\n';
  }

  // Close performance counter file descriptors, even if a read failed
  for (int fd : fds) {
    close(fd);
  }

  return status;
}

int main() {
  // Define events you want to measure (replace with appropriate event codes/names for your platform)
  // **Important:** Research events related to L1/
}
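
Once two suitable counters have been read, the average MLP is their ratio. The sketch below assumes one event that adds the number of outstanding misses every cycle and a second event that counts the cycles with at least one miss outstanding. On several Intel cores these are assumed to be L1D_PEND_MISS.PENDING and L1D_PEND_MISS.PENDING_CYCLES for the L1 fill buffers; the exact names and availability should be checked with `perf list`. The function name average_mlp is hypothetical.

```cpp
#include <cstdint>

// Average number of outstanding misses over the cycles in which at least one
// miss was outstanding. Returns 0.0 when no miss was ever pending.
inline double average_mlp(std::uint64_t occupancy_sum,
                          std::uint64_t busy_cycles) {
  if (busy_cycles == 0) return 0.0;
  return static_cast<double>(occupancy_sum) / static_cast<double>(busy_cycles);
}
```

For example, an occupancy sum of 1000 over 250 busy cycles gives an average of 4 outstanding misses. Both counters need to cover the same code region for the ratio to be meaningful.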