
Accessing uncachable region using mmap and /proc/mtrr

I am playing around with mmap and /proc/mtrr in an effort to do some in-depth analysis of physical memory. Here is the basic idea of what I am trying to do and a summary of what I have done so far. I am on Ubuntu, kernel version 3.5.0-54-generic.

I am basically mmapping a specific physical address (using hints from /proc/iomem) and measuring the access latency to this physical address range. Here is what I have done so far:

  1. Created an entry in /proc/mtrr to make the physical address range that I will be mmapping uncachable (see the sketch after this list).
  2. mmapped the specific address using /dev/mem. I had to relax the security restrictions in order to read more than 1 MB from /dev/mem.
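For reference, here is a minimal sketch of how step 1 can be done programmatically; it assumes the documented /proc/mtrr write format ("base=... size=... type=uncachable") and must run as root. A plain echo into /proc/mtrr works just as well:

  #include <fcntl.h>
  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>

  int main(void) {
    // Mark the 8 KB at physical address 0x20000 uncachable. Equivalent to:
    //   echo "base=0x20000 size=0x2000 type=uncachable" >> /proc/mtrr
    const char *entry = "base=0x20000 size=0x2000 type=uncachable";
    int fd = open("/proc/mtrr", O_WRONLY);
    if (fd == -1) {
        perror("open /proc/mtrr");
        return 1;
    }
    if (write(fd, entry, strlen(entry)) == -1)
        perror("write /proc/mtrr");
    close(fd);
    return 0;
  }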

While I am able to execute the program with no issues, I have some doubts about whether the uncachable part actually works. Here is a snippet of the code I am using. Note that I used pseudocode from a prior research paper to create this code.

  #include <fcntl.h>
  #include <stdio.h>
  #include <sys/mman.h>
  #include <time.h>
  #include <unistd.h>

  int main(int argc, char *argv[]) {
    int fd;                 // file descriptor to open /dev/mem
    struct timespec t1, t2; // reserved for latency measurements

    fd = open("/dev/mem", O_RDWR|O_SYNC);
    if (fd == -1) {
        printf("\n Error opening /dev/mem");
        return 0;
    }
    // Map 8 KB of physical memory starting at offset 0x20000
    char *addr = (char*)mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0x20000);
    if (addr == MAP_FAILED) {
        printf("\n mmap() failed");
        return 0;
    }
    // Begin accessing
    char *addr1 = addr;
    char *addr2 = addr1 + 64; // one cache line ahead

    unsigned int i = 0;
    unsigned int j = 0;
    // Walk the mapping one cache line at a time, doing 500 dependent
    // read/write pairs per line: (8192/64)*500*2 accesses in total
    while (j < 8192) {
        i = 0;
        while (i < 500) {
            *addr1 = *addr2 + i;
            *addr2 = *addr1 + i;
            i = i + 1;
        }
        j = j + 64;
        addr2 = addr1 + j;
    }
    if (munmap(addr, 8192) == -1) {
        printf("\n Unmapping failed");
        return 0;
    }
    printf("\n Success......");
    return 0;
  }
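The snippet declares timespec variables but never uses them; here is a minimal sketch, under my assumption of the intended measurement, of how the access loop can be bracketed with clock_gettime to obtain the latency figure the experiment is after (on older glibc, link with -lrt):

    // Sketch: time the whole access loop with a monotonic clock, then
    // divide by the (8192/64)*500*2 = 128,000 accesses it performs.
    clock_gettime(CLOCK_MONOTONIC, &t1);
    /* ... the while (j < 8192) loop from above ... */
    clock_gettime(CLOCK_MONOTONIC, &t2);
    double ns = (t2.tv_sec - t1.tv_sec) * 1e9 + (t2.tv_nsec - t1.tv_nsec);
    printf("\n Average access latency: %.2f ns", ns / 128000.0);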

I use the offset 0x20000 based on the output of /proc/iomem, as shown below (showing only the relevant lines):

00000000-0000ffff : reserved
**00010000-0009e3ff : System RAM**
0009e400-0009ffff : RAM buffer
000a0000-000bffff : PCI Bus 0000:00
000a0000-000b0000 : PCI Bus 0000:20
000c0000-000effff : PCI Bus 0000:00

The following are the entries in /proc/mtrr:

reg00: base=0x0d3f00000 ( 3391MB), size=    1MB, count=1: uncachable
reg01: base=0x0d4000000 ( 3392MB), size=   64MB, count=1: uncachable
reg02: base=0x0d8000000 ( 3456MB), size=  128MB, count=1: uncachable
reg03: base=0x0e0000000 ( 3584MB), size=  512MB, count=1: uncachable
reg04: base=0x000020000 (    0MB), size=    8KB, count=1: uncachable

As you can see, the final entry makes the address region of interest uncachable.

While I have no problems running the code, I have the following concerns:

  1. Is it correct to pick that particular physical address range, denoted as System RAM, for reads/writes? My understanding is that this address range is used to store data and code. In addition, from reading /dev/mem using hexdump, I observe that the address region is uninitialized (set to 0).
  2. To check whether the accesses to the uncached region are actually uncached, I run perf stat -e cache-misses:u to measure how many cache misses occur. I get a number around 128,200. To me this confirms that the addresses are not cached and the accesses go to RAM, since the loop performs (8192/64)*500*2 = 128,000 accesses. I did the same perf exercise with a similar piece of code in which the mmap is replaced with a dynamically allocated character array of the same length (see the sketch after this list). In that case, perf stat reported far fewer cache misses.
  3. To double-check whether I am indeed bypassing the cache and going to memory, I changed the offset to another value within the System RAM range (say 0x80000) and ran the perf command to measure how many cache misses occur. The confusing part is that it reports back almost the same number of cache misses as the previous case (around 128,200). I would expect something much smaller, since I have not made that physical address region uncachable.
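For completeness, here is a minimal sketch of the cached baseline mentioned in point 2; the loop mirrors the mmap version above, with the mapping replaced by a heap allocation of the same length:

  #include <stdlib.h>

  int main(void) {
    // Cached baseline: identical access pattern over ordinary heap memory
    char *addr = malloc(8192);
    if (addr == NULL)
        return 1;
    char *addr1 = addr;
    char *addr2 = addr1 + 64;
    unsigned int i, j = 0;
    while (j < 8192) {
        for (i = 0; i < 500; i++) {
            *addr1 = *addr2 + i;
            *addr2 = *addr1 + i;
        }
        j = j + 64;
        addr2 = addr1 + j;
    }
    free(addr);
    return 0;
  }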

Any suggestions or feedback that would help me understand these observations would be appreciated.

Thanks

I think I figured it out. The man page says that with MAP_PRIVATE the changes are not reflected to the underlying file. On changing it to MAP_SHARED, with the entry enabled in /proc/mtrr, the numbers of cache misses and hits change significantly.
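In code, the fix is a one-line change to the mmap call in the snippet above:

    // MAP_SHARED makes stores go through to the underlying physical pages
    // rather than to private copy-on-write copies, so the uncachable MTRR
    // entry covering 0x20000 actually takes effect.
    char *addr = (char*)mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0x20000);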
