
QEMU how pcie_host converts physical address to pcie address

I am learning the implementation of QEMU, and I have a question. As we know, on real hardware, when the CPU reads a virtual address that maps to a PCI device, the PCI host is responsible for converting it into a PCI address. QEMU provides pcie_host.c to emulate the PCIe host. In this file, pcie_mmcfg_data_write is implemented, but there is nothing about converting a physical address into a PCI address.

I did a test in QEMU using gdb:

  • First, I added the edu device, which is a very simple PCI device, to QEMU.
  • When I tried to turn on Memory Space Enable (Mem- to Mem+) with setpci -s 00:02.0 04.b=2, QEMU stopped in the function pcie_mmcfg_data_write:
static void pcie_mmcfg_data_write(void *opaque, hwaddr mmcfg_addr,
                                  uint64_t val, unsigned len)
{
    PCIExpressHost *e = opaque;
    PCIBus *s = e->pci.bus;
    PCIDevice *pci_dev = pcie_dev_find_by_mmcfg_addr(s, mmcfg_addr);
    uint32_t addr;
    uint32_t limit;

    if (!pci_dev) {
        return;
    }
    addr = PCIE_MMCFG_CONFOFFSET(mmcfg_addr);
    limit = pci_config_size(pci_dev);
    pci_host_config_write_common(pci_dev, addr, limit, val, len);
}

It is obvious that the PCIe host uses this function to find the device and perform the config-space write. Using bt gives:

#0  pcie_mmcfg_data_write
    (opaque=0xaaaaac573f10, mmcfg_addr=65540, val=2, len=1)
    at hw/pci/pcie_host.c:39
#1  0x0000aaaaaae4e8a8 in memory_region_write_accessor
    (mr=0xaaaaac574520, addr=65540, value=0xffffe14703e8, size=1, shift=0, mask=255, attrs=...) 
    at /home/mrzleo/Desktop/qemu/memory.c:483
#2  0x0000aaaaaae4eb14 in access_with_adjusted_size
    (addr=65540, value=0xffffe14703e8, size=1, access_size_min=1, access_size_max=4, access_fn=
    0xaaaaaae4e7c0 <memory_region_write_accessor>, mr=0xaaaaac574520, attrs=...) at /home/mrzleo/Desktop/qemu/memory.c:544
#3  0x0000aaaaaae51898 in memory_region_dispatch_write
    (mr=0xaaaaac574520, addr=65540, data=2, op=MO_8, attrs=...)
    at /home/mrzleo/Desktop/qemu/memory.c:1465
#4  0x0000aaaaaae72410 in io_writex
    (env=0xaaaaac6924e0, iotlbentry=0xffff000e9b00, mmu_idx=2, val=2, 
    addr=18446603336758132740, retaddr=281473269319356, op=MO_8)
    at /home/mrzleo/Desktop/qemu/accel/tcg/cputlb.c:1084
#5  0x0000aaaaaae74854 in store_helper
    (env=0xaaaaac6924e0, addr=18446603336758132740, val=2, oi=2, retaddr=281473269319356, op=MO_8) 
    at /home/mrzleo/Desktop/qemu/accel/tcg/cputlb.c:1954
#6  0x0000aaaaaae74d78 in helper_ret_stb_mmu
    (env=0xaaaaac6924e0, addr=18446603336758132740, val=2 '\002', oi=2, retaddr=281473269319356) 
    at /home/mrzleo/Desktop/qemu/accel/tcg/cputlb.c:2056
#7  0x0000ffff9a3b47cc in code_gen_buffer ()
#8  0x0000aaaaaae8d484 in cpu_tb_exec
    (cpu=0xaaaaac688c00, itb=0xffff945691c0 <code_gen_buffer+5673332>)
    at /home/mrzleo/Desktop/qemu/accel/tcg/cpu-exec.c:172
#9  0x0000aaaaaae8e4ec in cpu_loop_exec_tb
    (cpu=0xaaaaac688c00, tb=0xffff945691c0 <code_gen_buffer+5673332>, 
    last_tb=0xffffe1470b78, tb_exit=0xffffe1470b70)
    at /home/mrzleo/Desktop/qemu/accel/tcg/cpu-exec.c:619
#10 0x0000aaaaaae8e830 in cpu_exec (cpu=0xaaaaac688c00)
    at /home/mrzleo/Desktop/qemu/accel/tcg/cpu-exec.c:732
#11 0x0000aaaaaae3d43c in tcg_cpu_exec (cpu=0xaaaaac688c00)
    at /home/mrzleo/Desktop/qemu/cpus.c:1405
#12 0x0000aaaaaae3dd4c in qemu_tcg_cpu_thread_fn (arg=0xaaaaac688c00)
    at /home/mrzleo/Desktop/qemu/cpus.c:1713
#13 0x0000aaaaab722c70 in qemu_thread_start (args=0xaaaaac715be0)
    at util/qemu-thread-posix.c:519
#14 0x0000fffff5af84fc in start_thread (arg=0xffffffffe3ff)
    at pthread_create.c:477
#15 0x0000fffff5a5167c in thread_start ()
    at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
  • Then I tried to access the address of edu with devmem 0x10000000; QEMU stopped in edu_mmio_read. Using bt:
(gdb) bt
#0  edu_mmio_read 
    (opaque=0xaaaaae71c560, addr=0, size=4) 
        at hw/misc/edu.c:187
#1  0x0000aaaaaae4e5b4 in memory_region_read_accessor
    (mr=0xaaaaae71ce50, addr=0, value=0xffffe2472438, size=4, shift=0, mask=4294967295, attrs=...)
    at /home/mrzleo/Desktop/qemu/memory.c:434
#2  0x0000aaaaaae4eb14 in access_with_adjusted_size
    (addr=0, value=0xffffe2472438, size=4, access_size_min=4, access_size_max=8, access_fn=
    0xaaaaaae4e570 <memory_region_read_accessor>, mr=0xaaaaae71ce50, attrs=...) 
    at /home/mrzleo/Desktop/qemu/memory.c:544
#3  0x0000aaaaaae51524 in memory_region_dispatch_read1 
(mr=0xaaaaae71ce50, addr=0, pval=0xffffe2472438, size=4, attrs=...)
    at /home/mrzleo/Desktop/qemu/memory.c:1385
#4  0x0000aaaaaae51600 in memory_region_dispatch_read 
(mr=0xaaaaae71ce50, addr=0, pval=0xffffe2472438, op=MO_32, attrs=...)
    at /home/mrzleo/Desktop/qemu/memory.c:1413
#5  0x0000aaaaaae72218 in io_readx
    (env=0xaaaaac6be0f0, iotlbentry=0xffff04282ec0, mmu_idx=0, 
    addr=281472901758976, retaddr=281473196263360, access_type=MMU_DATA_LOAD, op=MO_32) 
    at /home/mrzleo/Desktop/qemu/accel/tcg/cputlb.c:1045
#6  0x0000aaaaaae738b0 in load_helper
    (env=0xaaaaac6be0f0, addr=281472901758976, oi=32, retaddr=281473196263360, 
    op=MO_32, code_read=false, full_load=0xaaaaaae73c68 <full_le_ldul_mmu>) 
    at /home/mrzleo/Desktop/qemu/accel/tcg/cputlb.c:1566
#7  0x0000aaaaaae73ca4 in full_le_ldul_mmu 
(env=0xaaaaac6be0f0, addr=281472901758976, oi=32, retaddr=281473196263360)
    at /home/mrzleo/Desktop/qemu/accel/tcg/cputlb.c:1662
#8  0x0000aaaaaae73cd8 in helper_le_ldul_mmu 
(env=0xaaaaac6be0f0, addr=281472901758976, oi=32, retaddr=281473196263360)
    at /home/mrzleo/Desktop/qemu/accel/tcg/cputlb.c:1669
#9  0x0000ffff95e08824 in code_gen_buffer 
()
#10 0x0000aaaaaae8d484 in cpu_tb_exec 
(cpu=0xaaaaac6b4810, itb=0xffff95e086c0 <code_gen_buffer+31491700>)
    at /home/mrzleo/Desktop/qemu/accel/tcg/cpu-exec.c:172
#11 0x0000aaaaaae8e4ec in cpu_loop_exec_tb
    (cpu=0xaaaaac6b4810, tb=0xffff95e086c0 <code_gen_buffer+31491700>, 
    last_tb=0xffffe2472b78, tb_exit=0xffffe2472b70)
    at /home/mrzleo/Desktop/qemu/accel/tcg/cpu-exec.c:619
#12 0x0000aaaaaae8e830 in cpu_exec 
(cpu=0xaaaaac6b4810) at /home/mrzleo/Desktop/qemu/accel/tcg/cpu-exec.c:732
#13 0x0000aaaaaae3d43c in tcg_cpu_exec 
(cpu=0xaaaaac6b4810) at /home/mrzleo/Desktop/qemu/cpus.c:1405
#14 0x0000aaaaaae3dd4c in qemu_tcg_cpu_thread_fn 
(arg=0xaaaaac6b4810) 
    at /home/mrzleo/Desktop/qemu/cpus.c:1713
#15 0x0000aaaaab722c70 in qemu_thread_start (args=0xaaaaac541610) at util/qemu-thread-posix.c:519
#16 0x0000fffff5af84fc in start_thread (arg=0xffffffffe36f) at pthread_create.c:477
#17 0x0000fffff5a5167c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78

It seems that QEMU locates the edu device directly, and the PCIe host does nothing in this procedure. I wonder whether QEMU simply does not implement the conversion here and just uses MemoryRegion to achieve polymorphism. If not, what does QEMU's PCIe host do in this procedure?

QEMU uses a set of data structures called MemoryRegions to model the address space that a CPU sees (the detailed API is documented in part in the developer docs).

MemoryRegions can be built up into a tree, where at the "root" there is one 'container' MR which covers the whole 64-bit address space the guest CPU can see, and then MRs for blocks of RAM, devices, etc are placed into that root MR at appropriate offsets. Child MRs can also be containers which in turn contain further MRs. You can then find the MR corresponding to a given guest physical address by walking through the tree of MRs.
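For concreteness, here is a minimal sketch of how a device's MMIO region gets placed into a container region. The demo_* names, the "demo-mmio" string and the 0x10000000 offset are made up for illustration, but memory_region_init_io() and memory_region_add_subregion() are the real QEMU calls used for this kind of composition:

#include "qemu/osdep.h"
#include "exec/memory.h"

static uint64_t demo_read(void *opaque, hwaddr addr, unsigned size)
{
    /* Device register read callback; a real device would decode 'addr'. */
    return 0;
}

static void demo_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
{
    /* Device register write callback. */
}

static const MemoryRegionOps demo_ops = {
    .read = demo_read,
    .write = demo_write,
};

static void demo_map_device(MemoryRegion *root_mr, Object *owner, void *dev_state)
{
    MemoryRegion *mmio = g_new0(MemoryRegion, 1);

    /* An I/O MemoryRegion: accesses to it are dispatched to demo_ops. */
    memory_region_init_io(mmio, owner, &demo_ops, dev_state, "demo-mmio", 0x1000);

    /* Place it into the container region at guest physical 0x10000000;
     * looking that address up in the MR tree now resolves to this MR. */
    memory_region_add_subregion(root_mr, 0x10000000, mmio);
}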

The tree of MemoryRegions is largely built up statically when QEMU starts (because most devices don't move around), but it can also be changed dynamically in response to guest software actions. In particular, PCI works this way. When the guest OS writes to a PCI device BAR (which is in PCI config space) this causes QEMU's PCI host controller emulation code to place the MR corresponding to the device's registers into the MemoryRegion hierarchy at the correct place and offset (depending on what address the guest wrote to the BAR, ie where it asked for it to be mapped). Once this is done, the MR for the PCI device is like any other in the tree, and the PCI host controller code doesn't need to be involved in guest accesses to it.
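To connect this with the edu device from the question: on the device side, edu registers its MMIO MemoryRegion as BAR 0 in its realize function, roughly as below (abridged and paraphrased from hw/misc/edu.c; exact field names and the region size may differ between QEMU versions):

static void pci_edu_realize(PCIDevice *pdev, Error **errp)
{
    EduState *edu = EDU(pdev);

    /* The I/O region whose callbacks are edu_mmio_read()/edu_mmio_write()
     * (the functions visible in the second backtrace above). */
    memory_region_init_io(&edu->mmio, OBJECT(edu), &edu_mmio_ops, edu,
                          "edu-mmio", 1 * MiB);

    /* Register that region as BAR 0. When the guest programs the BAR and
     * sets Memory Space Enable, QEMU's PCI core maps &edu->mmio into the
     * MemoryRegion hierarchy at the programmed address; after that, guest
     * accesses reach edu_mmio_read()/edu_mmio_write() without any further
     * involvement of the PCI host bridge. */
    pci_register_bar(pdev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &edu->mmio);
}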

As a performance optimisation, QEMU doesn't actually walk down a tree of MRs for every access. Instead, we first "flatten" the tree into a data structure (a FlatView) that directly says "for this range of addresses, it will be this MR; for this range, this MR", and so on. Secondly, QEMU's TLB structure can directly cache mappings from "guest virtual address" to "specific memory region". On first access it will do an emulated guest MMU page table walk to get from the guest virtual address to the guest physical address, and then it will look that physical address up in the FlatView to find either the real host RAM or the MemoryRegion that is mapped there, and it will add the "guest VA -> this MR" mapping to the TLB cache. Future accesses will hit in the TLB and need not repeat the work of converting to a physaddr and then finding the MR in the flatmap. This is what is happening in your backtrace -- the io_readx() function is passed the guest virtual address and also the relevant part of the TLB data structure, and it can then directly find the target MR and the offset within it, so it can call memory_region_dispatch_read() to dispatch the read request to that MR's read callback function. (If this was the first access, the initial "MMU walk + FlatView lookup" work will have just been done in load_helper() before it calls io_readx().)
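The following is a deliberately simplified illustration of the flattening idea only; the struct and function names are invented for this sketch (QEMU's real implementation is the FlatView/FlatRange machinery in memory.c), but it shows why a flattened view makes address resolution a single range lookup instead of a tree walk:

#include <stdint.h>
#include <stddef.h>

typedef struct MemoryRegion MemoryRegion;   /* opaque for this sketch */
typedef uint64_t hwaddr;

typedef struct {
    hwaddr start;        /* guest physical start of this range */
    hwaddr size;         /* length of the range */
    MemoryRegion *mr;    /* region that backs this range */
    hwaddr mr_offset;    /* offset of 'start' within that region */
} FlatRangeSketch;

typedef struct {
    FlatRangeSketch *ranges;  /* non-overlapping, sorted by 'start' */
    size_t nranges;
} FlatViewSketch;

/* Resolve a guest physical address to the backing MR and the offset
 * within it; returns NULL if the address is unmapped. */
static MemoryRegion *flatview_lookup_sketch(const FlatViewSketch *fv,
                                            hwaddr gpa, hwaddr *offset)
{
    for (size_t i = 0; i < fv->nranges; i++) {
        const FlatRangeSketch *fr = &fv->ranges[i];
        if (gpa >= fr->start && gpa - fr->start < fr->size) {
            *offset = fr->mr_offset + (gpa - fr->start);
            return fr->mr;
        }
    }
    return NULL;
}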

Obviously, all this caching also implies that QEMU tracks events which mean the cached data is no longer valid so we can throw it away (e.g. if the guest writes to the BAR again to unmap it or to map it somewhere else; or if the MMU settings or page tables are changed to alter the guest virtual-to-physical mapping).
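A hedged sketch of the remapping side of this, just to show the mechanism: the helper name and the simplified logic below are invented, but memory_region_del_subregion()/memory_region_add_subregion() are the real calls for pulling a region out of the tree and re-inserting it elsewhere (QEMU's actual BAR-mapping logic lives in pci_update_mappings() in hw/pci/pci.c and is more involved). This sketch assumes QEMU's internal headers are available, as in the earlier examples.

/* Hypothetical helper: re-map a device MMIO region when the guest
 * reprograms a BAR. Changing the MR tree invalidates the derived
 * FlatView and any TLB entries that pointed at the old mapping, so
 * stale cached translations are discarded. */
static void demo_remap_bar(MemoryRegion *system_memory, MemoryRegion *bar_mr,
                           hwaddr new_base, bool mem_space_enabled)
{
    /* Pull the region out of the tree (un-maps the old address). */
    memory_region_del_subregion(system_memory, bar_mr);

    if (mem_space_enabled) {
        /* Re-insert it at the newly programmed address. */
        memory_region_add_subregion(system_memory, new_base, bar_mr);
    }
}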
