Streaming DMA in a PCIe Linux kernel driver

Question

I'm working on an FPGA driver for the Linux kernel. The code seems to work fine on x86, but on x86_64 I've run into some problems. I implemented streaming DMA, so it goes like this:

get_user_pages(...);
for (...) {
    sg_set_page();
}
pci_map_sg();

But pci_map_sg returned addresses like 0xbd285800, which are not aligned to PAGE_SIZE, so I can't send the full first page, because the PCIe specification says:

"Requests must not specify an Address/Length combination which causes a Memory Space access to cross a 4-KB boundary."

Is there any way to get aligned addresses, or did I just miss something important?

Source code of the DMA.

Answer 1

The first possibility that comes to mind is that the incoming user buffer does not start on a page boundary. If your start address is 0x800 bytes into a page, then the offset on your first sg_set_page call will be 0x800, which produces a DMA address ending in 0x800. This is normal, and not a bug.
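To make the offset arithmetic concrete, here is a minimal user-space sketch. It assumes 4 KiB pages, and the names SKETCH_PAGE_SIZE, first_page_offset and first_page_len are made up for illustration; they are not kernel APIs. It only shows how the in-page offset of the user address carries over to the first scatterlist entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. In a driver you would use PAGE_SIZE instead. */
#define SKETCH_PAGE_SIZE 4096u

/* Offset of the user address inside its page. This is the value you would
 * pass as the offset argument of sg_set_page() for the first page. */
static inline size_t first_page_offset(uintptr_t user_addr)
{
    /* The mask trick below only works for power-of-two page sizes. */
    assert((SKETCH_PAGE_SIZE & (SKETCH_PAGE_SIZE - 1)) == 0);
    return (size_t)(user_addr & (SKETCH_PAGE_SIZE - 1));
}

/* Number of buffer bytes that live in the first page. */
static inline size_t first_page_len(uintptr_t user_addr, size_t total_len)
{
    size_t room = SKETCH_PAGE_SIZE - first_page_offset(user_addr);
    return total_len < room ? total_len : room;
}
```

With a buffer starting at an address ending in 0x800, the first entry gets offset 0x800 and at most 0x800 bytes, which is why the mapped DMA address also ends in 0x800.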

Because pci_map_sg coalesces pages, this first segment may be larger than one page. The important point is that pci_map_sg produces contiguous blocks of DMA-addressable memory; it does not produce a list of low-level PCIe transactions. On x64 you are more likely to get a large region, because most x64 platforms have an IOMMU.
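As a rough illustration of what coalescing means, the following toy model merges entries whose address ranges are back to back. It is not the kernel's implementation (the real mapping code also deals with IOMMU translation and segment size limits); struct seg and coalesce_segments are hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one mapped region (bus address + length). */
struct seg {
    uint64_t addr;
    uint64_t len;
};

/* Merge, in place, every entry that starts exactly where the previous
 * one ends. Returns the number of entries left after merging. */
static size_t coalesce_segments(struct seg *s, size_t n)
{
    size_t out = 0;
    size_t i;

    assert(s != NULL || n == 0);
    if (n == 0)
        return 0;

    for (i = 1; i < n; i++) {
        if (s[out].addr + s[out].len == s[i].addr)
            s[out].len += s[i].len;   /* contiguous: grow current entry */
        else
            s[++out] = s[i];          /* gap: start a new entry */
    }
    return out + 1;
}
```

In this model a partial first page at 0xbd285800 followed by the full page at 0xbd286000 becomes a single 0x1800-byte region, which is larger than one page and not page aligned.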

Many devices I deal with have DMA engines that let me specify a logical transfer length of several megabytes. Normally the DMA implementation in the PCIe endpoint is responsible for starting a new PCIe transaction at each 4 KB boundary, and the programmer can ignore that constraint. If resources in the FPGA are too limited to handle that, you can consider writing driver code that converts the Linux list of memory blocks into a (much longer) list of PCIe transactions.
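If you do go the driver-side route, the conversion could look roughly like the sketch below. It is plain C that models one mapped region as an address and a length; in a real driver those values would come from sg_dma_address() and sg_dma_len() on each mapped entry. The names PCIE_BOUNDARY, struct dma_chunk and split_at_4k are assumptions made for this example, and the sketch only handles the 4 KB rule, not any other transfer size limit your engine may have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCIE_BOUNDARY 4096u   /* the 4 KB boundary from the PCIe rule */

/* Hypothetical descriptor for one transfer handed to the FPGA. */
struct dma_chunk {
    uint64_t addr;
    uint32_t len;
};

/* Split [addr, addr + len) into chunks that never cross a 4 KB boundary.
 * Stores at most max_chunks entries in out and returns how many chunks
 * the region needs, so the caller can detect a too-small array. */
static size_t split_at_4k(uint64_t addr, uint64_t len,
                          struct dma_chunk *out, size_t max_chunks)
{
    size_t n = 0;

    assert(out != NULL || max_chunks == 0);
    while (len > 0) {
        /* Bytes left before the next 4 KB boundary. */
        uint64_t room = PCIE_BOUNDARY - (addr & (PCIE_BOUNDARY - 1));
        uint32_t this_len = (uint32_t)(len < room ? len : room);

        if (n < max_chunks) {
            out[n].addr = addr;
            out[n].len = this_len;
        }
        n++;
        addr += this_len;
        len -= this_len;
    }
    return n;
}
```

For a region starting at 0xbd285800 with length 0x2000, this yields three chunks: 0x800 bytes up to the first boundary, one full 0x1000-byte page, and a final 0x800 bytes.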

Answer 1: score 3, accepted, 2012-02-22 09:06:48
