Streaming DMA in PCIe Linux kernel driver
I'm working on an FPGA driver for the Linux kernel. The code seems to work fine on x86, but on x86_64 I've got some problems. I implemented streaming DMA, so it goes like this:
get_user_pages(...);
for (...) {
sg_set_page();
}
pci_map_sg();
But pci_map_sg returned addresses like 0xbd285800, which are not aligned to PAGE_SIZE, so I can't send the full first page, because the PCIe specification says:
"Requests must not specify an Address/Length combination which causes a Memory Space access to cross a 4-KB boundary."
Is there any way to get aligned addresses, or did I just miss something important?
The first possibility that comes to mind is that the user buffer coming in does not start on a page boundary. If your start address is 0x800 bytes into a page, then the offset on your first sg_set_page call will be 0x800. This will produce a DMA address ending in 0x800. This is normal and not a bug.
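To make the offset arithmetic concrete, here is a minimal userspace sketch of how a buffer that starts mid-page breaks down into per-page (offset, length) pairs, which is roughly what the sg_set_page loop has to compute. It is not kernel code: struct sketch_sg_entry, sketch_fill_entries and SKETCH_PAGE_SIZE are hypothetical names invented for illustration, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Hypothetical stand-in for one scatterlist entry: which page of the
 * buffer, where the data starts inside it, and how many bytes it holds. */
struct sketch_sg_entry {
    size_t page_index;
    uint32_t offset;
    uint32_t length;
};

/* Break a user buffer (start address + length) into per-page entries.
 * Returns the number of entries written, or 0 if max_entries is too small. */
size_t sketch_fill_entries(uintptr_t user_addr, size_t len,
                           struct sketch_sg_entry *out, size_t max_entries)
{
    size_t n = 0;
    /* Offset of the first byte inside its page, e.g. 0x800. */
    uint32_t offset = (uint32_t)(user_addr & (SKETCH_PAGE_SIZE - 1));

    assert(out != NULL);
    while (len > 0) {
        /* Bytes left in the current page, capped by the remaining length. */
        uint32_t chunk = SKETCH_PAGE_SIZE - offset;
        if (chunk > len)
            chunk = (uint32_t)len;
        if (n == max_entries)
            return 0;
        out[n].page_index = n;
        out[n].offset = offset;
        out[n].length = chunk;
        n++;
        len -= chunk;
        offset = 0;  /* only the first page can start mid-page */
    }
    return n;
}
```

For a buffer whose address ends in 0x800, the first entry gets offset 0x800 and covers only the remainder of that page, so a mapped address such as 0xbd285800 is what you would expect to see.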
As pci_map_sg coalesces pages, this first segment may be larger than one page. The important thing is that pci_map_sg produces contiguous blocks of DMA-addressable memory, but it does not produce a list of low-level PCIe transactions. On x64 you are more likely to get a large region, because most x64 platforms have an IOMMU.
Many devices I deal with have DMA engines that allow me to specify a logical transfer length of several megabytes. Normally the DMA implementation in the PCIe endpoint is responsible for starting a new PCIe transaction at each 4 KB boundary, and the programmer can ignore that constraint. If resources in the FPGA are too limited to handle that, you can consider writing driver code to convert the Linux list of memory blocks into a (much longer) list of PCIe transactions.
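If you do go the driver-side route, the conversion could look something like the sketch below, which cuts one mapped segment into pieces that never cross a 4 KB boundary. It is plain C so it can be tried outside the kernel; struct sketch_xfer, sketch_split_segment and SKETCH_BOUNDARY are hypothetical names, and the descriptor layout your FPGA expects is an assumption. In a real driver the address and length would come from sg_dma_address() and sg_dma_len() for each mapped entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BOUNDARY 4096u  /* PCIe requests must not cross a 4 KB boundary */

/* Hypothetical descriptor for one transfer the FPGA engine can issue. */
struct sketch_xfer {
    uint64_t addr;
    uint32_t len;
};

/* Split the segment [dma_addr, dma_addr + seg_len) into pieces that stay
 * inside a single 4 KB window each. Returns the number of pieces written,
 * or 0 if max_out is too small. */
size_t sketch_split_segment(uint64_t dma_addr, uint64_t seg_len,
                            struct sketch_xfer *out, size_t max_out)
{
    size_t n = 0;

    assert(out != NULL);
    while (seg_len > 0) {
        /* Bytes from dma_addr up to the next 4 KB boundary. */
        uint64_t room = SKETCH_BOUNDARY - (dma_addr & (SKETCH_BOUNDARY - 1));
        uint64_t chunk = seg_len < room ? seg_len : room;

        if (n == max_out)
            return 0;
        out[n].addr = dma_addr;
        out[n].len = (uint32_t)chunk;  /* chunk is at most 4096 */
        n++;
        dma_addr += chunk;
        seg_len -= chunk;
    }
    return n;
}
```

Only the first and last pieces of a segment can be shorter than 4 KB; everything in between is a full aligned block. This sketch handles the 4 KB rule only. Actual requests are further limited by the device's Max_Payload_Size and Max_Read_Request_Size settings, so the hardware may still need to break these pieces down.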