
Linux Driver and API architecture for a data acquisition device

We're trying to write a driver/API for a custom data acquisition device which captures several "channels" of data. For the sake of discussion, let's assume this is a multi-channel video capture device. The device is connected to the system via an x8 PCIe Gen 1 link, which has a theoretical throughput of 16 Gbps. Our actual data rate will be around 2.8 Gbps (~350 MB/s).

Because of the data rate requirement, we think we have to be careful about the driver/API architecture. We've already implemented a descriptor-based DMA mechanism and the associated driver. For example, we can start a 256 KB DMA transaction from the device and it completes successfully. However, in this implementation we're only capturing the data in the kernel driver and then dropping it; we aren't streaming the data to user space at all. Essentially, this is just a small DMA test implementation.

We think we have to separate the problem into three sections:

  1. Kernel driver
  2. Userspace API
  3. User code

The acquisition device has a register in the PCIe address space which indicates whether there is data to read for any channel. So our kernel driver must poll this bit-vector, and when it sees a bit set, start a DMA transaction. The user application, however, does not need to know about all these individual DMA transactions until an entire chunk of data is ready (for example, assume the device provides us with 16 lines of video data per transaction, but we need to notify the user only when the entire video frame is ready). We need to transfer only entire frames to the user application.
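As a concrete illustration, here is a minimal sketch of what that polling loop might look like in a kernel thread. Everything device-specific is an assumption: the register offsets (CH_READY_REG, DMA_ADDR_REG, DMA_LEN_REG, DMA_START_REG) and the struct acq_dev fields are made-up placeholders for whatever the hardware and driver actually define.

```c
#include <linux/io.h>
#include <linux/kernel.h>
#include <linux/kthread.h>
#include <linux/delay.h>

#define CH_READY_REG   0x00  /* hypothetical: one "data ready" bit per channel */
#define DMA_ADDR_REG   0x08  /* hypothetical: bus address the device DMAs into */
#define DMA_LEN_REG    0x10  /* hypothetical: transfer length in bytes */
#define DMA_START_REG  0x18  /* hypothetical: write 1 to kick the DMA engine */

struct acq_dev {
    void __iomem *bar;          /* mapped PCIe BAR */
    dma_addr_t    dma_handle;   /* bus address of a driver-allocated buffer */
    u32           transfer_len; /* e.g. one transaction's worth of lines */
};

static int acq_poll_thread(void *arg)
{
    struct acq_dev *dev = arg;

    while (!kthread_should_stop()) {
        u32 ready = ioread32(dev->bar + CH_READY_REG);

        if (!ready) {
            usleep_range(50, 100);  /* yield instead of burning the CPU */
            continue;
        }

        /* Kick one DMA transaction for the ready data. */
        iowrite32(lower_32_bits(dev->dma_handle), dev->bar + DMA_ADDR_REG);
        iowrite32(dev->transfer_len, dev->bar + DMA_LEN_REG);
        iowrite32(1, dev->bar + DMA_START_REG);

        /* ... wait for completion, then accumulate into the frame buffer ... */
    }
    return 0;
}
```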

Here was our first attempt:

  1. Our user-side API allows a user application to register a callback function for a "channel".
  2. The user-side API has a "start" function which the user application can call; it uses ioctl to send a start message to the kernel driver.
  3. In the kernel driver, upon receiving the start message, we start a kernel thread which continuously monitors the "data ready" bit-vector and, when it sees new data, copies it over to a driver-allocated (kmalloc) buffer. It keeps doing this until the size of the collected data reaches the "frame size".
  4. At this point a custom Linux signal (similar to SIGINT, SIGHUP, etc.) is sent to the process using the driver. Our API catches this signal and then calls the appropriate user callback function.
  5. The user callback function calls a function in the API (transfer_data), which uses an ioctl call to send a userspace buffer address to the kernel; the kernel completes the data transfer by doing a copy_to_user of the channel's frame data to userspace (a sketch of this kernel-side path follows the list).
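For reference, the kernel side of step 5 amounts to something like the sketch below. The ioctl command ACQ_IOC_XFER, struct acq_xfer, and the frame_buf/frame_len fields are hypothetical names, not anything from the real driver; the point is simply what the copy path looks like.

```c
#include <linux/fs.h>
#include <linux/uaccess.h>

struct acq_xfer {
    int           channel;  /* which channel's frame to fetch */
    void __user  *buf;      /* userspace destination buffer */
    size_t        len;      /* size of that buffer */
};

static long acq_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
{
    struct acq_dev *dev = filp->private_data;
    struct acq_xfer xfer;

    switch (cmd) {
    case ACQ_IOC_XFER:
        if (copy_from_user(&xfer, (void __user *)arg, sizeof(xfer)))
            return -EFAULT;
        if (xfer.len < dev->frame_len)
            return -EINVAL;
        /* Second copy of every frame: device -> kmalloc buffer (in the
         * poll thread) and now kmalloc buffer -> user buffer here. */
        if (copy_to_user(xfer.buf, dev->frame_buf[xfer.channel],
                         dev->frame_len))
            return -EFAULT;
        return 0;
    }
    return -ENOTTY;
}
```

Note that each frame makes two full passes through memory (device to kmalloc buffer, then kmalloc buffer to user), on top of a signal round-trip per frame.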

All of the above works OK, except that the performance is abysmal: we can only achieve a transfer rate of about 2 MB/s. We need to completely rewrite this, and we're open to any suggestions or pointers to examples.

Other notes:

  • Unfortunately, we cannot change anything in the hardware device, so we must poll for the "data-ready" bit and start DMA based on that bit.

  • Some people suggested looking at the Infiniband drivers as a reference, but we're completely lost in that code.

You're probably way past this now, but if not, here's my 2p.

  1. It's hard to believe that your card can't generate interrupts when it has transferred data. It's got a DMA engine, and it can handle 'descriptors', which are presumably elements of a scatter-gather list. I'll assume that it can generate a PCIe 'interrupt'; YMMV.
  2. Don't bother trawling the kernel for existing similar drivers. You might get lucky, but I suspect not.

You need to write a blocking read, to which you supply a large memory buffer. The driver read op (a) gets a list of user pages for your user buffer and locks them in memory (get_user_pages); (b) creates a scatter list with pci_map_sg; (c) iterates through the list (for_each_sg); (d) for each entry, writes the corresponding physical bus address and data length to the DMA controller as what I presume you're calling a 'descriptor'.
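A sketch of that read op follows, with several caveats: get_user_pages has changed signature across kernel versions (the get_user_pages_fast variant is shown), dma_map_sg is the modern spelling of pci_map_sg, the user buffer is assumed page-aligned for brevity, and the descriptor registers (DESC_ADDR_REG, DESC_LEN_REG) plus the pdev/bar/dma_done fields in struct acq_dev are invented. Treat this as the shape of the code rather than something to paste in.

```c
#include <linux/mm.h>
#include <linux/pci.h>
#include <linux/scatterlist.h>
#include <linux/dma-mapping.h>
#include <linux/completion.h>

static ssize_t acq_read(struct file *filp, char __user *buf,
                        size_t count, loff_t *ppos)
{
    struct acq_dev *dev = filp->private_data;
    int nr_pages = DIV_ROUND_UP(count, PAGE_SIZE);
    struct page **pages;
    struct sg_table sgt;
    struct scatterlist *sg;
    int i, pinned, nents;
    ssize_t ret = count;

    pages = kvmalloc_array(nr_pages, sizeof(*pages), GFP_KERNEL);
    if (!pages)
        return -ENOMEM;

    /* (a) pin the user buffer's pages in memory */
    pinned = get_user_pages_fast((unsigned long)buf, nr_pages,
                                 FOLL_WRITE, pages);
    if (pinned != nr_pages) {
        ret = -EFAULT;
        goto out_unpin;
    }

    /* (b) build a scatter list and map it for the device */
    if (sg_alloc_table_from_pages(&sgt, pages, pinned,
                                  offset_in_page(buf), count, GFP_KERNEL)) {
        ret = -ENOMEM;
        goto out_unpin;
    }
    nents = dma_map_sg(&dev->pdev->dev, sgt.sgl, sgt.nents, DMA_FROM_DEVICE);

    /* (c) + (d) walk the list, handing each segment to the DMA engine
     * as one 'descriptor': a bus address plus a length */
    for_each_sg(sgt.sgl, sg, nents, i) {
        iowrite32(lower_32_bits(sg_dma_address(sg)), dev->bar + DESC_ADDR_REG);
        iowrite32(sg_dma_len(sg), dev->bar + DESC_LEN_REG);
    }

    /* block here until the card's interrupt signals list completion */
    if (wait_for_completion_interruptible(&dev->dma_done))
        ret = -ERESTARTSYS;

    dma_unmap_sg(&dev->pdev->dev, sgt.sgl, sgt.nents, DMA_FROM_DEVICE);
    sg_free_table(&sgt);
out_unpin:
    for (i = 0; i < pinned; i++) {
        set_page_dirty_lock(pages[i]);  /* the device wrote these pages */
        put_page(pages[i]);
    }
    kvfree(pages);
    return ret;
}
```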

The card now has a list of descriptors which correspond to the physical bus addresses of your large user buffer. When data arrives at the card, it writes it directly into user space, into your user buffer, while your user-level read is still blocked. When it has finished the descriptor list, the card has to be able to interrupt, or it's useless. The driver responds to the interrupt and unblocks your user-level read.
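The interrupt side that does the unblocking can be very small. A sketch, assuming an MSI (or legacy) vector set up with pci_alloc_irq_vectors/request_irq, a struct completion dma_done in the driver state initialised with init_completion() at probe time, and a made-up IRQ_ACK_REG for acknowledging the device:

```c
#include <linux/interrupt.h>
#include <linux/completion.h>
#include <linux/pci.h>

static irqreturn_t acq_irq_handler(int irq, void *data)
{
    struct acq_dev *dev = data;

    /* Acknowledge/clear the interrupt on the device (register invented). */
    iowrite32(1, dev->bar + IRQ_ACK_REG);

    /* Wake the read() blocked in wait_for_completion_interruptible(). */
    complete(&dev->dma_done);
    return IRQ_HANDLED;
}

static int acq_setup_irq(struct acq_dev *dev)
{
    int ret = pci_alloc_irq_vectors(dev->pdev, 1, 1,
                                    PCI_IRQ_MSI | PCI_IRQ_LEGACY);
    if (ret < 0)
        return ret;
    return request_irq(pci_irq_vector(dev->pdev, 0), acq_irq_handler, 0,
                       "acq", dev);
}
```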

And that's it. The details are nasty, of course, and poorly documented, but that should be the basic architecture. If you really haven't got interrupts, you can set up a timer in the kernel to poll for completion of the transfer, but if it is really a custom card, you should get your money back.
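For completeness, that interrupt-less fallback might look like this with the timer_setup/mod_timer API (kernels 4.15+); DMA_DONE_REG and the poll_timer field are again invented:

```c
#include <linux/timer.h>

static void acq_poll_timer_fn(struct timer_list *t)
{
    struct acq_dev *dev = from_timer(dev, t, poll_timer);

    if (ioread32(dev->bar + DMA_DONE_REG)) {
        complete(&dev->dma_done);   /* unblock the waiting read() */
        return;                     /* one-shot: do not re-arm */
    }
    mod_timer(&dev->poll_timer, jiffies + 1);  /* re-check next tick */
}

/* Arm it right after programming the descriptor list:
 *
 *   timer_setup(&dev->poll_timer, acq_poll_timer_fn, 0);
 *   mod_timer(&dev->poll_timer, jiffies + 1);
 */
```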
