
PCIe - DMA: Consistent vs. Streaming Memory

Currently I'm adding DMA to my PCIe driver for Linux. While reading through the documentation, I found that it mentions consistent (or coherent) memory, set up through the API:

pci_set_consistent_dma_mask(...)

but it never really explains why to use it or what it does. It seems to recommend calling the function as a best practice and for future proofing. The best I can gather is that consistent DMA memory does not have cache effects, and that data is exchanged between the device (FPGA) and the CPU without any software/driver intervention once it is set up correctly (assuming I read correctly). So my questions are:

  1. Assuming a PCIe device does not require consistent memory then why would anyone use it, or in what cases is consistent memory used?
  2. If I use consistent memory, do I not need to implement an interrupt in the PCIe driver for DMA? If so, how do the userspace code and the device know a transfer has occurred?
  3. If I transfer a lot of small packets, ~50 bytes, continuously and on occasion larger packets, ~6 kB, which DMA memory is better: consistent or streaming?

Think about it this way: "consistent" means the memory is automatically coherent between the CPU and the bus without doing anything to specifically synchronize it. For example - say I have a memory ring for inbound and outbound packets. Its lifespan will be the entire time the system is in use, and I'm going to be checking it all the time. I want this to be always consistent, because if it isn't I would have to (manually) flush or synchronize the caches, and if this were costly and I had to do it every time I touched the ring, it would be a nightmare.

On the other hand - let's take a single data buffer I'm transferring. It's kind of a "one off" deal. I can let the device transfer it - and maybe it takes many PCI cycles to complete the DMA. And maybe the buffer is inconsistent during that time. That's okay - when the transfer is done I can flush/sync the caches to force consistency. If that takes a tiny bit of extra time - no problem - because I'm only doing it once.

So you might ask "why not make everything consistent?" The answer is that there is generally some overhead in keeping things consistent. Depending on the architecture, this could be significant. For such cases, there are provisions for inconsistent (streaming) mappings, which don't maintain cache consistency (but require an explicit sync). So allowing an inconsistent transfer could gain you some performance.

Remember too - there are some cases where you would never need any consistency. For example - reading a buffer from a network device to memory, then writing that memory to a disk controller. This data may never be read or used by the CPU at all - so why bother placing any overhead on the CPU cache to track it?

As for your comment about the "interrupt" - this is kind of odd. In a "normal" case you might have a control structure in consistent memory (like Tx/Rx rings) which you could poll to tell whether a transaction is done. But the actual data transferred would be in different memory, which could be streaming (non-consistent).
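
To make the polling idea concrete, here is a rough userspace model of such a ring. It is not kernel code and not any particular device's layout: `struct rx_desc`, the `DESC_DONE` flag, `ring_poll()` and `fake_device_complete()` are all names made up for illustration. In a real driver the ring would live in memory obtained from something like `dma_alloc_coherent()`, and the device itself would set the status flag.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 4
#define DESC_DONE 0x1u  /* hypothetical "device finished this slot" flag */

/* Hypothetical descriptor layout; real hardware defines its own. */
struct rx_desc {
    volatile uint32_t status; /* written by the device, polled by the CPU */
    uint32_t length;          /* bytes the device placed in the data buffer */
};

struct rx_ring {
    struct rx_desc desc[RING_SIZE];
    unsigned int next;        /* next slot the CPU expects to complete */
};

/* Stand-in for the device: marks one slot as completed. */
static void fake_device_complete(struct rx_ring *r, unsigned int slot,
                                 uint32_t len)
{
    assert(slot < RING_SIZE);
    r->desc[slot].length = len;
    r->desc[slot].status |= DESC_DONE;
}

/* Poll the ring once: returns the completed length, or -1 if the next
 * slot has not been completed yet. */
static int ring_poll(struct rx_ring *r)
{
    struct rx_desc *d = &r->desc[r->next];
    int len;

    if (!(d->status & DESC_DONE))
        return -1;

    len = (int)d->length;
    d->status = 0;                        /* hand the slot back to the device */
    r->next = (r->next + 1) % RING_SIZE;
    return len;
}
```

With a zero-initialized `struct rx_ring`, `ring_poll()` returns -1 until `fake_device_complete()` marks slot 0 as done; after that it returns the recorded length and moves on to slot 1. The payload the descriptor refers to would sit in a separate buffer, which is the part that could use a streaming mapping.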

1) Imagine you want to transfer a huge amount of data over PCIe at a high rate. You have to use a scatter/gather list, and you can use consistent memory to prepare this list for the FPGA, so the FPGA can read the list very quickly and then do the transfers.

2) Of course you need interrupts; otherwise you have to use polling, which is very slow and unreliable.

3) If you use a larger consistent memory region, you can minimize interrupt/polling overhead, so transfers are faster, but Windows usually doesn't allow you to allocate large consistent memory.
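
As a rough illustration of the scatter/gather list in point 1, the sketch below splits one buffer into entries that never cross a page boundary. It is a plain userspace model, not driver code: `struct sg_entry`, `build_sg_list()` and the 4096-byte `PAGE_SZ` are assumptions for the example, and addresses are treated as one flat range for simplicity. In a real Linux driver the per-entry bus addresses would come from the DMA mapping layer (for instance `dma_map_sg()`), and the entry format is whatever the FPGA expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096u  /* assumed page size for this sketch */

/* Hypothetical list entry: one contiguous chunk for the device to move. */
struct sg_entry {
    uint64_t addr;
    uint32_t len;
};

/* Split [addr, addr + len) into chunks that do not cross a page boundary.
 * Returns the number of entries written, or -1 if 'out' is too small. */
static int build_sg_list(uint64_t addr, size_t len,
                         struct sg_entry *out, size_t max_entries)
{
    size_t n = 0;

    assert(out != NULL);
    while (len > 0) {
        /* Bytes left before the next page boundary. */
        size_t chunk = PAGE_SZ - (size_t)(addr % PAGE_SZ);

        if (chunk > len)
            chunk = len;
        if (n == max_entries)
            return -1;

        out[n].addr = addr;
        out[n].len = (uint32_t)chunk;
        n++;

        addr += chunk;
        len -= chunk;
    }
    return (int)n;
}
```

For example, a 6144-byte buffer starting at address 0x1F00 comes out as three entries of 256, 4096 and 1792 bytes. The finished array is the kind of small, long-lived structure that fits well in consistent memory, while the data pages themselves can be mapped as streaming.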
