Linux kernel device driver write callback writes data across the entire memory space allocated to the device driver

Question

I have a Xilinx SoC and have created a simple multiplier on the programmable logic in Verilog. The multiplier takes two 16-bit inputs, multiplies them, and returns a 32-bit output. The digital design has been packaged and linked to the processor system (PS) within the SoC via an AXI-Lite interface. The Xilinx tools have auto-generated a device tree entry for this design, so that a custom Linux device driver can be created to interact with it (i.e. the PS will treat it just like an external hardware device connected to the ARM processor).

The device tree generated looks like this:

/ {
    amba_pl: amba_pl@0 {
        #address-cells = <2>;
        #size-cells = <2>;
        compatible = "simple-bus";
        ranges ;
        multi2_0: multi2@a0000000 {
            clock-names = "s00_axi_aclk";
            clocks = <&zynqmp_clk 71>;
            compatible = "xlnx,multi2-1.0";
            reg = <0x0 0xa0000000 0x0 0x10000>;
            xlnx,s00-axi-addr-width = <0x4>;
            xlnx,s00-axi-data-width = <0x20>;
        };
    };
};

So from the device tree we can see that the multiplier ("multi2-1.0") has a physical base address of 0xa0000000 (with a region size of 0x10000), an AXI address width of 0x4, and a data width of 32 bits.

So, from the device driver's point of view, specifically in the write callback function, I am writing a 32-bit number to the virtual memory address that was returned by the ioremap() function.

A sanity check was done on the mapping from the virtual address to the physical address, and it appears to be correct, with no errors (some memory-related code snippets from the driver are shown below):

struct simpmod_local {
    int irq;
    unsigned long mem_start;
    unsigned long mem_end;
    void __iomem *base_addr;
};
struct simpmod_local *lp = NULL;

......

static int simpmod_probe(struct platform_device *pdev)
{ .....
lp->base_addr = ioremap(lp->mem_start, lp->mem_end - lp->mem_start + 1);
...
dev_info(dev,"simpmod at 0x%08x mapped to 0x%08x, irq=%d\n",
        (unsigned int __force)lp->mem_start,
        (unsigned int __force)lp->base_addr,
        lp->irq); 
....
}

The write callback function (as of now) just takes a 32-bit number and places it into memory. However, the read callback function reads back that exact same number, even though I am reading from the base address plus an offset of 0x20. I have tried changing the offset value, but regardless, it keeps reading that same number.

My intuition tells me that a read from a different memory address should return either a garbage value or zero, and that it's very unlikely to return the same value that was written to the base address. Any ideas as to why the written data is copied across the entire allocated memory space?

The same thing happens with devmem from the shell: after running <devmem 0xa0000000 w 52>, executing <devmem 0xa0000020 w> or <devmem 0xa0000040 w> (and so on) produces the output 52.

The write callback function looks like this:

static ssize_t dev_write(struct file *fil, const char *buf, size_t len, loff_t *off){
  sscanf (buf,"%d,%d",&operand_1,&operand_2);
  ker_buf[len] = 0 ;
  iowrite32((unsigned int) operand_1, lp->base_addr);
  return len;
}

The full project code (with minor changes) can be found at https://forums.xilinx.com/t5/Embedded-Linux/Memory-Replications-during-write-call-back-function-in-Linux/mp/1212405

Answer 1

Caveat: This isn't so much a solution as some observations and things to try [in no particular order].

At present, you've got multiple potential sources of error: bad H/W logic or an incorrect device driver.

From the linked driver code, most return statements return an error code (e.g. -ENOMEM), but some return -1. This is inconsistent.

As I mentioned in my comments, you've got a bunch of globals. There is no interthread locking. So, you could have race conditions.

I presume you're booting petalinux. And, it is working as long as you don't access your device. This is a big deal [in a good way].

I'm assuming that you're communicating with it via a serial cable from your development system [running (e.g.) minicom] to an onboard UART. So, you get a login prompt and/or shell.

This means that the UART driver source [and the corresponding dtb/dts] is available. You can use that as your reference driver. Or, something else, like the GPIO driver.

I notice that you mentioned ZYNQ [which is a fairly popular Xilinx FPGA chip]. I'll assume that you're also using a standard SDK board with a ZYNQ chip on it. So, Vivado will already know about the board interconnect/layout.

And, I assume that Vivado is able to pass off the board definition to Xilinx's S/W SDK/builder, so that it can build a compatible petalinux kernel.

I have never seen a case where a value is written and read back correctly, but is also replicated throughout memory.

This means that the address matching logic in your device is responding not merely to its assigned address range, but also to many more addresses that it shouldn't. There could be overlap with other devices, and they could be contending/racing.

I'm no Vivado expert, but...

From your link, looking at the .png of one of the Vivado windows, it says that the AXI BASEADDR is 0xFFFFFFFF and that AXI HIGHADDR is 0x00000000. Both have a blue i on them.

These are highly suspicious to me because I think these values should match the values in the DTB entry. And, the BASEADDR value makes no sense to me.
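To make the mismatch concrete, here is a small user-space sketch that compares the window implied by the DTB reg entry (base 0xa0000000, size 0x10000) with the BASEADDR/HIGHADDR pair shown in the screenshot. The struct and function names are hypothetical, and the code assumes a nonzero region size. Whether Vivado shows those two values as unset placeholders is something I don't know; the sketch only shows that, taken literally, they describe an empty range.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: an inclusive [base, high] address window. */
struct addr_window {
    uint64_t base;
    uint64_t high;
};

/* Build a window from a device-tree style (base, size) pair.
 * Assumes size > 0. */
struct addr_window window_from_reg(uint64_t base, uint64_t size)
{
    struct addr_window w = { base, base + size - 1 };
    return w;
}

/* A window is only meaningful if base does not exceed high. */
bool window_is_sane(struct addr_window w)
{
    return w.base <= w.high;
}

/* True if addr falls inside a sane window. */
bool window_contains(struct addr_window w, uint64_t addr)
{
    return window_is_sane(w) && addr >= w.base && addr <= w.high;
}
```

With the DTB values, the window is 0xa0000000 to 0xa000ffff and contains 0xa0000020. With BASEADDR 0xFFFFFFFF and HIGHADDR 0x00000000, window_is_sane() is false, so no address matches.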

I'm wondering if the DTB could have been generated with some sane address while the actual H/W logic that was generated uses a different one.

This could easily cause all the symptoms you're seeing.

One thing that might help is to add chipscope to the H/W design so you can debug your H/W logic and/or observe any access to a given port/address range.

You're using copy_to_user et al. But, this can fail and you're not checking the error code. I'd also do a printk on the arguments being passed.

There is no guarantee that the len value passed to dev_read/dev_write is sufficient to contain the transfer size. In dev_read, you do ioread32. But, then you do: int n = sprintf(ker_buf, "%d\n", read_val); You're not checking n vs len to ensure there's enough room. And, you're not examining/honoring loff_t.
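As an illustration of the n vs len check and of honoring the file offset, here is a user-space model of what a read callback could do. It is a sketch, not the driver's code: model_read is a hypothetical name, memcpy stands in for copy_to_user, and a plain long long stands in for loff_t. In the kernel, the helper simple_read_from_buffer() performs this kind of bounded, offset-aware copy.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* User-space model of a read callback (hypothetical name).
 * Formats `value` as text, then copies at most `len` bytes starting
 * at *off. Returns the number of bytes copied, 0 at end of data,
 * or -1 on error. memcpy stands in for copy_to_user. */
long model_read(uint32_t value, char *dst, size_t len, long long *off)
{
    char tmp[16];
    int n;
    size_t avail, count;

    assert(dst != NULL && off != NULL);

    /* snprintf bounds the formatted text to the temporary buffer. */
    n = snprintf(tmp, sizeof(tmp), "%u\n", (unsigned)value);
    if (n < 0 || (size_t)n >= sizeof(tmp))
        return -1;

    if (*off < 0)
        return -1;
    if (*off >= n)
        return 0;               /* everything already delivered */

    /* Never copy more than the caller's buffer can hold. */
    avail = (size_t)n - (size_t)*off;
    count = len < avail ? len : avail;
    memcpy(dst, tmp + *off, count);

    *off += (long long)count;   /* remember progress for the next call */
    return (long)count;
}
```

A caller with a 2-byte buffer reading the value 52 gets "52" on the first call, the trailing newline on the second, and 0 on the third.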

Both these functions are passed a struct file pointer. But, this value is ignored in favor of the global variables you've already set. As I mentioned in my top comments, using these globals is problematic. You should use the passed pointer to find the appropriate struct pointers and [ultimately] your private device struct simpmod_local.

Your dev_write should store the values from userspace into the private struct. The dev_read should get them from there.
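The per-device state idea can be sketched in user space as follows. Everything here is a mock: struct mock_file stands in for the kernel's struct file (which does have a private_data pointer), and the mock_* functions and struct mult_state are hypothetical names, not part of the linked driver. In a real driver the open callback would typically locate the device struct and store it in file->private_data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock of the one field of struct file that matters here. */
struct mock_file {
    void *private_data;
};

/* Hypothetical per-device state, replacing the global operands. */
struct mult_state {
    uint32_t operand_1;
    uint32_t operand_2;
};

/* open: attach the device's private state to this file handle. */
void mock_open(struct mock_file *f, struct mult_state *dev)
{
    assert(f != NULL && dev != NULL);
    f->private_data = dev;
}

/* write: store the operands in the state reached through the file. */
void mock_write(struct mock_file *f, uint32_t a, uint32_t b)
{
    struct mult_state *st = f->private_data;

    assert(st != NULL);
    st->operand_1 = a;
    st->operand_2 = b;
}

/* read: fetch an operand from the same private state. */
uint32_t mock_read_operand_1(const struct mock_file *f)
{
    const struct mult_state *st = f->private_data;

    assert(st != NULL);
    return st->operand_1;
}
```

With two device instances opened through two file handles, a write through one handle leaves the other device's state untouched, which a single set of globals cannot guarantee.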

Here's a total guess: Most designs I've seen use full AXI rather than AXI-Lite. I know nothing about what constitutes an "AXI thread ID", so I don't know what the implications of your access code bouncing between cores might be [if anything].

Using dev_write/dev_read as you're doing isn't atomic. I think, at present, you've got more fundamental issues. But, long term, I'd replace this with an ioctl call that takes a struct, such as:

struct mymult_user {
    u32 operand_1;
    u32 operand_2;
    u32 result;
};

The ioctl call does copy_from_user on this, sends these values to the H/W, gets back the result, and returns the result to the ioctl caller. Or, it can do a copy_to_user on the result field in the struct.
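Here is a user-space model of that single request/response call. It is only a sketch of the shape: model_mult_ioctl and model_hw_multiply are hypothetical names, the hardware is replaced by a software multiply, uint32_t stands in for the kernel's u32, and the copy_from_user/copy_to_user steps are omitted because there is no user/kernel boundary here. The 16-bit operand check is my assumption, based on the multiplier taking 16-bit inputs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the struct above, with uint32_t standing in for u32. */
struct mymult_user {
    uint32_t operand_1;
    uint32_t operand_2;
    uint32_t result;
};

/* Hypothetical software stand-in for the 16x16 -> 32 bit multiplier. */
uint32_t model_hw_multiply(uint16_t a, uint16_t b)
{
    return (uint32_t)a * (uint32_t)b;
}

/* Model of the ioctl body: validate, run the operation, fill in result.
 * Returns 0 on success, -1 if an operand does not fit in 16 bits. */
int model_mult_ioctl(struct mymult_user *req)
{
    assert(req != NULL);

    if (req->operand_1 > UINT16_MAX || req->operand_2 > UINT16_MAX)
        return -1;

    req->result = model_hw_multiply((uint16_t)req->operand_1,
                                    (uint16_t)req->operand_2);
    return 0;
}
```

Because operands and result travel in one call, a second caller cannot slip in between writing the operands and reading the result, provided the real ioctl body holds a lock around the hardware access.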

Overall, you're more likely to get a [useful] response on Xilinx's forum page [as it's frequented by people who do this stuff all the time].

UPDATE:

Something else I noticed.

The DTB entry specifies the AXI data width to be 0x20, i.e. 32. Is that 32 bytes?? It's autogenerated so it must be correct ;-) But, as a byte count, this seems excessive to me. The question reads it as 32 bits, and it may just be the width of the AXI data bus, so, maybe not an issue...

But, looking at the driver, the offsets from the base address don't seem to match up.

operand_1 is offset 0x10, operand_2 is offset 0x20, and the result is offset 0x30. So, what's at offset 0x0???

The width of the AXI bus and the width of the registers may not be strictly related.

One way to view this is that the offsets should be aligned to the bus width: 0x0, 0x20, 0x40.

But, ordinarily, I'd expect things to be more closely packed, (e.g.) offsets 0x0, 0x2, 0x4 respectively.

It might be less painful [less chance of memory/bus corruption] to just do ioread* while debugging. Since you're not writing to the address space, it's less likely to corrupt other memory cells and the system may stay alive [uncorrupted] longer. This would only give you whatever value was in the result reg initially.

Also, you could write the operands and then loop on ioread32 for offsets (e.g.) 0x0-0x40 and printk those values.
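To show what such a scan might reveal, here is a user-space model of one possible cause. It assumes (and this is only an assumption) that the generated slave decodes just the low 4 address bits, as xlnx,s00-axi-addr-width = <0x4> in the DTB entry might suggest, so that four 32-bit registers repeat every 0x10 bytes. All names are hypothetical; model_read32/model_write32 stand in for ioread32/iowrite32 and printf for printk.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define ADDR_BITS 4u                        /* from xlnx,s00-axi-addr-width */
#define NUM_REGS  ((1u << ADDR_BITS) / 4u)  /* four 32-bit registers */

/* Hypothetical model of the slave's register file. */
struct model_slave {
    uint32_t regs[NUM_REGS];
};

/* ASSUMPTION: only address bits [3:2] select a register;
 * all higher offset bits are ignored by the slave. */
unsigned model_index(uint32_t offset)
{
    return (offset & ((1u << ADDR_BITS) - 1u)) >> 2;
}

void model_write32(struct model_slave *s, uint32_t offset, uint32_t value)
{
    assert((offset & 3u) == 0);  /* 32-bit aligned accesses only */
    s->regs[model_index(offset)] = value;
}

uint32_t model_read32(const struct model_slave *s, uint32_t offset)
{
    assert((offset & 3u) == 0);
    return s->regs[model_index(offset)];
}

/* Counterpart of looping on ioread32 and printk in the driver. */
void model_dump(const struct model_slave *s, uint32_t first, uint32_t last)
{
    uint32_t off;

    for (off = first; off <= last; off += 4)
        printf("offset 0x%02x = %u\n", (unsigned)off,
               (unsigned)model_read32(s, off));
}
```

In this model, writing 52 at offset 0x0 makes offsets 0x10, 0x20, 0x30 and 0x40 read back 52, while 0x04, 0x08 and 0x0c stay 0. If the scan on the real hardware shows the same period of 0x10, partial address decoding is a plausible explanation; a different pattern would point elsewhere.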

Solution 1
1 2021-03-04 00:31:54