Fatal exception in interrupt in network driver polling

I am developing a custom network device driver for my project. Unlike Ethernet drivers, it has no interrupt for receiving packets (a design limitation), so the receive path of the driver uses polling. I implemented the polling with tasklets in Linux (borrowed mostly from the jit.c example of LDD3): the tasklet reschedules itself and calls the actual receive function only once every 10000 runs (a somewhat arbitrary number), which puts a delay between two consecutive polls. It works fine, but I decided to switch to a timer-based implementation to avoid the extra overhead. I tried hrtimers, workqueues, and a timer that schedules a tasklet, but all of them hit this error:

Kernel panic - not syncing: Fatal exception in interrupt

in the eth_type_trans function. The details of the panic are as follows:

[ 5031.345599] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[ 5031.346090] pgd = ffffffc07cf7f000
[ 5031.346471] [00000000] *pgd=0000000000000000
[ 5031.346988] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[ 5031.347383] Modules linked in: alex_mcn(O)
[ 5031.348144] CPU: 0 PID: 601 Comm: systemd-journal Tainted: G           O  3.16.0-rc6 #1
[ 5031.348744] task: ffffffc07cff5c40 ti: ffffffc07cf64000 task.ti: ffffffc07cf64000
[ 5031.349303] PC is at eth_type_trans+0x5c/0x164
[ 5031.349913] LR is at polling_tasklet_fn+0x84/0x144 [alex_mcn]

and then it gives me the stack trace:

[ 5031.406316] Call trace:  
[ 5031.406798] [<ffffffc00045d604>] eth_type_trans+0x5c/0x164 
[ 5031.407482] [<ffffffbffc000310>] polling_tasklet_fn+0x80/0x144 [alex_mcn] 
[ 5031.408114] [<ffffffc000099198>] tasklet_hi_action+0xc4/0x198
[ 5031.408716] [<ffffffc0000995bc>] __do_softirq+0x10c/0x220 
[ 5031.409304] [<ffffffc00009993c>] irq_exit+0x8c/0xc0 
[ 5031.409882] [<ffffffc000084514>] handle_IRQ+0x6c/0xe0 
[ 5031.410440] [<ffffffc000081290>] gic_handle_irq+0x3c/0x80

My initial code that works is:

static int alex_mcn_single_rx(void)
{
  struct sk_buff *skb;
  ...
  skb = netdev_alloc_skb(net_dev, pktLen + 5);
  ...
  /* sets skb->dev and pulls the Ethernet header; net_dev must be valid here */
  skb->protocol = eth_type_trans(skb, net_dev);
  skb->ip_summed = CHECKSUM_UNNECESSARY; /* don't check it */
  if (netif_rx(skb) == NET_RX_SUCCESS) {
    net_dev->stats.rx_packets++;
  } else {
    printk(KERN_ERR "!!! Failure in receiving the packet\n");
    return 1;
  }

  return 0;
}

static void polling_tasklet_fn(unsigned long arg)
{
  polling_data->count++;
  if (polling_data->loops) {
    /* poll the device only once every 10000 tasklet runs */
    if (polling_data->count % 10000 == 0) {
      alex_mcn_single_rx();
    }
  }
  /* re-queue the tasklet so that polling continues */
  tasklet_schedule(&polling_data->tlet);
}

static void init_polling_tasklet(char *buf)
{
  polling_data->count = 0;
  polling_data->loops = 1;

  /* register the tasklet */
  tasklet_init(&polling_data->tlet, polling_tasklet_fn, 0);
  tasklet_hi_schedule(&polling_data->tlet);
}

This code works, but when I remove the if(polling_data->loops) check it stops working and gives the same error as above. That makes no sense to me since, as far as I know, there is no race condition in tasklets. I also know that eth_type_trans is the only culprit: the error disappears when I remove the call (though the packet is then dropped). I would appreciate any clue as to why this is happening.

P.S. I am using the gem5 simulator with the ARMv8 architecture to test my design.

Solved: I ended up copying the eth_type_trans() function into my device driver and debugging the problem with printk calls. This was easier than rebuilding the kernel, which takes a long time on the simulator. The function started working properly once I had copied it into my driver and was debugging it there.

How do you obtain net_dev? The elided parts (...) hide that. I presume netdev_alloc_skb() will not crash when net_dev is NULL since it doesn't dereference it, but eth_type_trans() needs a valid net_dev pointer. Do you have a valid net_dev at the time eth_type_trans() is called?
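To make the point about the pointer concrete, here is a rough userspace model of what eth_type_trans() reads. It is not the kernel's code: the model_* types and model_eth_type_trans() are hypothetical stand-ins, the handling of 802.3 length fields is left out, and the ethertype is returned in host byte order for simplicity. The real function performs none of these checks; the model only marks the places where it would dereference the device, the device address, and the first 14 bytes of packet data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_ETH_ALEN 6   /* bytes in a MAC address */
#define MODEL_ETH_HLEN 14  /* bytes in an Ethernet header */

/* Hypothetical stand-in for the parts of struct net_device that are read. */
struct model_dev {
    const uint8_t *dev_addr;            /* device MAC; NULL if never set up */
    uint8_t broadcast[MODEL_ETH_ALEN];  /* broadcast address */
};

/* Hypothetical stand-in for struct sk_buff. */
struct model_skb {
    const uint8_t *data;
    size_t len;
    const struct model_dev *dev;
};

struct model_ethhdr {
    uint8_t dest[MODEL_ETH_ALEN];
    uint8_t source[MODEL_ETH_ALEN];
    uint8_t proto[2];                   /* ethertype, big-endian on the wire */
};

enum model_pkt_type {
    MODEL_HOST, MODEL_BROADCAST, MODEL_MULTICAST, MODEL_OTHERHOST
};

/*
 * Returns the ethertype, or -1 where the real function would read
 * through an invalid pointer or past the end of the packet data.
 */
static long model_eth_type_trans(struct model_skb *skb,
                                 const struct model_dev *dev,
                                 enum model_pkt_type *type)
{
    const struct model_ethhdr *eth;

    assert(sizeof(struct model_ethhdr) == MODEL_ETH_HLEN);

    if (skb == NULL || type == NULL)
        return -1;
    if (skb->data == NULL || skb->len < MODEL_ETH_HLEN)
        return -1;                      /* header bytes must be present */
    if (dev == NULL || dev->dev_addr == NULL)
        return -1;                      /* device and its address are read */

    skb->dev = dev;
    eth = (const struct model_ethhdr *)skb->data;

    if (eth->dest[0] & 0x01) {          /* group (multicast) bit set */
        if (memcmp(eth->dest, dev->broadcast, MODEL_ETH_ALEN) == 0)
            *type = MODEL_BROADCAST;
        else
            *type = MODEL_MULTICAST;
    } else if (memcmp(eth->dest, dev->dev_addr, MODEL_ETH_ALEN) != 0) {
        *type = MODEL_OTHERHOST;
    } else {
        *type = MODEL_HOST;
    }

    /* Pull the header, as the real function does. */
    skb->data += MODEL_ETH_HLEN;
    skb->len -= MODEL_ETH_HLEN;

    return ((long)eth->proto[0] << 8) | eth->proto[1];
}
```

In the driver, the equivalent sanity checks (net_dev is set, the device address is set, and the skb really holds at least a full Ethernet header) would go right before the eth_type_trans() call. A printk of those values there should show quickly which one is bad.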
