
segmentation fault when using a non NULL pointer

As the title says, I have run into a strange problem when using DPDK.

When I call rte_pktmbuf_alloc(struct rte_mempool *), having already verified that the return value of rte_pktmbuf_pool_create() is not NULL, the process receives a segmentation fault.

The following is the gdb output inside the DPDK source code:

Thread 1 "osw" received signal SIGSEGV, Segmentation fault.
0x00000000005e9f41 in __mempool_generic_get (cache=0x1a7dfc000000000, n=1, obj_table=0x7fffffffdec8, mp=0x101a7df00) at /root/dpdk-20.05/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1449
1449            if (unlikely(cache == NULL || n >= cache->size))
(gdb) p cache
$1 = (struct rte_mempool_cache *) 0x1a7dfc000000000
(gdb) bt
#0  0x00000000005e9f41 in __mempool_generic_get (cache=0x1a7dfc000000000, n=1, obj_table=0x7fffffffdeb8, mp=0x101a7df00)
    at /root/dpdk-20.05/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1449
#1  rte_mempool_generic_get (cache=0x1a7dfc000000000, n=1, obj_table=0x7fffffffdeb8, mp=0x101a7df00)
    at /root/dpdk-20.05/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1517
#2  rte_mempool_get_bulk (n=1, obj_table=0x7fffffffdeb8, mp=0x101a7df00)
    at /root/dpdk-20.05/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1552
#3  rte_mempool_get (obj_p=0x7fffffffdeb8, mp=0x101a7df00) at /root/dpdk-20.05/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1578
#4  rte_mbuf_raw_alloc (mp=0x101a7df00) at /root/dpdk-20.05/x86_64-native-linuxapp-gcc/include/rte_mbuf.h:586
#5  rte_pktmbuf_alloc (mp=0x101a7df00) at /root/dpdk-20.05/x86_64-native-linuxapp-gcc/include/rte_mbuf.h:896

I then dug into rte_mempool.h and changed lines 1449-1450 from

1449  if (unlikely(cache == NULL || n >= cache->size))

1450         goto ring_dequeue;

to

1449  if (unlikely(cache == NULL))

1450          goto ring_dequeue;

1451  if (unlikely(n >= cache->size))

1452          goto ring_dequeue;

With this change it still fails, now at line 1451.

The gdb output after the change:

Thread 1 "osw" received signal SIGSEGV, Segmentation fault.
__mempool_generic_get (cache=0x1a7dfc000000000, n=1, obj_table=0x7fffffffdeb8, mp=0x101a7df00)
    at /root/dpdk-20.05/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1451
1451            if (unlikely(n >= cache->size))
(gdb) p cache
$1 = (struct rte_mempool_cache *) 0x1a7dfc000000000
(gdb) bt
#0  __mempool_generic_get (cache=0x1a7dfc000000000, n=1, obj_table=0x7fffffffdeb8, mp=0x101a7df00)
    at /root/dpdk-20.05/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1451
#1  rte_mempool_generic_get (cache=0x1a7dfc000000000, n=1, obj_table=0x7fffffffdeb8, mp=0x101a7df00)
    at /root/dpdk-20.05/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1519
#2  rte_mempool_get_bulk (n=1, obj_table=0x7fffffffdeb8, mp=0x101a7df00)
    at /root/dpdk-20.05/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1554
#3  rte_mempool_get (obj_p=0x7fffffffdeb8, mp=0x101a7df00) at /root/dpdk-20.05/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1580
#4  rte_mbuf_raw_alloc (mp=0x101a7df00) at /root/dpdk-20.05/x86_64-native-linuxapp-gcc/include/rte_mbuf.h:586
#5  rte_pktmbuf_alloc (mp=0x101a7df00) at /root/dpdk-20.05/x86_64-native-linuxapp-gcc/include/rte_mbuf.h:896
#6  main (argc=<optimized out>, argv=<optimized out>) at ofpd.c:150
(gdb) p cache->size
Cannot access memory at address 0x1a7dfc000000000

So the address stored in the "cache" pointer is not NULL, yet it does not point to accessible memory, and dereferencing it fails just as a NULL pointer would.

I have no idea why the "cache" pointer is non-zero in its upper 4 bytes and zero in its lower 4 bytes.

The DPDK version is 20.05; I also tried 18.11 and 19.11.

The OS is CentOS 8.1 with kernel 4.18.0-147.el8.x86_64.

The CPU is an AMD EPYC 7401P.

#define                 RING_SIZE       16384
#define                 NUM_MBUFS       8191
#define                 MBUF_CACHE_SIZE 512

int main(int argc, char **argv)
{
    int             ret;
    uint16_t        portid;
    unsigned        cpu_id = 1;
    struct rte_mempool  *tmp;

    int arg = rte_eal_init(argc, argv);
    if (arg < 0) 
        rte_exit(EXIT_FAILURE, "Cannot init EAL: %s\n", rte_strerror(rte_errno));
    if (rte_lcore_count() < 10)
        rte_exit(EXIT_FAILURE, "We need at least 10 cores.\n");
    argc -= arg;
    argv += arg;

    /* Creates a new mempool in memory to hold the mbufs. */
    tmp = rte_pktmbuf_pool_create("TMP", NUM_MBUFS, MBUF_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
    if (tmp == NULL)
        rte_exit(EXIT_FAILURE, "Cannot create mbuf pool, %s\n", rte_strerror(rte_errno));
    printf("tmp addr = %x\n", tmp);
    struct rte_mbuf *test = rte_pktmbuf_alloc(tmp);
    rte_exit(EXIT_FAILURE, "end\n");
}

I have faced the same problem before when using the pointer returned by getifaddrs(): it also caused a segmentation fault, and I had to shift the pointer value like this:

ifa->ifa_addr = (struct sockaddr *)((uintptr_t)(ifa->ifa_addr) >> 32);

after which it worked normally.

Therefore, I think this is not a DPDK-specific issue.

Does anyone know what causes this?

Thanks.

I was able to run this without any error after modifying your code to:

  1. include the headers
  2. remove the unused variables
  3. check whether the value returned by the alloc call is NULL

Tested on:

CPU: Intel(R) Xeon(R) CPU E5-2699
OS: 4.15.0-101-generic
GCC: 7.5.0
DPDK version: 19.11.2, dpdk mainline
Library mode: static

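code:

/* The headers and constants below are not shown in the original answer.
 * They are what this snippet needs to compile; the constants are taken
 * from the question. */
#include <stdio.h>
#include <stdlib.h>

#include <rte_eal.h>
#include <rte_errno.h>
#include <rte_lcore.h>
#include <rte_mbuf.h>

#define NUM_MBUFS       8191
#define MBUF_CACHE_SIZE 512

int main(int argc, char **argv)
{
    int             ret = 0;
    struct rte_mempool  *tmp;

    int arg = rte_eal_init(argc, argv);
    if (arg < 0)
        rte_exit(EXIT_FAILURE, "Cannot init EAL: %s\n", rte_strerror(rte_errno));
    if (rte_lcore_count() < 10)
        rte_exit(EXIT_FAILURE, "We need at least 10 cores.\n");
    argc -= arg;
    argv += arg;

    /* Creates a new mempool in memory to hold the mbufs. */
    tmp = rte_pktmbuf_pool_create("TMP", NUM_MBUFS, MBUF_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
    if (tmp == NULL)
        rte_exit(EXIT_FAILURE, "Cannot create mbuf pool, %s\n", rte_strerror(rte_errno));
    printf("tmp addr = %p\n", (void *)tmp);
    struct rte_mbuf *test = rte_pktmbuf_alloc(tmp);

    /* Exit with an error if the allocation failed. */
    if (test == NULL)
        rte_exit(EXIT_FAILURE, "end\n");

    return ret;
}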

[EDIT-1] Based on the comment from Brydon Gibson

Note:

  1. As I do not have access to your codebase or a working code snippet, my suggestion is to look up an example such as DPDK/examples/l2fwd or DPDK/examples/skeleton and copy its headers for your compilation.
  2. I assume the author, THE, and Brydon are different individuals who might be facing a similar issue on different code bases.
  3. The current question claims that the error is reproduced with the code snippet on DPDK versions 20.05, 18.11 and 19.11.
  4. The current answer clearly states that the same code snippet works when the library is linked statically.

I have asked @BrydonGibson to open a ticket with the relevant information and environment details, as they might be different.
