Program behaves differently when running with Valgrind

I have a problem debugging my application with Valgrind (memcheck): it behaves differently than when I run it standalone. At one point the code looks up the network interface, and its MAC address, that belongs to a given IP address. It does this by iterating over all existing interfaces, first calling ioctl(sock, SIOCGIFADDR, &ifr) to get the IP address and, if that matches the one I am looking for, calling ioctl(sock, SIOCGIFHWADDR, &ifr) to read out the MAC. Without Valgrind this works; with Valgrind the second call returns an empty address. Does anyone have an idea what could cause this?

Here is a list of all messages that Valgrind outputs (without details):

==3120== Syscall param mq_notify(notification) points to uninitialised byte(s)
==3120== Syscall param mq_notify(notification) points to uninitialised byte(s)
==3120== Syscall param mq_notify(notification) points to uninitialised byte(s)
==3120== Syscall param mq_timedsend(msg_ptr) points to uninitialised byte(s)
==3120== Syscall param mq_timedsend(msg_ptr) points to uninitialised byte(s)
==3120== Syscall param mq_timedsend(msg_ptr) points to uninitialised byte(s)
==3120== Syscall param mq_timedsend(msg_ptr) points to uninitialised byte(s)
==3120== Syscall param mq_timedsend(msg_ptr) points to uninitialised byte(s)
==3120== Syscall param timer_settime64(value) points to uninitialised byte(s)
==3120== Syscall param timer_settime64(value) points to uninitialised byte(s)
==3120== Syscall param mq_notify(notification) points to uninitialised byte(s)

Here is the backtrace of the first message:

==1930== For lists of detected and suppressed errors, rerun with: -s
==1930== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==3120== Syscall param mq_notify(notification) points to uninitialised byte(s)
==3120==    at 0xFCD3A1C: mq_notify (mq_notify.c:271)
==3120==    by 0x1016F6B7: NotificationReceiver<PisType::PisOperation>::registerInternalCallback(void (*)(sigval)) (NotificationReceiver.tpp:159)
==3120==    by 0x10170B93: NotificationReceiver<PisType::PisOperation>::setupMsgQueue(communication::CommunicationLink) (NotificationReceiver.tpp:99)
==3120==    by 0x10170CDF: NotificationReceiver<PisType::PisOperation>::NotificationReceiver(INotification::Connection) (NotificationReceiver.tpp:48)
==3120==    by 0x1016CD77: PisClientServer::createOperationReceiver() (PisClientServer.cpp:534)
==3120==    by 0x1016DA27: PisClientServer::create() (PisClientServer.cpp:419)
==3120==    by 0x1016DB67: PisClientServer::executeOperation(PisType::PisOperation) (PisClientServer.cpp:84)
==3120==    by 0x101CAFEB: StackHandler::create() (StackHandler.cpp:81)
==3120==    by 0x100A7B0B: main (stackhandler.cxx:130)
==3120==  Address 0x9ebeb5c4 is on thread 1's stack
==3120==  in frame #0, created by mq_notify (mq_notify.c:222)
==3120==  Uninitialised value was created by a stack allocation
==3120==    at 0xFCD38F0: mq_notify (mq_notify.c:222)

And the related code:

template <typename T>
void NotificationReceiver<T>::registerInternalCallback(
        internal_callback_t intCallback) {
    intCallbackParam_.commLink = commLink_;
    intCallbackParam_.msgQueue = getMsgQueue(commLink_);
    intCallbackParam_.msgQueueDescriptor = msgQueueDescriptor_;

    memset(&(intCallbackParam_.signalEvent), 0, sizeof(sigevent));
    intCallbackParam_.signalEvent.sigev_notify = SIGEV_THREAD;
    intCallbackParam_.signalEvent.sigev_value.sival_ptr = &intCallbackParam_;
    intCallbackParam_.signalEvent.sigev_notify_function = intCallback;
    intCallbackParam_.signalEvent.sigev_notify_attributes = NULL;

    mq_notify(msgQueueDescriptor_, &intCallbackParam_.signalEvent);
}

The same for the timedsend message:

==3120== Syscall param mq_timedsend(msg_ptr) points to uninitialised byte(s)
==3120==    at 0xFCD3B9C: __mq_timedsend (mq_timedsend.c:28)
==3120==    by 0xFCD3B9C: mq_timedsend (mq_timedsend.c:25)
==3120==    by 0x1016F1DB: NotificationSender<PisType::PisOperationStatus>::notify(PisType::PisOperationStatus const*) (NotificationSender.tpp:72)
==3120==    by 0x1016F38F: NotificationSender<PisType::PisOperationStatus>::notifyIfReady(PisType::PisOperationStatus const*, LogFile*) (NotificationSender.tpp:99)
==3120==    by 0x1016E8FF: PisClientServer::sendOperationStatus(INotification::Connection, PisType::PisOperationStatus) (PisClientServer.cpp:352)
==3120==    by 0x1016EB17: PisClientServer::callbackStackHandler(PisType::PisOperation) (PisClientServer.cpp:567)
==3120==    by 0x1017015B: NotificationReceiver<PisType::PisOperation>::internalCallback(sigval) (NotificationReceiver.tpp:209)
==3120==    by 0xFCD370B: notification_function (mq_notify.c:105)
==3120==    by 0xFD7851B: start_thread (pthread_create.c:477)
==3120==    by 0x4354577: clone (clone.S:78)
==3120==  Address 0x5450999 is on thread 3's stack
==3120==  in frame #1, created by NotificationSender<PisType::PisOperationStatus>::notify(PisType::PisOperationStatus const*) (NotificationSender.tpp:54)
==3120==  Uninitialised value was created by a stack allocation
==3120==    at 0x1016EA68: PisClientServer::callbackStackHandler(PisType::PisOperation) (PisClientServer.cpp:551)

template <typename T>
bool NotificationSender<T>::notify(const T* data) {
    // create empty message buffer
    char msgBuffer[MQ_MSGSIZE_BYTES];
    memset(msgBuffer, 0, MQ_MSGSIZE_BYTES);

    // make sure the data size doesn't exceed the message buffer size
    int dataSizeBytes = std::min(static_cast<int>(sizeof(T)), MQ_MSGSIZE_BYTES);

    // copy the data into the buffer
    memcpy(msgBuffer, data, dataSizeBytes);

    // set a timeout so that sending a message doesn't block the sender
    // indefinitely
    struct timespec timeout;
    timeout.tv_sec = time(NULL) + MQ_SEND_TIMEOUT_SEC;
    timeout.tv_nsec = 0;

    // send message
    if (mq_timedsend(msgQueueDescriptor_, msgBuffer, dataSizeBytes, 0,
                     &timeout) != 0) {
        return false;
    }

    numMessages_++;

    return true;
}

And for settime:

==3120== Syscall param timer_settime64(value) points to uninitialised byte(s)
==3120==    at 0xFCD2D60: __timer_settime64 (timer_settime.c:41)
==3120==    by 0xFCD2F47: timer_settime (timer_settime.c:81)
==3120==    by 0x102D3AAB: OS_Start_Timer (Timer.c:119)
==3120==    by 0x10306A1B: GOOSESubscriber_Enable (GOOSESubscriber.c:915)
==3120==    by 0x10268B53: IEC61850_Start (IEC61850API.c:942)
==3120==    by 0x1016A23B: PisClientServer::start() (PisClientServer.cpp:382)
==3120==    by 0x10178AC3: PisServer::start() (PisServer.cpp:235)
==3120==    by 0x1016DC4F: PisClientServer::executeOperation(PisType::PisOperation) (PisClientServer.cpp:94)
==3120==    by 0x1016EADB: PisClientServer::callbackStackHandler(PisType::PisOperation) (PisClientServer.cpp:563)
==3120==    by 0x1017015B: NotificationReceiver<PisType::PisOperation>::internalCallback(sigval) (NotificationReceiver.tpp:209)
==3120==    by 0xFCD370B: notification_function (mq_notify.c:105)
==3120==    by 0xFD7851B: start_thread (pthread_create.c:477)
==3120==  Address 0x5451170 is on thread 3's stack
==3120==  in frame #1, created by timer_settime (timer_settime.c:74)
==3120==  Uninitialised value was created by a stack allocation
==3120==    at 0xFCD2E94: timer_settime (timer_settime.c:74)

struct sigevent SignalEvent;
SignalEvent.sigev_notify = SIGEV_THREAD;
SignalEvent.sigev_notify_function = vTimeUp;
SignalEvent.sigev_value.sival_ptr = ptTimer;
SignalEvent.sigev_notify_attributes = NULL;

if(timer_create(CLOCK_MONOTONIC, &SignalEvent, &(ptTimer->tTimerID)) != 0)
{
    iReturnErrorCode = TIMER_ERROR_OS_FAILED;
}
else
{
    struct itimerspec NextTime;
    NextTime.it_value.tv_sec = u32TimeOut / 1000;
    NextTime.it_value.tv_nsec = (u32TimeOut % 1000) * 1000000;
    if(ptTimer->eType == OSTIMER_TYPE_ONESHOT)
    {
        NextTime.it_interval.tv_sec = 0;
        NextTime.it_interval.tv_nsec = 0;
    }
    else
    {
        NextTime.it_interval.tv_sec = NextTime.it_value.tv_sec;
        NextTime.it_interval.tv_nsec = NextTime.it_value.tv_nsec;
    }
    
    if(timer_settime(ptTimer->tTimerID, 0, &NextTime, NULL) != 0)
    {
        iReturnErrorCode = TIMER_ERROR_OS_FAILED;
    }
}

System/Version:

  • Embedded PowerPC P2020
  • Valgrind 3.17
  • Yocto-built Linux with a 5.10 kernel
  • powerpc-poky-linux-gcc 9.3.0

Version 3.17 is the last one for which there is a recipe compatible with my Yocto version. I have not yet managed to build Valgrind from the sources, but I will open a separate ticket for that.

On my first attempts, there were several messages about missing syscall wrappers (these are also missing in the latest version). So I copied the following lines from syswrap-ppc64-linux.c to syswrap-ppc32-linux.c:

LINXY(__NR_prlimit64, sys_prlimit64), // 325
LINXY(__NR_getsockopt, sys_getsockopt), // 340
LINXY(__NR_recvmsg, sys_recvmsg), // 342

Is this sufficient, or do I need to do more here?

I extracted the affected piece of code and tested it separately:

#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>      // close()
#include <sys/socket.h>  // socket()
#include <sys/ioctl.h>
#include <net/if.h>
#include <netinet/in.h>
#include <net/if_arp.h>
#include <arpa/inet.h>

int main()
{
    int iReturned = 0;
    struct ifreq ifr;
    struct ifconf ifc;
    char buf[1024];  // buffer for address 1024 should be large enough

    int sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_IP); //Get a datagram socket
    if (sock != -1)
    {
        ifc.ifc_len = sizeof(buf);
        ifc.ifc_buf = buf;
        if (ioctl(sock, SIOCGIFCONF, &ifc) != -1) //Get the IoConfiguration (an array of each adapter)
        {
            struct ifreq* it = ifc.ifc_req;
            const struct ifreq* const end = it + (ifc.ifc_len / sizeof(struct ifreq));  //Find the last point in the array

            for (; it != end; ++it) //Loop through each adapter
            {
                strcpy(ifr.ifr_name, it->ifr_name); //copy the adapter name into our local ifreq structure
                printf("Adapter name: %s \n", ifr.ifr_name );

                if (ioctl(sock,SIOCGIFADDR,&ifr)==-1) {
                    printf("%s\n",strerror(errno));
                    continue; // skip this adapter; the socket stays open for the next one
                }
                struct sockaddr_in* ipaddr = (struct sockaddr_in*)&ifr.ifr_addr;
                printf("IP address: %s\n",inet_ntoa(ipaddr->sin_addr));

                if (ioctl(sock,SIOCGIFHWADDR,&ifr)==-1) {
                    printf("%s\n",strerror(errno));
                    continue; // skip this adapter; the socket stays open for the next one
                }

                const unsigned char* mac=(unsigned char*)ifr.ifr_hwaddr.sa_data;
                printf("%02X:%02X:%02X:%02X:%02X:%02X\n",
                    mac[0],mac[1],mac[2],mac[3],mac[4],mac[5]);
            }
        }
        else
        {
            iReturned = -1;
            /* handle error */
        }

        close(sock); /*Close the opened socket*/
    }
    else
    {
        iReturned = -1;
        /* handle error */
    }

    return 0;
}

Interestingly, in this case the MAC lookup also works under Valgrind:

valgrind /MacLookup 
==4099== Memcheck, a memory error detector
==4099== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==4099== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info
==4099== Command: /MacLookup
==4099== 
Adapter name: lo 
IP address: 127.0.0.1
00:00:00:00:00:00
Adapter name: eth0 
IP address: 192.168.0.3
00:D0:93:51:A3:1B
==4099== 
==4099== HEAP SUMMARY:
==4099==     in use at exit: 0 bytes in 0 blocks
==4099==   total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==4099== 
==4099== All heap blocks were freed -- no leaks are possible
==4099== 
==4099== For lists of detected and suppressed errors, rerun with: -s
==4099== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

I assume this indicates that at least one of the Valgrind findings is genuine.

You need to determine whether the reported errors are genuine. If there are real errors, then it's quite plausible that the behaviour under Valgrind would be different.

I see that sigevent_t has a _pad member. I don't know if that gets used internally, but memcheck might be complaining about it if it is not initialized.

For the timedsend error, are all msg_len bytes of the contents of msg_ptr initialized?

And finally, I see that timespec64 also has a pad field that could be causing the timer_settime64 message.

My feeling is that the pad-related errors are false positives and that Valgrind needs to be improved to avoid them.
