
Why is the use of an unrelated printf statement causing changes in my program output?

I'm stuck with a program where just having a printf statement is causing changes in the output.

I have an array of n elements. For every window of d consecutive elements, I compute the median; if the (d+1)th element is greater than or equal to twice that median, I increment the value of notifications. The complete problem statement can be found here.

This is my program:

#include <math.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define RANGE 200

float find_median(int *freq, int *ar, int i, int d) {
    int *count = (int *)calloc(sizeof(int), RANGE + 1);
    for (int j = 0; j <= RANGE; j++) {
        count[j] = freq[j];
    }
    for (int j = 1; j <= RANGE; j++) {
        count[j] += count[j - 1];
    }
    int *arr = (int *)malloc(sizeof(int) * d);
    float median;
    for (int j = i; j < i + d; j++) {
        int index = count[ar[j]] - 1;
        arr[index] = ar[j];
        count[ar[j]]--;
        if (index == d / 2) {
            if (d % 2 == 0) {
                median = (float)(arr[index] + arr[index - 1]) / 2;
            } else {
                median = arr[index];
            }
            break;
        }
    }
    free(count);
    free(arr);
    return median;
}

int main() {
    int n, d;
    scanf("%d %d", &n, &d);
    int *arr = malloc(sizeof(int) * n);
    for (int i = 0; i < n; i++) {
        scanf("%i", &arr[i]);
    }
    int *freq = (int *)calloc(sizeof(int), RANGE + 1);
    int notifications = 0;
    if (d < n) {
        for (int i = 0; i < d; i++)
            freq[arr[i]]++;
        for (int i = 0; i < n - d; i++) {
            float median = find_median(freq, arr, i, d);   /* Count sorts the arr elements in the range i to i+d-1 and returns the median */
            if (arr[i + d] >= 2 * median) {      /* If the (i+d)th element is  greater or equals to twice the median, increments notifications*/
                printf("X");
                notifications++;
            }
            freq[arr[i]]--;
            freq[arr[i + d]]++;
        }
    }
    printf("%d", notifications);
    return 0;
}

Now, for large inputs like this, the program outputs 936 as the value of notifications, whereas when I just remove the statement printf("X") the program outputs 1027 as the value of notifications. I'm really not able to understand what is causing this behavior in my program, and what I'm missing/overlooking.

Your program has undefined behavior here:

for (int j = 0; j <= RANGE; j++) {
    count[j] += count[j - 1];
}

You should start the loop at j = 1. As coded, the first iteration reads count[-1], which is memory before the beginning of the array count; this could cause a crash or produce an unpredictable value. Changing anything in the running environment can lead to different behavior. As a matter of fact, even changing nothing could.
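As a minimal sketch of the corrected loop, assuming the same RANGE of 200 as in the question: the helper names prefix_sums and cumulative_at are hypothetical and only serve to show the loop in isolation.

```c
#include <assert.h>
#include <stddef.h>

#define RANGE 200

/* Turn a frequency table into cumulative counts, in place.
   After the call, count[v] is the number of elements <= v. */
static void prefix_sums(int count[RANGE + 1]) {
    /* Start at 1: count[0] has no predecessor, so j = 0 would read count[-1]. */
    for (int j = 1; j <= RANGE; j++) {
        count[j] += count[j - 1];
    }
}

/* Hypothetical demonstration helper: build the frequency table for
   values[0..n-1] and return how many of them are <= v. */
static int cumulative_at(const int *values, size_t n, int v) {
    int count[RANGE + 1] = {0};
    assert(v >= 0 && v <= RANGE);
    for (size_t k = 0; k < n; k++) {
        assert(values[k] >= 0 && values[k] <= RANGE);
        count[values[k]]++;
    }
    prefix_sums(count);
    return count[v];
}
```

With the values 2, 3, 4, 2, 3, the cumulative count at 3 is 4, because four of the five elements are less than or equal to 3.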

The rest of the code is more difficult to follow at a quick glance, but given the computations on index values, there may be more problems there too.

For starters, you should add some consistency checks:

  • verify the return value of scanf() to ensure that every conversion succeeded.
  • verify the values read into arr; they must be in the range 0..RANGE.
  • verify that int index = count[ar[j]] - 1; never produces a negative number.
  • verify the same for count[ar[j]]--;
  • verify that median = (float)(arr[index] + arr[index - 1]) / 2; is never evaluated with index == 0.
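Some of these checks could be sketched as small helpers, for instance as below. The names value_in_range and checked_slot are hypothetical and not part of the original program; the upper bound on index is an extra assumption (the slot must fit in the d-element scratch array).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RANGE 200

/* Hypothetical helper: true if v can safely index freq[] or count[]. */
static bool value_in_range(int v) {
    return v >= 0 && v <= RANGE;
}

/* Hypothetical helper: the slot the counting sort would write for `value`,
   or -1 if the value or its cumulative count is inconsistent with a
   window of d elements. */
static int checked_slot(const int *count, int value, int d) {
    assert(count != NULL);
    if (!value_in_range(value)) {
        return -1;              /* would index outside count[] */
    }
    int index = count[value] - 1;
    if (index < 0 || index >= d) {
        return -1;              /* would write outside the d-element array */
    }
    return index;
}
```

Calling checked_slot before each arr[index] = ar[j]; and stopping on -1 turns a silent out-of-bounds access into a visible failure.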

Your program has undefined behavior (in several places). You really should be scared (and you are not scared enough).

I'm really not able to understand what is causing this behavior in my program

With undefined behavior (UB), that question is pointless. You need to dive into implementation details (e.g. study the generated machine code of your program, and the code of your C compiler and standard library) to understand anything more. You probably don't want to do that (it could take years of work).

Please read, as soon as possible, Lattner's blog post What Every C Programmer Should Know About Undefined Behavior.

what I'm missing/overlooking.

You don't understand UB well enough. Be aware that a programming language is a specification (which you code against), not a piece of software (e.g. your compiler). Program semantics are important.

As I said in comments:

  • compile with all warnings and debug info (gcc -Wall -Wextra -g with GCC)

  • improve your code to get no warnings; perhaps try also another compiler like Clang and work to also get no warnings from it (since different compilers give different warnings).

  • consider using some version control system like git to keep various variants of your code, and some build automation tool.

  • think more about your program and invariants inside it.

  • use the debugger (gdb), in particular with watchpoints, to understand the internal state of your process; and have several test cases to run both under the debugger and without it.

  • use instrumentation facilities such as the address sanitizer (-fsanitize=address with GCC) and tools like valgrind.

  • use rubber duck debugging methodology

  • sometimes consider static source code analysis tools (e.g. Frama-C). They require expertise to use, and/or give many false positives.

  • read more about programming (e.g. SICP) and about the C programming language. Download and study the C11 programming language specification n1570 (and be very careful about every mention of UB in it). Read carefully the documentation of every standard or external function you are using. Study also the documentation of your compiler and of other tools. Handle error and failure cases (e.g. calloc and scanf can fail).
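To illustrate the last point, here is a minimal sketch of handling those failure cases. The helpers make_freq_table and parse_sizes are hypothetical, and sscanf on a string stands in for scanf so that the example is easy to exercise; the same return-value check applies to scanf.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define RANGE 200

/* Hypothetical helper: allocate a zeroed frequency table.
   Returns NULL on failure, which the caller must check.
   Note the argument order: calloc takes the element count first,
   then the element size. */
static int *make_freq_table(void) {
    return calloc(RANGE + 1, sizeof(int));
}

/* Hypothetical helper: read n and d from text.
   Returns 0 on success, -1 if a conversion failed or a size is not positive. */
static int parse_sizes(const char *text, int *n, int *d) {
    assert(text != NULL && n != NULL && d != NULL);
    if (sscanf(text, "%d %d", n, d) != 2) {
        return -1;              /* fewer than two integers were converted */
    }
    if (*n <= 0 || *d <= 0) {
        return -1;              /* sizes that make no sense for this problem */
    }
    return 0;
}
```

In main, a failed parse_sizes or a NULL result from make_freq_table would typically be reported on stderr, followed by returning EXIT_FAILURE instead of continuing with indeterminate values.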

Debugging is difficult (e.g. because of the Halting Problem, of Heisenbugs, etc.) but sometimes fun and challenging. You can spend weeks finding one single bug. And you often cannot understand the behavior of a buggy program without diving into implementation details (studying the machine code generated by the compiler, studying the code of the compiler).

PS. Your question shows the wrong mindset (which you should improve) and a misunderstanding of UB.
