Clang 13 -O2 produces weird output while gcc does not

Can someone explain why the following code is optimized strangely by clang 13 with the -O2 flag? With lower optimization settings in clang, and with every optimization setting in gcc, I get the expected output of "John: 5". With clang at -O2 or higher, however, I get ": 5". Does my code have undefined behavior that I am not aware of? Strangely enough, if I compile with -fsanitize=undefined, the code works as expected. How should I even go about diagnosing an issue like this? Any help is greatly appreciated.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

typedef size_t usize;

typedef struct String {
    char *s;
    usize len;
} String;

String string_new(void) {
    String string;
    char *temp = malloc(1);
    if (temp == NULL) {
        printf("Failed to allocate memory in \"string_new()\".\n");
        exit(-1);
    }
    string.s = temp;
    string.s[0] = 0;
    string.len = 1;
    return string;
}

String string_from(char *s) {
    String string = string_new();
    string.s = s;
    string.len = strlen(s);
    return string;
}

void string_push_char(String *self, char c) {
    self->len = self->len + 1;
    char *temp = realloc(self->s, self->len);
    if (temp == NULL) {
        printf("Failed to allocate memory in \"string_push_char()\".\n");
        exit(-1);
    }
    self->s[self->len - 2] = c;
    self->s[self->len - 1] = 0;
}

void string_free(String *self) {
    free(self->s);
}

int main(void) {
    String name = string_new();
    string_push_char(&name, 'J');
    string_push_char(&name, 'o');
    string_push_char(&name, 'h');
    string_push_char(&name, 'n');

    printf("%s: %lu\n", name.s, name.len);

    string_free(&name);

    return 0;
}

Your string_push_char calls realloc but then continues to use the old pointer. This will usually go well if reallocation happens in place, but of course it's undefined behavior if the memory block gets moved.

However, Clang has a (controversial) optimization where it assumes that the pointer passed to realloc always becomes invalid, because you're supposed to use the returned pointer instead.

The solution is to assign temp back to self->s after the null check.
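
A minimal sketch of that fix, keeping the struct and function names from the question. The substantive change is the line that stores temp back into self->s; the assert, the use of stderr and EXIT_FAILURE, and resetting the fields in string_free are additions made for this illustration rather than part of the original code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef size_t usize;

typedef struct String {
    char *s;
    usize len; /* length including the terminating null, as in the question */
} String;

String string_new(void) {
    String string;
    char *temp = malloc(1);
    if (temp == NULL) {
        fprintf(stderr, "Failed to allocate memory in \"string_new()\".\n");
        exit(EXIT_FAILURE);
    }
    string.s = temp;
    string.s[0] = 0;
    string.len = 1;
    return string;
}

void string_push_char(String *self, char c) {
    assert(self != NULL && self->s != NULL); /* illustrative precondition */
    char *temp = realloc(self->s, self->len + 1);
    if (temp == NULL) {
        fprintf(stderr, "Failed to allocate memory in \"string_push_char()\".\n");
        exit(EXIT_FAILURE);
    }
    /* The old pointer is invalid from here on; only the returned one may be used. */
    self->s = temp;
    self->len = self->len + 1;
    self->s[self->len - 2] = c;
    self->s[self->len - 1] = 0;
}

void string_free(String *self) {
    free(self->s);
    self->s = NULL; /* guards against accidental reuse after free */
    self->len = 0;
}
```

With this version, the main from the question should print "John: 5" regardless of optimization level. Since name.len is a size_t, %zu is the matching printf conversion rather than %lu.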

As a side note, your string_from is so completely broken that you should remove it and rethink it from scratch.

I would do it a bit differently.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef size_t usize;

typedef struct String 
{
    usize len;
    char str[];
} String;


String *string_from(char *s) 
{
    usize size = strlen(s);
    String *string = malloc(sizeof(*string) + size + 1);
    if(string)
    {
        string -> len = size + 1; //including null character
        strcpy(string -> str, s);
    }
    return string;
}

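String *string_push_char(String *self, char c) {
    usize len = self ? self->len : 1; /* len counts the terminating null */

    /* the allocation must cover the struct header as well as the characters */
    self = realloc(self, sizeof(*self) + len + 1);
    if(self)
    {
        self -> len = len + 1;
        self -> str[self -> len - 2] = c; 
        self -> str[self -> len - 1] = 0; 
    }
    return self;
}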

void string_free(String *self) {
    free(self);
}

int main(void) {
    String *str = NULL;
    /* add some allocation checks same as with realloc function (temp pointer etc) */
    str = string_push_char(str, 'J');
    str = string_push_char(str, 'o');
    str = string_push_char(str, 'h');
    str = string_push_char(str, 'n');

    printf("%s: %zu\n", str -> str, str -> len);

    string_free(str);

    return 0;
}

https://godbolt.org/z/4ardvGcxa

Your code has plenty of issues:

String string_from(char *s) {
    String string = string_new();
    string.s = s;
    string.len = strlen(s);
    return string;
}

This function immediately leaks the block allocated by string_new(), and it stores in the struct a pointer to memory that was very likely not obtained from malloc (and is possibly not modifiable), which you may later try to realloc.
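
One way to avoid both problems, sketched here for the by-value String struct from the question: allocate a fresh block and copy the characters into it, so that the struct always owns memory that came from malloc. The const qualifier on the parameter, the assert, and the choice to count the terminating null in len (to match string_new) are assumptions of this sketch, not part of the original code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef size_t usize;

typedef struct String {
    char *s;
    usize len; /* length including the terminating null */
} String;

String string_from(const char *s) {
    assert(s != NULL); /* illustrative precondition */
    String string;
    usize size = strlen(s) + 1; /* room for the terminating null */
    char *temp = malloc(size);
    if (temp == NULL) {
        fprintf(stderr, "Failed to allocate memory in \"string_from()\".\n");
        exit(EXIT_FAILURE);
    }
    memcpy(temp, s, size); /* copies the terminating null as well */
    string.s = temp;
    string.len = size;
    return string;
}
```

Because the struct now holds its own copy, passing a string literal is safe, and the result can later be handed to realloc or free.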

In addition to the answer by @Sebastian Redl, I can add that the code has undefined behavior as per C17 7.22.3.5:

The realloc function deallocates the old object pointed to by ptr and returns a pointer to a new object that has the size specified by size.

This is one of the things that were poorly specified in C90 and silently clarified in C99. From the C99 rationale V5.10 7.20.3.4:

A new feature of C99: the realloc function was changed to make it clear that the pointed-to object is deallocated, a new object is allocated, and the content of the new object is the same as that of the old object up to the lesser of the two sizes. C89 attempted to specify that the new object was the same object as the old object but might have a different address. This conflicts with other parts of the Standard that assume that the address of an object is constant during its lifetime. Also, implementations that support an actual allocation when the size is zero do not necessarily return a null pointer for this case. C89 appeared to require a null return value, and the Committee felt that this was too restrictive.

Notably clang -O3 -std=c90 -pedantic-errors still crashes, so this code never worked in clang with any C version.
