
C++ substring from string

I'm pretty new to C++ and I need to create a MyString class with a method that creates a new MyString object from a substring of another one. However, the chosen substring changes while the object is being created and again when I print it with my method.

Here is my code:

#include <iostream>
#include <cstring>

using namespace std;

class MyString {
public:
    char* str;

    MyString(char* str2create){
        str = str2create;
    }

    MyString Substr(int index2start, int length) {
        char substr[length];
        int i = 0;
        while(i < length) {
            substr[i] = str[index2start + i];
            i++;
        }
        cout<<substr<<endl; // prints normal string
        return MyString(substr);
    }

    void Print() {
        cout<<str<<endl;
    }
};

int main() {
    char str[] = {"hi, I'm a string"};
    MyString myStr = MyString(str);
    myStr.Print();

    MyString myStr1 = myStr.Substr(10, 7);
    cout<<myStr1.str<<endl;
    cout<<"here is the substring I've done:"<<endl;
    myStr1.Print();

    return 0;
}

And here is the output:

hi, I'm a string

string

stri

here is the substring I've done:

♦

Your function Substr indirectly returns the address of the local variable substr: the MyString object it returns stores a pointer to that array. It's invalid to dereference a pointer to a local variable once that variable has gone out of scope.

I suggest you decide whether your class wraps an external string, or owns its own string data, in which case you will need to copy the input string to a member buffer.

Have to walk this through to explain what's going wrong properly so bear with me.

int main() {
    char str[] = {"hi, I'm a string"};

Allocates a temporary array of 17 characters (16 characters plus the terminating null), places the characters "hi, I'm a string" in it, and ends it off with a null. Temporary means what it sounds like. When the function ends, str is gone. Anything pointing at str is now pointing at garbage. It may shamble on for a while and give some semblance of life before it is reused and overwritten, but really it's a zombie and can only be trusted to kill your program and eat its brains.

    MyString myStr = MyString(str);

Creates myStr, another temporary variable, by calling the constructor with the array of characters. So let's take a look at the constructor:

MyString(char* str2create){
    str = str2create;
}

Takes a pointer to a character; in this case it is a pointer to the first element of main's str. This pointer is assigned to MyString's str. There is no copying of "hi, I'm a string". Both main's str and MyString's str point to the same place in memory. This is a dangerous condition because alterations to one will affect the other. If one str goes away, so goes the other. If one str is overwritten, so too is the other.
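The sharing described above can be checked with a small sketch. AliasingString and edit_is_visible_through_alias are hypothetical names made up for this illustration; the struct only mirrors the pointer-storing constructor from the question.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the question's MyString: it stores the
// caller's pointer and copies nothing.
struct AliasingString {
    char* str;

    explicit AliasingString(char* source) : str(source) {
        assert(source != nullptr);
    }
};

// Returns true when an edit made to the caller's array is visible
// through the wrapper, i.e. both names refer to the same memory.
bool edit_is_visible_through_alias() {
    char buffer[] = "hi, I'm a string";
    AliasingString wrapper(buffer);

    buffer[0] = 'H';  // modify the original array only

    return wrapper.str == buffer  // same address
        && std::strcmp(wrapper.str, "Hi, I'm a string") == 0;
}
```

The function returns true because the wrapper never owned any characters of its own; it only remembered where the caller's array lives.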

What the constructor should do is:

MyString(char* str2create){
    size_t len = strlen(str2create); // length, not counting the terminating null
    str = new char[len+1]; // create appropriately sized buffer to hold string
                           // +1 to hold the null
    strcpy(str, str2create); // copy source string to MyString
}

A few caveats: This is really really easy to break. Pass in a str2create that never ends, for example, and the strlen will go spinning off into unassigned memory and the results will be unpredictable.

For now we'll assume no one is being particularly malicious and will only enter good values, but this has been shown to be really bad assumption in the real world.

This also forces a requirement for a destructor to release the memory used by str:

virtual ~MyString(){
    delete[] str;
}

It also adds a requirement for copy and move constructors and copy and move assignment operators to avoid violating the Rule of Three/Five.
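As a rough sketch of what satisfying the Rule of Three/Five could look like for an owning string class: OwnedString, duplicate, data_ and c_str are hypothetical names chosen for this example, and the copy-and-swap assignment is one common way to do it, not the only one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical owning string: every object has its own heap buffer.
class OwnedString {
public:
    explicit OwnedString(const char* source) : data_(duplicate(source)) {}

    // Destructor releases the buffer (delete[] on nullptr is a no-op).
    ~OwnedString() { delete[] data_; }

    // Copy constructor: deep copy, so the two objects never share memory.
    OwnedString(const OwnedString& other)
        : data_(other.data_ ? duplicate(other.data_) : nullptr) {}

    // Move constructor: take over the buffer and leave the source empty.
    OwnedString(OwnedString&& other) noexcept : data_(other.data_) {
        other.data_ = nullptr;
    }

    // Copy and move assignment in one function (copy-and-swap): the
    // by-value parameter is built with the copy or move constructor,
    // and its destructor frees our old buffer after the swap.
    OwnedString& operator=(OwnedString other) noexcept {
        std::swap(data_, other.data_);
        return *this;
    }

    // A moved-from object reads as an empty string.
    const char* c_str() const { return data_ ? data_ : ""; }

private:
    // Allocate a new buffer and copy a null-terminated string into it.
    static char* duplicate(const char* source) {
        assert(source != nullptr);
        const std::size_t length = std::strlen(source);
        char* copy = new char[length + 1];  // +1 for the terminating null
        std::memcpy(copy, source, length + 1);
        return copy;
    }

    char* data_;
};
```

With these in place, `OwnedString b = a;` gives b its own buffer, so destroying a leaves b intact and nothing is deleted twice.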

Back to OP's Code...

str and myStr point at the same place in memory, but this isn't bad yet. Because this program is a trivial one, it never becomes a problem. myStr and str both expire at the same point and neither is modified again.

myStr.Print();

Will print correctly because nothing has changed in str or myStr.

    MyString myStr1 = myStr.Substr(10, 7);

Requires us to look at MyString::Substr to see what happens.

MyString Substr(int index2start, int length) {
    char substr[length];

Creates a temporary character array of size length. First off, a variable-length array like this is non-standard C++. It won't compile under a lot of compilers, so just don't do this in the first place. Second, it's temporary. When the function ends, this value is garbage. Don't take any pointers to substr because it won't be around long enough to use them. Third, no space was reserved for the terminating null. This string will be a buffer overrun waiting to happen.

    int i = 0;
    while(i < length) {
        substr[i] = str[index2start + i];
        i++;
    }

OK, that's pretty good. Copy from source to destination. What it is missing is an explicit null termination so that users of the char array know where it ends.

    cout<<substr<<endl; // prints normal string

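And that buffer overrun waiting to happen? In this particular call it is narrowly avoided: Substr(10, 7) copies indices 10 through 16 of a 17-character array, and index 16 is the source's terminating null, so substr happens to come out null-terminated. That is luck, not design. With any length that stops short of the source's null, cout would read past the end of substr until it found a zero byte somewhere in memory, and it might well look like it worked rather than crashing and letting you know that it didn't.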

    return MyString(substr);

And this creates a new MyString that points to substr, right before substr hits the end of the function and dies. This new MyString points to garbage almost instantly.

}

What Substr should do:

MyString Substr(int index2start, int length)
{
    // needs #include <memory>; length + 1 leaves room for the null
    std::unique_ptr<char[]> substr(new char[length + 1]);
    // unique_ptr is probably paranoid overkill, but if something does go 
    // wrong, the array's destruction is virtually guaranteed
    int i = 0;
    while (i < length)
    {
        substr[i] = str[index2start + i];
        i++;
    }
    substr[length] = '\0';// null terminate
    cout<<substr.get()<<endl; // get() gets the array out of the unique_ptr
    return MyString(substr.get()); // only safe with the copying constructor
                                   // above; google "copy elision" for more
                                   // information on this line.
}
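The version above still trusts index2start and length. A minimal range-checked sketch of the same copy, written as a free function so it stands alone, might look like this; copy_substring is a hypothetical name, and clamping out-of-range requests (rather than throwing) is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical helper: copy up to `length` characters of `source`,
// starting at `start`, into a new null-terminated buffer. Requests
// that reach past the end of `source` are clamped to what is there.
std::unique_ptr<char[]> copy_substring(const char* source,
                                       std::size_t start,
                                       std::size_t length) {
    assert(source != nullptr);

    const std::size_t source_length = std::strlen(source);
    if (start > source_length) {
        start = source_length;  // nothing left to copy
    }
    const std::size_t available = source_length - start;
    if (length > available) {
        length = available;     // never read past the source's null
    }

    std::unique_ptr<char[]> result(new char[length + 1]);  // +1 for the null
    std::memcpy(result.get(), source + start, length);
    result[length] = '\0';      // always terminate explicitly
    return result;
}
```

With the string from the question, `copy_substring("hi, I'm a string", 10, 7)` is clamped to the six characters of "string", and a start index beyond the end yields an empty string instead of reading unrelated memory.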

Back in OP's code, once Substr returns to main, the memory that used to hold substr starts to be reused and overwritten.

cout<<myStr1.str<<endl;

Prints myStr1.str and already we can see some of it has been reused and destroyed.

cout<<"here is the substring I've done:"<<endl;
myStr1.Print();

More death, more destruction, less string.

Things to not do in the future:

Sharing pointers where data should have been copied.

Pointers to temporary data.

Not null terminating strings.
