简体   繁体   中英

C++ toupper Syntax

I've just been introduced to toupper, and I'm a little confused by the syntax; it seems like it's repeating itself. What I've been using it for is for every character of a string, it converts the character into an uppercase character if possible.

for (int i = 0; i < string.length(); i++)
{
    if (isalpha(string[i]))
    {
        if (islower(string[i]))
        {
            string[i] = toupper(string[i]);
        }
    }
}

Why do you have to list string[i] twice? Shouldn't this work? toupper(string[i]); (I tried it, so I know it doesn't.)

toupper is a function that takes its argument by value. It could have been defined to take a reference to character and modify it in-place, but that would have made it more awkward to write code that just examines the upper-case variant of a character, as in this example:

// compare chars case-insensitively without modifying anything
if (std::toupper(*s1++) == std::toupper(*s2++))
  ...

In other words, toupper(c) doesn't change c for the same reasons that sin(x) doesn't change x .


To avoid repeating expressions like string[i] on the left and right side of the assignment, take a reference to a character and use it to read and write to the string:

for (size_t i = 0; i < string.length(); i++) {
  char& c = string[i];  // reference to character inside string
  c = std::toupper(c);
}

Using range-based for , the above can be written more briefly (and executed more efficiently) as:

for (auto& c: string)
    c = std::toupper(c);

As from the documentation , the character is passed by value.
Because of that, the answer is no, it shouldn't .

The prototype of toupper is:

int toupper( int ch );

As you can see, the character is passed by value, transformed and returned by value.
If you don't assign the returned value to a variable, it will be definitely lost.
That's why in your example it is reassigned so that to replace the original one.

As many of the other answers already say, the argument to std::toupper is passed and the result returned by-value which makes sense because otherwise, you wouldn't be able to call, say std::toupper('a') . You cannot modify the literal 'a' in-place. It is also likely that you have your input in a read-only buffer and want to store the uppercase-output in another buffer. So the by-value approach is much more flexible.

What is redundant, on the other hand, is your checking for isalpha and islower . If the character is not a lower-case alphabetic character, toupper will leave it alone anyway so the logic reduces to this.

#include <cctype>
#include <iostream>

int
main()
{
  char text[] = "Please send me 400 $ worth of dark chocolate by Wednesday!";
  for (auto s = text; *s != '\0'; ++s)
    *s = std::toupper(*s);
  std::cout << text << '\n';
}

You could further eliminate the raw loop by using an algorithm , if you find this prettier.

#include <algorithm>
#include <cctype>
#include <iostream>
#include <utility>

int
main()
{
  char text[] = "Please send me 400 $ worth of dark chocolate by Wednesday!";
  std::transform(std::cbegin(text), std::cend(text), std::begin(text),
                 [](auto c){ return std::toupper(c); });
  std::cout << text << '\n';
}

toupper takes an int by value and returns the int value of the char of that uppercase character. Every time a function doesn't take a pointer or reference as a parameter the parameter will be passed by value which means that there is no possible way to see the changes from outside the function because the parameter will actually be a copy of the variable passed to the function, the way you catch the changes is by saving what the function returns. In this case, the character upper-cased.

Note that there is a nasty gotcha in isalpha(), which is the following: the function only works correctly for inputs in the range 0-255 + EOF.

So what, you think.

Well, if your char type happens to be signed, and you pass a value greater than 127, this is considered a negative value, and thus the int passed to isalpha will also be negative (and thus outside the range of 0-255 + EOF).

In Visual Studio, this will crash your application. I have complained about this to Microsoft, on the grounds that a character classification function that is not safe for all inputs is basically pointless, but received an answer stating that this was entirely standards conforming and I should just write better code. Ok, fair enough, but nowhere else in the standard does anyone care about whether char is signed or unsigned. Only in the isxxx functions does it serve as a landmine that could easily make it through testing without anyone noticing.

The following code crashes Visual Studio 2015 (and, as far as I know, all earlier versions):

int x = toupper ('é'); 

So not only is the isalpha() in your code redundant, it is in fact actively harmful, as it will cause any strings that contain characters with values greater than 127 to crash your application.

See http://en.cppreference.com/w/cpp/string/byte/isalpha : "The behavior is undefined if the value of ch is not representable as unsigned char and is not equal to EOF."

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM