Can I determine how much valid memory is addressed by a char * argument?

I have a function like:

// string is a null-terminated char array. Replace every a in the string with b.
void ReplaceCharInString(char *string, char a, char b)
{
    // Walk the string char by char and replace each a with b.
    for (; *string != '\0'; ++string)
        if (*string == a)
            *string = b;
}

I'm trying to program defensively. The problem is that the implementation relies on the client really passing in an array of chars. If the address of a single char is passed in, the program will end up in a bad state (and probably crash). How do I check for and avoid this? (I know that if I take a std::string object instead, the problem goes away, of course.)

No, you cannot check this; you have to trust the user of the function to pass an actual, correctly null-terminated string.

This is also how all the C standard library functions like strcpy(), strlen(), ... work.

No, you can't check whether you're running past the allocated memory if the string is not null-terminated.
As sth said, C standard library functions like strcpy() and strlen() also rely on the string being valid (null-terminated).

... however, one solution could be Mudflap. It is costly (in terms of performance) and is only available with GCC.
Mudflap is a library that instruments all pointer/array operations. With it, you can check whether a specific location is valid memory or not.

In fact, the only reason I see for using Mudflap is if security is a very big issue for your application. But even in that case GCC provides a better alternative against buffer overflows (see -fstack-protector[-all]).

The best option is to use std::string.
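
As a minimal sketch of what a std::string version could look like (the name ReplaceCharInStdString is made up here to keep it apart from the question's function), std::replace from <algorithm> does the loop, and the string carries its own length, so there is nothing for the callee to run past:

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical name, not from the question: replaces every a in s with b.
void ReplaceCharInStdString(std::string &s, char a, char b)
{
    // s knows its own size, so the iteration is bounded by begin()/end().
    std::replace(s.begin(), s.end(), a, b);
}

// Small usage check.
void ExampleStdString()
{
    std::string s = "banana";
    ReplaceCharInStdString(s, 'a', 'o');
    assert(s == "bonono");
}
```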

You really can't trust user-inputted data at all. Using std::string is an improvement, since it prevents buffer overflows, but it's still not entirely safe. What's to stop the user from inputting a 10GB string that consumes all of your memory?

You really need to validate the input when you first receive it from the user (whether it's coming from a socket or through stdin). For example, you can enforce a maximum number of bytes to read, or ensure that all input characters are in the ASCII range.
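
The two checks mentioned above could be sketched roughly like this. IsAcceptableInput and its max_len parameter are hypothetical names chosen for illustration, and "ASCII range" is assumed to mean bytes 0 to 127:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: accepts input only if it is at most max_len bytes
// long and every byte is in the 7-bit ASCII range (0..127).
bool IsAcceptableInput(const std::string &input, std::size_t max_len)
{
    if (input.size() > max_len)
        return false;
    for (unsigned char c : input)   // unsigned, so bytes >= 0x80 compare as > 127
        if (c > 127)
            return false;
    return true;
}

// Small usage check.
void ExampleValidation()
{
    assert(IsAcceptableInput("hello", 16));
    assert(!IsAcceptableInput("hello", 4));   // too long for the limit
}
```

The limit itself still has to be enforced while reading, so that the oversized input is never buffered in the first place; this function only covers the check after the fact.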

If you have to use C-strings, you can memset the input buffer to 0 and make sure the last character in the buffer is always a null byte, to ensure that your string is properly null-terminated.
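
One way to sketch that idea, assuming the untrusted bytes arrive as a pointer plus a length (CopyTerminated is a hypothetical helper, not a library function):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies at most N-1 bytes of untrusted data into dst
// and guarantees that dst ends up null-terminated.
template <std::size_t N>
void CopyTerminated(char (&dst)[N], const char *src, std::size_t src_len)
{
    std::memset(dst, 0, N);                             // start from an all-zero buffer
    std::size_t n = src_len < N - 1 ? src_len : N - 1;  // leave room for the terminator
    std::memcpy(dst, src, n);                           // never touches the last byte
    dst[N - 1] = '\0';                                  // last byte is always the terminator
}

// Small usage check.
void ExampleCopyTerminated()
{
    const char raw[] = {'a', 'b', 'c', 'd', 'e', 'f'};  // deliberately not terminated
    char buf[4];
    CopyTerminated(buf, raw, sizeof raw);
    assert(std::strcmp(buf, "abc") == 0);               // truncated, but terminated
}
```

After this, buf can be handed to functions that expect a null-terminated string, at the cost of silently truncating longer input.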

You avoid this by documenting your code properly. Nobody but a complete beginner at C (who should not be writing serious code) would ever take the address of a single char variable (or even declare such a variable), so your fear is not very relevant.

One way in C99 to partly guard against this is to declare the function as:

void ReplaceCharInString(char string[static 2], char a, char b);

This means that string must be a valid pointer to at least 2 chars, which would protect against the single-char-pointer issue you mentioned, at the cost of making the function unable to support the empty string. But it still cannot protect you from an incorrect malloc, etc.

Writing safe C is really a matter of proper code documentation and auditing, and firing incompetent people (or, better yet, never hiring them to begin with). The language is designed to be light and efficient. If you try to put your own huge safety-harness layers on it, you'll probably do a worse job than the people who designed higher-level languages, and then you might as well have used a higher-level language to begin with.

With C++, you can use function overloading to make sure no one sends you the address of a single char as if it were a string.

void ReplaceCharInString(std::string & p_string, char p_a, char p_b)
{
   for(size_t i = 0, iMax = p_string.size(); i < iMax; ++i)
   {
      // replace if needed : p_string[i]
   }
}

void ReplaceCharInString(char & p_char, char p_a, char p_b)
{
   // replace if needed : p_char
}

template<size_t size>
void ReplaceCharInString(char (& p_string)[size], char p_a, char p_b)
{
   for(size_t i = 0; i < size; ++i)
   {
      // replace if needed : p_string[i]
   }
}

With only those three functions (and no function taking char * as its first parameter), no one will be able to call them with a raw char *:

void foo()
{
   char         a[4] ;
   std::string  b  ;
   char         c  ;
   char *       d ;

   // initialize a, b, c and d to some values

   ReplaceCharInString(a, 'x', 'z') ; // compiles
   ReplaceCharInString(b, 'x', 'z') ; // compiles
   ReplaceCharInString(c, 'x', 'z') ; // compiles
   ReplaceCharInString(d, 'x', 'z') ; // COMPILATION ERROR
}

So there's no way someone will call your function with the address of a single char.

Anyway, in C++ there are very few reasons to use a char * as a string (you would use a std::string instead), and even fewer to use a char * as the address of a single char (you would use a char & instead), so no one will complain about the lack of support for char * in the functions above.

The function just gets a pointer to somewhere in memory. It's defensive programming practice to check that it is not NULL. NULL is a common value when the caller has "forgotten" to allocate the memory or an allocation failed.
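
A minimal sketch of that check follows. The name ReplaceCharInStringChecked and the bool return value are choices made for this example, not part of the question, and a non-null pointer is still assumed to point at a null-terminated string, which the function cannot verify:

```cpp
#include <cassert>

// Hypothetical variant: returns false and does nothing if string is null.
// Assumption: a non-null string is null-terminated; that cannot be checked here.
bool ReplaceCharInStringChecked(char *string, char a, char b)
{
    if (string == nullptr)
        return false;
    for (; *string != '\0'; ++string)
        if (*string == a)
            *string = b;
    return true;
}

// Small usage check.
void ExampleNullCheck()
{
    char s[] = "a.a";
    assert(ReplaceCharInStringChecked(s, 'a', 'b'));
    assert(s[0] == 'b' && s[1] == '.' && s[2] == 'b');
    assert(!ReplaceCharInStringChecked(nullptr, 'a', 'b'));
}
```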

More checks for invalid char arrays are possible, but they are platform dependent.

In any case, you cannot distinguish between a pointer to a char array and a pointer to a single char.

Your prototype is wrong. When you pass an array to a function, you must always pass the length of the array as well, since inside the function you never know the size of the array (sizeof only gives you the size of the pointer). So you should declare an int parameter for the size of the array, so that the declaration itself makes the use of an array explicit.

void ReplaceCharInString(char *string, int len, char a, char b)

This way you do not depend on reading up to the null character of the string argument; you just use the length. If you cannot do this and have to keep the declaration as is, then you should have an explicit contract stating what the users of the function must do to avoid problems. The responsibility for honoring the contract is then delegated to the caller of the function.
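
A possible body for that length-taking declaration, as a sketch: the loop is bounded by len alone, so a buffer without a terminator, or even the address of a single char with len 1, is handled as long as the caller passes an honest length.

```cpp
#include <cassert>

// Replaces a with b in the first len chars of string.
// Assumption: the caller guarantees that string points to at least len chars.
void ReplaceCharInString(char *string, int len, char a, char b)
{
    for (int i = 0; i < len; ++i)
        if (string[i] == a)
            string[i] = b;
}

// Small usage check.
void ExampleWithLength()
{
    char buf[3] = {'x', 'y', 'x'};   // no null terminator needed
    ReplaceCharInString(buf, 3, 'x', 'z');
    assert(buf[0] == 'z' && buf[1] == 'y' && buf[2] == 'z');

    char c = 'x';                    // a single char works with len == 1
    ReplaceCharInString(&c, 1, 'x', 'z');
    assert(c == 'z');
}
```

The function still cannot detect a len that is larger than the real buffer; the length only moves the trust from "the string is terminated" to "the length is correct".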
