
Is a std::string_view literal guaranteed to be null-terminated?

I know that a std::string_view in general is not guaranteed to be null-terminated. However, I don't know whether a std::string_view literal is guaranteed to be null-terminated.

For example:

#include <string_view>

using namespace std::literals;

int main()
{
    auto my_sv = "hello"sv;
}

Does C++17 or later guarantee that my_sv.data() is null-terminated?

=== Update ===

All of the following is from N4820:

  1. As per 5.13.5.14, a string literal is null-terminated.
  2. As per 5.13.8, a user-defined-string-literal is composed of a string literal plus a custom suffix. In "hello"sv, "hello" is the string literal and sv is the suffix.
  3. As per 5.13.8.5, "hello"sv is treated as a call of the form operator "" sv(str, len); as per 5.13.5.14, str is null-terminated.
  4. As per 21.4.2.1, the data() of the resulting string_view must return str.

Do these prove that "hello"sv.data() is guaranteed by the C++ standard to be null-terminated?

So let's get the simple parts out of the way. No string_view is ever "NUL-terminated", in the sense that the object represents a sized range of characters. Even if you create a string_view from a NUL-terminated sequence of characters, the string_view itself is still not "NUL-terminated".

The question you're really asking is this: does the implementation have any leeway to make the expression "some literal"sv yield a string_view whose data() does not point into the NUL-terminated string literal represented by "some literal"? That is, is this:

string_view s = "some literal"sv;

permitted to behave in any way differently from this:

const char *lit = "some literal";
string_view s(lit, 12); // 12 = number of chars in lit, excluding the terminating NUL

In the latter case, s.data() is guaranteed to be a pointer to the string literal, and thus you could treat that pointer as a pointer to a NUL-terminated string. You're asking whether the same holds for the former.
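The comparison being asked about can be written as a small check. This is only a sketch: followed_by_nul and literal_forms_agree are hypothetical helper names, and reading data()[size()] is only valid here because each view is known to cover a whole string literal.

```cpp
#include <cstddef>
#include <cstring>
#include <string_view>

using namespace std::literals;

// Hypothetical helper: true when the character one past the end of the view
// is NUL. Only call this when the view is known to cover an entire
// NUL-terminated array; otherwise the read is out of bounds.
inline bool followed_by_nul(std::string_view s) {
    return s.data()[s.size()] == '\0';
}

// Builds the same text both ways and checks that the results agree.
inline bool literal_forms_agree() {
    const char* lit = "some literal";
    std::string_view from_ptr(lit, std::strlen(lit)); // explicit pointer + length
    std::string_view from_udl = "some literal"sv;     // UDL form

    // The two literals need not share storage, so compare contents rather
    // than pointers between the two views.
    return from_ptr == from_udl
        && from_ptr.data() == lit
        && followed_by_nul(from_ptr)
        && followed_by_nul(from_udl);
}
```

Calling literal_forms_agree() shows what one particular compiler does; whether that result is guaranteed is the question the rest of this answer addresses.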

Let's investigate. The operator""sv overloads are specified as:

 constexpr string_view operator""sv(const char* str, size_t len) noexcept; 

Returns: string_view{str, len}.

That is the standard's specification for the behavior of this function: it returns a string_view which points into the memory supplied by str. Therefore, the implementation cannot allocate some hidden memory and use that instead; the returned string_view's data() is required to return the same pointer as str.
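One way to observe this "Returns" clause directly is to call the literal operator by its qualified name with a buffer of your own. The function name udl_returns_same_pointer and the buffer are illustrative assumptions, not anything from the standard.

```cpp
#include <cstddef>
#include <string_view>

// Sketch: invoke operator""sv explicitly and check that the returned view
// refers to exactly the pointer and length that were passed in.
inline bool udl_returns_same_pointer() {
    static const char buffer[] = "hello"; // 5 characters plus a NUL
    std::string_view v =
        std::literals::string_view_literals::operator""sv(buffer, std::size_t{5});
    return v.data() == buffer && v.size() == 5;
}
```

The literal operator does no copying of its own: whatever array the compiler passes as str is the array the view ends up pointing at.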

Now, this brings us to a different question: is str required to be a NUL-terminated string? That is, is it legal for a compiler to see that you are using the sv UDL implementation and therefore remove the NUL character from the array it was going to create for the string literal passed as str?

Let's look at how UDLs for strings work:

If L is a user-defined-string-literal, let str be the literal without its ud-suffix and let len be the number of code units in str (i.e., its length excluding the terminating null character). The literal L is treated as a call of the form

 operator "" X(str, len) 

Note two phrases in that passage: "the literal without its ud-suffix" and "excluding the terminating null character". We know the behavior of "the literal without its ud-suffix". And the second phrase makes specific mention of the expected NUL terminator for str. I'd say that's a pretty clear statement that str will be a string literal. And that string literal will be built in accordance with the regular string literal rules in C++, and therefore will be NUL-terminated.

Given the above, I think it is safe to say that there is no wiggle room for the implementation here. The string_view returned by the UDL must point to the array defined by the string literal specified in the UDL, and like any other string literal, that array will be NUL-terminated.
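The same reasoning applies to any string literal operator, which makes it easy to probe with a suffix of your own. The _nul_at_len suffix below is a hypothetical name made up for this sketch; it simply reports whether the array passed as str has a NUL at index len.

```cpp
#include <cstddef>

// Hypothetical probe suffix (not part of the standard library): returns true
// when the character at index len of the array passed to the literal
// operator is the terminating NUL.
constexpr bool operator""_nul_at_len(const char* str, std::size_t len) {
    return str[len] == '\0';
}

// Evaluated at compile time; an out-of-bounds read would not be a constant
// expression, so these also show that index len is inside the array.
static_assert("hello"_nul_at_len);
static_assert(""_nul_at_len);
```

A passing static_assert shows only how a given compiler behaves, but it is consistent with the reading of the standard given above.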

That having been said, please review my first paragraph. You should not write any code which assumes that a string_view is NUL-terminated. I would call it a code smell even if the creator of the string_view and its consumer are right next to each other.
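To illustrate why the assumption is fragile, here is a sketch in which a view is narrowed with substr. The helper names c_length_of_prefix and to_owned_string are assumptions made for the example.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

using namespace std::literals;

// Treating data() as a C string ignores the view's size: strlen keeps
// reading until the NUL of the underlying literal, not the end of the view.
inline std::size_t c_length_of_prefix() {
    std::string_view whole = "hello world"sv;
    std::string_view prefix = whole.substr(0, 5); // views "hello" only
    return std::strlen(prefix.data());            // counts "hello world"
}

// Hypothetical helper: when a C API needs a NUL-terminated string, copy the
// view into a std::string and use c_str() on the copy.
inline std::string to_owned_string(std::string_view s) {
    return std::string(s);
}
```

Here c_length_of_prefix() returns 11 even though prefix.size() is 5, while to_owned_string(prefix).c_str() yields a properly terminated "hello".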


 