Is a std::string_view literal guaranteed to be null-terminated?
I know that a trivial std::string_view is not guaranteed to be null-terminated. However, I don't know if a std::string_view literal is guaranteed to be null-terminated.
For example:
#include <string_view>

using namespace std::literals;

int main()
{
    auto my_sv = "hello"sv;
}
Does C++17 or later guarantee that my_sv.data() points to a null-terminated string?
=== Update ===
All of the following is from N4820:
- As per 5.13.5.14, a string literal is null-terminated.
- As per 5.13.8, a user-defined-string-literal is composed of a string literal plus a custom suffix. Say, in "hello"sv, "hello" is the string literal and sv is the suffix.
- As per 5.13.8.5, "hello"sv is treated as a call of the form operator "" sv(str, len); as per 5.13.5.14, str is null-terminated.
- As per 21.4.2.1, sv's data() must return str.
Do these prove that "hello"sv.data() is guaranteed by the C++ standard to be null-terminated?
So let's get the simple parts out of the way. No string_view is ever "NUL-terminated", in the sense that the object represents a sized range of characters. Even if you create a string_view from a NUL-terminated sequence of characters, the string_view itself is still not "NUL-terminated".
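To make the "sized range" point concrete, here is a minimal sketch. The helper names (embedded_nul_size, drop_last) are made up for illustration and are not part of any library. It shows that a NUL inside the range is just another character, and that shrinking a view never writes a terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

using namespace std::literals;

// Hypothetical helper: the literal operator receives the full length,
// so the embedded NUL counts as an ordinary character of the view.
inline std::size_t embedded_nul_size() {
    return "ab\0cd"sv.size();
}

// Hypothetical helper: drop the last n characters of a view.
// Only the size changes; nothing is written to the viewed memory.
inline std::string_view drop_last(std::string_view sv, std::size_t n) {
    sv.remove_suffix(n);
    return sv;
}
```

After drop_last("hello world"sv, 6) the view compares equal to "hello", yet the character right after it in memory is still the space from the original literal, not a NUL.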
The question you're really asking is this: does the implementation have some leeway to make the statement "some literal"sv yield a string_view whose data member does not point into the NUL-terminated string literal represented by "some literal"? That is, is this:
string_view s = "some literal"sv;
permitted to behave in any way differently from this:
const char *lit = "some literal";
string_view s(lit, <number of chars in lit>);
In the latter case, s.data() is guaranteed to be a pointer to the string literal, and thus you could treat that pointer as a pointer to a NUL-terminated string. You're asking if the former is just as valid.
Let's investigate. The definition of the operator""sv overloads is stated to be:
constexpr string_view operator""sv(const char* str, size_t len) noexcept;
Returns: string_view{str, len}.
That is the standard specification for the behavior of this function: it returns a string_view which points into the memory supplied by str. Therefore, the implementation cannot allocate some hidden memory and use that or whatever; the returned string_view::data is required to return the same pointer as str.
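As a rough illustration of that "Returns:" clause, here is a sketch of a user-written literal operator that does the same thing. The suffix _view and the helper view_of are hypothetical names chosen for this example (user code must use a suffix starting with an underscore); this is not how any particular standard library spells its implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical literal operator mirroring the specified behavior of
// operator""sv: build the view directly from the pointer and length.
constexpr std::string_view operator""_view(const char* str, std::size_t len) noexcept {
    return std::string_view{str, len};
}

// Hypothetical helper: the same construction as an ordinary function,
// so the pointer identity can be checked from calling code.
constexpr std::string_view view_of(const char* str, std::size_t len) noexcept {
    return std::string_view{str, len};
}
```

Given a buffer such as static const char text[] = "hello";, view_of(text, 5).data() == text holds: the view stores the pointer it was handed and data() hands it back.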
Now, this brings us to a different question: is str required to be a NUL-terminated string? That is, is it legal for a compiler to see that you are using the sv UDL implementation and therefore remove the NUL character from the array it was going to create for the string literal passed as str?
Let's look at how UDLs for strings work:
If L is a user-defined-string-literal, let str be the literal without its ud-suffix and let len be the number of code units in str (i.e., its length excluding the terminating null character). The literal L is treated as a call of the form operator "" X(str, len).
Note two phrases in that passage: "the literal without its ud-suffix" and "excluding the terminating null character". We know the behavior of "the literal without its ud-suffix". And the second phrase makes specific mention of the expected NUL terminator for str. I'd say that's a pretty clear statement that str will be given a literal string. And that literal string will be built in accord with regular string literal rules in C++, and therefore will be NUL-terminated.
Given the above, I think it is safe to say that there is no wiggle room for the implementation here. The string_view returned by the UDL must point to the array defined by the string literal specified in the UDL, and like any other string literal, that array will be NUL-terminated.
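If you want to see that conclusion in action, a small check along these lines should hold on a conforming implementation. The function names are invented for the example, and the first helper is only safe because the caller knows the view covers a whole string literal.

```cpp
#include <cassert>
#include <cstring>
#include <string_view>

using namespace std::literals;

// Hypothetical helper: true if the character just past the view is NUL.
// Only valid when the view is known to cover an entire NUL-terminated
// array; for an arbitrary view this read may be out of bounds.
inline bool terminator_follows(std::string_view sv) {
    return sv.data()[sv.size()] == '\0';
}

// Hypothetical check: for a view made by the sv literal, the C-style
// length of data() matches size(), because the array ends in NUL.
inline bool literal_matches_c_length() {
    auto sv = "hello"sv;
    return terminator_follows(sv) && std::strlen(sv.data()) == sv.size();
}
```

Reading sv.data()[sv.size()] goes through the raw pointer into the literal's array, which has one more element than the view's size. It deliberately avoids sv[sv.size()], since operator[] requires an index less than size().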
That having been said, please review my first paragraph. You should not write any code which assumes that a string_view is NUL-terminated. I would call it a code smell even if the creator of the string_view and its consumer are right next to each other.
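One way to avoid that smell, sketched under the assumption that you have to hand the characters to some C-style function: copy the view into a std::string first and pass c_str(). Here c_api_length is a made-up stand-in for such a function, and both wrappers are illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical stand-in for a C API that expects a NUL-terminated string.
inline std::size_t c_api_length(const char* s) {
    return std::strlen(s);
}

// Safer call: copy the viewed characters into a std::string, whose
// c_str() is guaranteed to be NUL-terminated at exactly size().
inline std::size_t safe_length(std::string_view sv) {
    const std::string owned{sv};
    return c_api_length(owned.c_str());
}

// The smell: passing data() assumes a terminator right after the view.
// It only avoids undefined behavior when the underlying buffer happens
// to be NUL-terminated, and even then it may read past the view.
inline std::size_t smelly_length(std::string_view sv) {
    return c_api_length(sv.data());
}
```

For a view over the first five characters of "hello world", safe_length sees "hello" while smelly_length runs on to the end of the literal. The copy costs an allocation, which is the price of not relying on where the view came from.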