Concatenating string_view objects

I've been adding std::string_view to some old code that represents string-like config params, since it provides a read-only view, which is faster because nothing needs to be copied.

However, one cannot concatenate two string_views, as operator+ isn't defined for them. I see this question has a couple of answers stating it's an oversight and that there is a proposal to add it. However, that is for adding a string and a string_view; presumably, if that gets implemented, the resulting concatenation would be a std::string.

Would adding two string_views also fall into the same category? And if not, why shouldn't adding two string_views be supported?

Sample

std::string_view s1{"concate"};
std::string_view s2{"nate"};
std::string_view s3{s1 + s2};

And here's the error

error: no match for 'operator+' (operand types are 'std::string_view' {aka 'std::basic_string_view<char>'} and 'std::string_view' {aka 'std::basic_string_view<char>'})

A std::string_view is an alias for std::basic_string_view<char>, which is std::basic_string_view instantiated for one specific character type, i.e. char.

But what does it look like?

Besides a fairly large number of useful member functions such as find, substr, and others (perhaps an ordinary number, compared to the other container- and string-like types in the standard library), std::basic_string_view<_CharT>, with _CharT being the generic char-like type, has just 2 data members,

// directly from my /usr/include/c++/12.2.0/string_view
      size_t        _M_len;
      const _CharT* _M_str;

i.e. a pointer to const _CharT that indicates where the view starts, and a size_t (an unsigned integer type) that indicates how long the view is, counting from _M_str's pointee.

In other words, a string view just knows where it starts and how long it is, so it represents a sequence of char-like entities which are consecutive in memory. With just two such members, you can't represent a string which is made up of non-contiguous substrings.

Put yet another way, to create a std::string_view you need to be able to tell where it starts and how many chars long it is. Can you tell where s1 + s2 would have to start and how many characters long it should be? Think about it: you can't, because s1 and s2 are not adjacent.
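
To make the "start plus length" point concrete, here is a minimal sketch of the one case where two views can be merged into a single view: when the second one begins exactly where the first one ends inside the same buffer. The helper name merge_if_adjacent and the demo function are made up for illustration; nothing like them exists in the standard library.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical helper: returns one view covering both inputs, but only when
// `right` begins exactly where `left` ends. It only compares addresses, so the
// caller is assumed to pass views into the same underlying buffer.
std::optional<std::string_view> merge_if_adjacent(std::string_view left,
                                                  std::string_view right)
{
    if (left.data() + left.size() != right.data())
        return std::nullopt;  // there is a gap: no single pointer + length fits
    return std::string_view{left.data(), left.size() + right.size()};
}

void demo_merge()
{
    std::string_view whole{"concatenate"};
    std::string_view left = whole.substr(0, 7);  // "concate"
    std::string_view right = whole.substr(7);    // "nate"

    // Adjacent pieces of the same buffer can be described by one view.
    auto merged = merge_if_adjacent(left, right);
    assert(merged && *merged == "concatenate");

    // Pieces with a gap between them cannot.
    assert(!merge_if_adjacent(whole.substr(0, 3), right));
}
```

Two views created from separate string literals or separate std::string objects, as in the question, give no such adjacency guarantee, which is why a general operator+ returning a string_view cannot exist.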

Maybe a diagram can help.

Assume these lines of code

std::string s1{"hello"};
std::string s2{"world"};

s1 and s2 are totally unrelated objects as far as their memory location is concerned; here is what they look like:

                           &s2[0]
                             |
                             | &s2[1]
                             |   |
&s1[0]                       |   | &s2[2]
  |                          |   |   |
  | &s1[1]                   |   |   | &s2[3]
  |   |                      |   |   |   |
  |   | &s1[2]               |   |   |   | &s2[4]
  |   |   |                  |   |   |   |   |
  |   |   | &s1[3]           v   v   v   v   v
  |   |   |   |            +---+---+---+---+---+
  |   |   |   | &s1[4]     | w | o | r | l | d |
  |   |   |   |   |        +---+---+---+---+---+
  v   v   v   v   v
+---+---+---+---+---+
| h | e | l | l | o |
+---+---+---+---+---+

I've intentionally drawn them misaligned to show that &s1[0], the memory location where s1 starts, and &s2[0], the memory location where s2 starts, have nothing to do with each other.

Now, imagine you create two string views like this:

std::string_view sv1{s1};
std::string_view sv2(s2.begin() + 1, s2.begin() + 4);

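Here's what they will look like, in terms of the two implementation-specific members _M_str and _M_len (in the diagram, the labels s1._M_str, s1._M_len, s2._M_str and s2._M_len refer to the views sv1 and sv2, not to the strings):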

                                &s2[0]
                                  |
                                  | &s2[1]
                                  |   |
     &s1[0]                       |   | &s2[2]
       |                          |   |   |
       | &s1[1]                   |   |   | &s2[3]
       |   |                      |   |   |   |
       |   | &s1[2]               |   |   |   | &s2[4]
       |   |   |                  |   |   |   |   |
       |   |   | &s1[3]           v   v   v   v   v
       |   |   |   |            +---+---+---+---+---+
       |   |   |   | &s1[4]     | w | o | r | l | d |
       |   |   |   |   |        +---+---+---+---+---+
       v   v   v   v   v            · ^         ·
     +---+---+---+---+---+          · |         ·
     | h | e | l | l | o |        +---+         ·
     +---+---+---+---+---+        | ·           ·
     · ^                 ·        | · s2._M_len ·
     · |                 ·        | <----------->
   +---+                 ·        |
   | ·                   ·        +-- s2._M_str
   | ·       s1._M_len   ·
   | <------------------->
   |
   +-------- s1._M_str

Given the above, can you see what's wrong with expecting that

std::string_view s3{s1 + s2};

works?

How can you possibly define s3._M_str and s3._M_len (based on s1._M_str, s1._M_len, s2._M_str, and s2._M_len) such that they represent a view on "helloworld"?

You can't, because "hello" and "world" are located in two unrelated areas of memory.

A view is similar to a span in that it does not own the data; as the name implies, it is just a view of the data. To concatenate two string views, you first need to construct a std::string from each of them; then you can concatenate.

std::string s3 = std::string(s1) + std::string(s2);

Note that s3 will be a std::string, not a std::string_view, since it owns this data.
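
If this comes up often, the conversion can be wrapped in a small helper. The sketch below is one possible way to do it, not a standard facility: the name concat_views is hypothetical, and it reserves the final size up front so that only one buffer is allocated instead of two temporary strings plus the result.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: builds an owning std::string from two views.
std::string concat_views(std::string_view a, std::string_view b)
{
    std::string out;
    out.reserve(a.size() + b.size());  // at most one allocation for the result
    out.append(a);                     // append has a string_view-like overload since C++17
    out.append(b);
    return out;
}

void demo_concat()
{
    std::string_view s1{"concate"};
    std::string_view s2{"nate"};
    std::string s3 = concat_views(s1, s2);  // s3 owns the characters
    assert(s3 == "concatenate");
}
```

The result should be stored in a std::string. Binding it directly to a std::string_view would leave the view pointing into a temporary that is destroyed at the end of the statement.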

std::string_view does not own any data; it is only a view. If you want to join two views to get a joined view, you can use boost::join() from the Boost library. The result type, however, will not be a std::string_view.

#include <algorithm>
#include <iostream>
#include <iterator>
#include <string_view>
#include <boost/range.hpp>
#include <boost/range/join.hpp>

void test()
{
    std::string_view s1{"hello, "}, s2{"world"};
    auto joined = boost::join(s1, s2);

    // print the joined range; ostream_iterator needs its element type spelled out
    std::copy(joined.begin(), joined.end(), std::ostream_iterator<char>(std::cout, ""));
    std::cout << std::endl;

    // other method to print
    for (auto c : joined) std::cout << c;
    std::cout << std::endl;
}
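
To see why the joined result cannot be a std::string_view, it may help to look at what such a type has to store. The following is a rough sketch of the idea only, not how Boost implements it: the type TwoPieceView and its members are invented for illustration. It keeps two pointer/length pairs instead of one and decides on each access which piece an index falls into.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical type: a non-owning view over two unrelated pieces of memory.
struct TwoPieceView {
    std::string_view first;
    std::string_view second;

    std::size_t size() const { return first.size() + second.size(); }

    // Index into the logical concatenation without copying anything.
    char operator[](std::size_t i) const
    {
        return i < first.size() ? first[i] : second[i - first.size()];
    }

    // Copy out into an owning string when contiguous storage is needed.
    std::string str() const
    {
        std::string out;
        out.reserve(size());
        out.append(first);
        out.append(second);
        return out;
    }
};

void demo_two_piece()
{
    TwoPieceView v{"hello, ", "world"};
    assert(v.size() == 12);
    assert(v[0] == 'h');
    assert(v[7] == 'w');  // first character of the second piece
    assert(v.str() == "hello, world");
}
```

Like any view, it is only valid while both underlying buffers are alive, and it has no data() member because there is no single contiguous block to point to.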

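The standard library has a lazy join as well: std::views::join since C++20, and std::views::join_with (class std::ranges::join_with_view) since C++23, which additionally inserts a delimiter between the pieces. Both flatten a range of ranges, so the two views first have to be put into a container such as std::array. As with boost::join(), the result is not a std::string_view.

#include <array>
#include <iostream>
#include <ranges>
#include <string_view>

void test()
{
    std::string_view s1{"hello, "}, s2{"world"};

    // join flattens a range of ranges, so collect the two views first
    std::array parts{s1, s2};
    auto joined = parts | std::views::join;

    for (auto c : joined) std::cout << c;
    std::cout << std::endl;
}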
