
Splitting a string on two delimiters in C++

I have a file, cities.txt, containing:

Hayward - San Lorenzo
San Lorenzo - Oakland
Dublin - San Jose
San Mateo - Hayward
San Francisco - Daly City
San Mateo - Oakland
San Francisco - Oakland
Freemont - Hayward
San Lorenzo - Dublin
San Jose - San Mateo
Daly City - San Raphael

I read the contents of the file with:

#include <iostream>
#include <fstream>
#include <string>
#include <iterator>



int main( ) {
    std::ifstream infile( "cities.txt" ) ;
    if ( infile ) {
        std::string fileData( ( std::istreambuf_iterator<char> ( infile ) ) ,
        std::istreambuf_iterator<char> ( ) ) ;
        infile.close( );
        std::cout << fileData <<"\n\n";
        return 0 ;
   }
   else {
      std::cout << "Where is cities.txt?\n" ;
      return 1 ;
   }
}

and save the contents in the fileData string. I need to break that string into a list of strings that contain only the names of the cities. Something like this:

list = {"Hayward","San Lorenzo", "San Lorenzo", "Oakland"......}

I was going to turn the string into a char* and use strtok, but it seems like a lot of work for something that can probably be done using standard string functions. Is there a way to do it that is both fast and terse?

I would probably use std::getline, specifying '-' as the separator between elements:

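std::vector<std::string> cities;
std::string city;
// infile: the std::ifstream from the question, before anything has been read from it
while (std::getline(infile, city, '-'))
    cities.push_back(city);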

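One minor detail: this leaves whitespace intact, so if leading and/or trailing whitespace is a problem, you'll have to trim it separately. Line breaks are not treated as separators either, so the last city of one line and the first city of the next end up in the same element unless you read the file line by line first.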
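To handle both delimiters with std::getline, one option is to call it twice: once per line with the default '\n' delimiter, then again on each line with '-'. The following is only a sketch of that idea. The helper names trim and readCities are made up for illustration, and it assumes that no city name contains a hyphen.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: strip leading and trailing whitespace (including '\r').
std::string trim(const std::string& s) {
    std::size_t first = 0;
    std::size_t last = s.size();
    while (first < last && std::isspace(static_cast<unsigned char>(s[first]))) ++first;
    while (last > first && std::isspace(static_cast<unsigned char>(s[last - 1]))) --last;
    return s.substr(first, last - first);
}

// Hypothetical helper: read "A - B" lines and collect every city name in order.
// Assumes city names themselves never contain '-'.
std::vector<std::string> readCities(std::istream& in) {
    std::vector<std::string> cities;
    std::string line;
    while (std::getline(in, line)) {               // first delimiter: '\n'
        std::istringstream fields(line);
        std::string city;
        while (std::getline(fields, city, '-')) {  // second delimiter: '-'
            city = trim(city);
            if (!city.empty()) cities.push_back(city);  // skip blank lines
        }
    }
    return cities;
}
```

With the question's code, this could be called as readCities(infile) in place of reading the whole file into fileData.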

You can do this in a few steps.

  1. Split the content of the file into a vector of strings, so that each element of the vector contains a single row of the file

  2. Split each row into two elements (the two cities in the row)

  3. Trim the content

The split function can be implemented like this:

vector<string> split (string str, string seq) { 
    vector<string> ret {};
    size_t pos {};

    while ((pos = str.find (seq)) != string::npos) { 
        ret.push_back (str.substr (0, pos));
        str = str.substr (pos+seq.size ()); 
    }
    ret.push_back (str);

    return ret;
}

The trimming functions can be implemented like this:

string ltrim (string s) { 
    // A lambda replaces not1/ptr_fun, which were removed in C++17
    s.erase (s.begin (), find_if (s.begin (), s.end (), [] (unsigned char c) { return !isspace (c); }));
    return s;
}

string rtrim (string s) { 
    s.erase (find_if (s.rbegin (), s.rend (), [] (unsigned char c) { return !isspace (c); }).base (), s.end ());
    return s;
}

string trim (string s) { 
    return ltrim (rtrim (s));
}

So, basically you have all you need; let's prepare a result function.

vector<string> result (vector<string>&& content) {
    vector<string> ret {};
    for (const auto& c : content) { 
        auto vec = split (c, "-"); // (2)
        for (const auto& v : vec) { 
            ret.push_back (trim (v));
        }

    }
    return ret;
}

void show (const vector<string>& vec) { 
    for (const auto& v : vec) { 
        cout << "|" << v << "|" << endl;
    }
}

Usage looks like this, assuming that the content of your file is in the content object:

auto vec = result (split (content, "\n")); // (1)
show (vec);

Now, some explanation is needed. At (1) we take the whole content of the file (I skipped reading the content from the file) and split it into a vector of strings; in this case it is a vector of rows, because the sequence is "\n". We then pass that vector of rows to the result function. At (2) each row is split into two strings (the cities), this time with "-" as the sequence. That call produces a vector of strings containing the city names. All that is left is to add these names to the vector ret, which will be returned, trimming each one first so that the whitespace on the left and right goes away.

The result is:

|Hayward|
|San Lorenzo|
|San Lorenzo|
|Oakland|
|Dublin|
|San Jose|
|San Mateo|
|Hayward|
|San Francisco|
|Daly City|
|San Mateo|
|Oakland|
|San Francisco|
|Oakland|
|Freemont|
|Hayward|
|San Lorenzo|
|Dublin|
|San Jose|
|San Mateo|
|Daly City|
|San Raphael|

You can work with string::find, string::erase and string::substr.

Use a while loop with something like found = input.find("-"); while(found != string::npos){... }

Inside the loop, take the substring that holds the city name, then remove that city from the whole string with .erase(position, length).
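The loop described above could look roughly like the sketch below. It uses string::find_first_of rather than string::find so that '-' and the line break are handled in one pass; that substitution, and the helper names addTrimmed and splitCities, are my own assumptions and not part of the answer. It also assumes city names never contain a hyphen.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: append `token` without surrounding blanks; skip empty tokens.
void addTrimmed(std::vector<std::string>& out, const std::string& token) {
    const std::size_t first = token.find_first_not_of(" \t");
    if (first == std::string::npos) return;  // empty or blanks only
    const std::size_t last = token.find_last_not_of(" \t");
    out.push_back(token.substr(first, last - first + 1));
}

// Hypothetical helper: split on '-', '\r' or '\n' with find_first_of/substr/erase.
std::vector<std::string> splitCities(std::string input) {
    std::vector<std::string> cities;
    const std::string delims = "-\r\n";
    std::size_t found;
    while ((found = input.find_first_of(delims)) != std::string::npos) {
        addTrimmed(cities, input.substr(0, found));  // text before the delimiter
        input.erase(0, found + 1);                   // drop that text and the delimiter
    }
    addTrimmed(cities, input);  // whatever follows the last delimiter
    return cities;
}
```

Erasing from the front of the string on every iteration copies the remaining characters each time. For a file this small that hardly matters; for large inputs, tracking a start position and passing it to find_first_of would avoid the copies.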

You may use Boost's regex_split. I have modified your code to demonstrate it, pasted below:

#include <iostream>
#include <fstream>
#include <string>
#include <iterator>
#include <boost/regex.hpp>
#include <vector>



int main( ) {
    std::ifstream infile( "cities.txt" ) ;
    if ( infile ) {
        std::string fileData( ( std::istreambuf_iterator<char> ( infile ) ) ,
        std::istreambuf_iterator<char> ( ) ) ;
        infile.close( );
        std::cout << fileData <<"\n\n";
        std::vector<std::string> out;

        // Delimiter regular expression
        boost::regex delims("\\s+-\\s+|\n|\r");

        boost::regex_split(std::back_inserter(out), fileData, delims);
        for (auto &city : out) {
            std::cout << city << std::endl;
        }
   }
   else {
      std::cout << "Where is cities.txt?\n" ;
      return 1 ;
   }
}
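If pulling in Boost is not an option, a similar split can be sketched with the standard <regex> header from C++11. Note that this swaps in std::sregex_token_iterator for boost::regex_split, so it is a different API rather than a drop-in replacement. The function name splitWithRegex and the slightly changed pattern are assumptions made for illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: split `text` on " - " (with optional spaces) or on line breaks.
std::vector<std::string> splitWithRegex(const std::string& text) {
    // Either a hyphen with optional surrounding whitespace, or a run of CR/LF.
    static const std::regex delims("\\s*-\\s*|[\\r\\n]+");
    std::vector<std::string> cities;
    // Submatch index -1 selects the text between matches, i.e. the tokens.
    std::sregex_token_iterator it(text.begin(), text.end(), delims, -1);
    const std::sregex_token_iterator end;
    for (; it != end; ++it) {
        const std::string token = it->str();
        if (!token.empty()) cities.push_back(token);  // ignore empty pieces
    }
    return cities;
}
```

In the program above, this would be called as splitWithRegex(fileData). Trailing spaces at the end of a line are not stripped by this pattern, so it relies on the file being formatted as shown in the question.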
