C++ string parsing (Python style)

Question

I love how in Python I can do something like:

points = []
for line in open("data.txt"):
    a,b,c = map(float, line.split(','))
    points += [(a,b,c)]

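Basically it reads a list of lines, where each line represents a point in 3D space, written as three numbers separated by commas.

How can this be done in C++ without too much headache?

Performance is not very important; this parsing happens only once, so simplicity matters more.

P.S. I know it sounds like a newbie question, but believe me, I have written a lexer in D (pretty much like C++) that reads text char by char and recognizes tokens.
It's just that, coming back to C++ after a long period of Python, I don't want to waste my time on such things.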

Answer 1

I would do something like this:

ifstream f("data.txt");
string str;
while (getline(f, str)) {
    Point p;
    // sscanf returns the number of values converted; skip malformed lines
    if (sscanf(str.c_str(), "%f, %f, %f", &p.x, &p.y, &p.z) == 3)
        points.push_back(p);
}

x, y and z must be floats (%f stores into a float).

And include:

#include <cstdio>    // sscanf
#include <fstream>
#include <iostream>
#include <string>
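The float requirement above comes from the scanf family: %f stores into a float*, while a double member needs %lf. A minimal sketch, assuming a hypothetical Point struct with double members and a hypothetical helper named parse_point (neither is part of the answer), which also reports whether all three numbers were read:

```cpp
#include <cstdio>
#include <string>

// Hypothetical point type with double members (the answer above uses float).
struct Point {
    double x, y, z;
};

// Hypothetical helper: parses "a,b,c" into p.
// Returns true only if all three numbers were converted.
bool parse_point(const std::string& line, Point& p) {
    // %lf stores into a double*; %f would require a float*.
    // The space before each comma skips optional whitespace there;
    // %lf itself skips whitespace before a number.
    return std::sscanf(line.c_str(), "%lf ,%lf ,%lf", &p.x, &p.y, &p.z) == 3;
}
```

Inside the getline loop, a call such as parse_point(str, p) could then decide whether the line is pushed onto points or skipped.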

Answer 2

The C++ String Toolkit Library (StrTk) has the following solution to your problem:

#include <string>
#include <deque>
#include "strtk.hpp"

struct point { double x,y,z; };

int main()
{
   std::deque<point> points;
   point p;
   strtk::for_each_line("data.txt",
                        [&points,&p](const std::string& str)
                        {
                           strtk::parse(str,",",p.x,p.y,p.z);
                           points.push_back(p);
                        });
   return 0;
}

More examples can be found here.

Answer 3

Apart from all these good examples, in C++ you would normally overload operator >> for your point type to achieve something like this:

point p;
while (file >> p)
    points.push_back(p);

or even:

copy(
    istream_iterator<point>(file),
    istream_iterator<point>(),
    back_inserter(points)
);

The relevant implementation of the operator would look very much like the code by j_random_hacker.
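The answer leaves the operator itself unwritten. One possible sketch, assuming a hypothetical point struct with three double members and a hypothetical read_points helper that wraps the copy() call above; it is an illustration, not the original author's code:

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <sstream>
#include <vector>

// Hypothetical point type; the answer does not show its definition.
struct point {
    double x, y, z;
};

// Reads "x,y,z" from the stream. On any mismatch the stream's failbit is set,
// which ends loops such as `while (file >> p)`.
std::istream& operator>>(std::istream& in, point& p) {
    point tmp;
    char c1 = 0, c2 = 0;
    if (in >> tmp.x >> c1 >> tmp.y >> c2 >> tmp.z) {
        if (c1 == ',' && c2 == ',')
            p = tmp;                               // commit only a fully parsed point
        else
            in.setstate(std::ios_base::failbit);   // wrong separator
    }
    return in;
}

// Hypothetical helper mirroring the copy() call: reads points until the
// stream ends or a malformed point is met.
std::vector<point> read_points(std::istream& in) {
    std::vector<point> points;
    std::copy(std::istream_iterator<point>(in),
              std::istream_iterator<point>(),
              std::back_inserter(points));
    return points;
}
```

Because formatted extraction skips whitespace, this version accepts points separated by newlines or by spaces, so it does not enforce one point per line.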

Answer 4

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <algorithm>     // For replace()

using namespace std;

struct Point {
    double a, b, c;
};

int main(int argc, char **argv) {
    vector<Point> points;

    ifstream f("data.txt");

    string str;
    while (getline(f, str)) {
        replace(str.begin(), str.end(), ',', ' ');
        istringstream iss(str);
        Point p;
        iss >> p.a >> p.b >> p.c;
        points.push_back(p);
    }

    // Do something with points...

    return 0;
}

Answer 5

This answer is based on the previous answer by j_random_hacker and makes use of Boost Spirit.

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <boost/spirit.hpp>

using namespace std;
using namespace boost;
using namespace boost::spirit;

struct Point {
    double a, b, c;
};

int main(int argc, char **argv) 
{
    vector<Point> points;

    ifstream f("data.txt");

    string str;
    Point p;
    // phrase_scanner_t is needed because parse() below is given a skip parser
    rule<phrase_scanner_t> point_p = 
           double_p[assign_a(p.a)] >> ',' 
        >> double_p[assign_a(p.b)] >> ',' 
        >> double_p[assign_a(p.c)] ; 

    while (getline(f, str)) 
    {
        // parse() takes a C string (or an iterator pair), not a std::string
        parse( str.c_str(), point_p, space_p );
        points.push_back(p);
    }

    // Do something with points...

    return 0;
}

Answer 6

Fun with Boost.Tuples:

#include <boost/tuple/tuple_io.hpp>
#include <vector>
#include <fstream>
#include <iostream>
#include <algorithm>
#include <iterator>      // istream_iterator, ostream_iterator, back_inserter

int main() {
    using namespace boost::tuples;
    typedef boost::tuple<float,float,float> PointT;

    std::ifstream f("input.txt");
    f >> set_open(' ') >> set_close(' ') >> set_delimiter(',');

    std::vector<PointT> v;

    std::copy(std::istream_iterator<PointT>(f), std::istream_iterator<PointT>(),
             std::back_inserter(v)
    );

    std::copy(v.begin(), v.end(), 
              std::ostream_iterator<PointT>(std::cout)
    );
    return 0;
}

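Note that this is not strictly equivalent to the Python code in your question, because the tuples don't have to be on separate lines. For example, this: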

1,2,3 4,5,6

will give the same output as:

1,2,3
4,5,6

It's up to you to decide whether that's a bug or a feature :)

Answer 7

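You could read the file from a std::iostream line by line, put each line into a std::string, and then use boost::tokenizer to split it. It won't be quite as elegant or short as the Python version, but it is a lot easier than reading things in one character at a time...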
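The read-a-line-then-split approach can be sketched with the standard library alone. Note the substitution: std::getline with a ',' delimiter stands in here for boost::tokenizer, so this is not the Boost API the answer names, and split_to_doubles is a hypothetical helper name:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits one line on ',' and converts each piece to double.
// std::getline with a delimiter stands in for boost::tokenizer here.
std::vector<double> split_to_doubles(const std::string& line) {
    std::vector<double> values;
    std::istringstream fields(line);
    std::string token;
    while (std::getline(fields, token, ',')) {
        std::istringstream conv(token);
        double d;
        if (conv >> d)          // tokens that are not numbers are skipped
            values.push_back(d);
    }
    return values;
}
```

In the file loop, each line read with std::getline(file, line) would be passed to split_to_doubles, and a result of size 3 treated as one point.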

Answer 8

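One of Sony Pictures Imageworks' open-source projects is Pystring, which should make for a mostly direct translation of the string-splitting parts:

Pystring is a collection of C++ functions which match the interface and behavior of Python's string class methods using std::string. Implemented in C++, it does not require or make use of a Python interpreter. It provides convenience and familiarity for common string operations not included in the standard C++ library.

There are a few examples, and some documentation.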

Answer 9

It's nowhere near as terse, and of course I didn't compile this.

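// Convert one token (a sub-range of the line) to float.
float atof_s( const boost::iterator_range<std::string::iterator> & r ) {
  return static_cast<float>( atof( std::string( r.begin(), r.end() ).c_str() ) );
}

ifstream f("data.txt");
string str;
vector< vector<float> > data;
while( getline( f, str ) ) {
  vector<float> v;
  boost::algorithm::split_iterator<string::iterator> e; // default-constructed: end of tokens
  std::transform(
     boost::algorithm::make_split_iterator( str,
        boost::algorithm::token_finder( boost::algorithm::is_any_of( "," ) ) ),
     e, std::back_inserter( v ), atof_s ); // append, since v starts out empty
  v.resize(3); // only grab the first 3
  data.push_back(v);
}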

Answer 10

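All of these are good examples. Yet they don't answer the following:

a CSV file with varying column counts (some rows have more columns than others)
or values that contain white space (ya yb, x1 x2,, x2)

So for those who are still looking, this class: http://www.codeguru.com/cpp/tic/tic0226.shtml is pretty cool... some changes might be needed.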
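To illustrate the two cases listed above, here is a small sketch, assuming a hypothetical split_fields helper (not the class from the link), that keeps every field of a line as a string, including empty ones and ones containing spaces, and places no limit on the number of columns. It does not handle quoted fields with embedded commas.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits a line on sep and keeps every field,
// including empty fields and fields that contain spaces.
std::vector<std::string> split_fields(const std::string& line, char sep = ',') {
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    for (;;) {
        std::string::size_type end = line.find(sep, start);
        if (end == std::string::npos) {
            fields.push_back(line.substr(start));   // last (or only) field
            break;
        }
        fields.push_back(line.substr(start, end - start));
        start = end + 1;                            // continue after the separator
    }
    return fields;
}
```

Rows of different widths simply yield vectors of different sizes, so the caller can decide per row how many columns to accept.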

10 answers

Answer 1: 24, accepted, 2009-02-11 10:33:17
Answer 2: 17
Answer 3: 16, 2009-02-11 11:45:35
Answer 4: 14, 2009-02-11 09:59:21
Answer 5: 7, 2009-02-11 10:19:55
Answer 6: 4, 2009-02-11 16:01:43
Answer 7: 3, 2009-02-11 09:55:27
Answer 8: 1, 2009-10-25 14:19:26
Answer 9: 1, 2009-02-11 20:58:00
Answer 10: 1, 2011-04-19 15:15:48
