Extracting Uneven Data From Text File

I have a data file which contains values for nine different variables on each line: x, y, z, v_x, v_y, v_z, m, ID, V. I am writing a program to extract just the x, y, and z values from the data file. I am relatively new to this type of procedure, and I am running into problems because the values are not always the same length. Here is an example of a portion of the data file (only the x, y, z columns):

2501.773926 1701.783081 211.1383057
1140.961426 4583.300781 322.4959412
1194.471313 5605.764648 1377.315552
506.1424866 6037.965332 1119.67041
213.5106354 5788.785156 2340.610352
59.43727493 5914.666016 2357.921143
1223.028564 4292.818848 3007.292725
4445.61377  3684.48999  2903.169189
5649.732422 4596.819824 2661.301025
5741.396973 5503.06543  2412.082031
4806.246094 5587.194336 2676.126465
4855.521973 5482.893066 2743.014648
5190.890625 5399.349121 1549.1698

Note that in most instances each number is eleven characters long, but this is not always the case. The code that I have written is here:

#include <cmath>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

using namespace std;

// data created by Gadget2
const string gadget_data("particles_64cubed.txt");

int main()
{

cout << "GADGET2: Extracting Desired Data From ASCII File." << endl;

// declaring vectors to store the data
int bins = 135000000; // 512^3 particles = 134,217,728 particles
vector<double> x(bins), y(bins), z(bins);


// read the data file
ifstream data_file(gadget_data.c_str());
if (data_file.fail()) 
{
    cerr << "Cannot open " << gadget_data << endl;
    exit(EXIT_FAILURE);
} 
else
    cout << "Reading data file: " << gadget_data << endl;
string line;
int particles = 0;
while (getline(data_file, line)) 
{
    string x_pos = line.substr(0, 11);
    double x_val = atof(x_pos.c_str());    // atof converts string to double
    string y_pos = line.substr(12, 11);
    double y_val = atof(y_pos.c_str());
    string z_pos = line.substr(24, 11);
    double z_val = atof(z_pos.c_str());

    if (particles < bins) 
    {
        x[particles] = x_val;
        y[particles] = y_val;
        z[particles] = z_val;
        ++particles;
    }
}
data_file.close();
cout << "Stored " << particles << " particles in positions_64.dat" << endl;

vector<double> x_values, y_values, z_values;
for (int i = 0; i < particles; i++) 
{
    x_values.push_back(x[i]);
    y_values.push_back(y[i]);
    z_values.push_back(z[i]);
}

// write desired data to file
ofstream new_file("positions_64.dat");
for (int i = 0; i < x_values.size(); i++)
    new_file << x_values[i] << '\t' << y_values[i] << '\t' << z_values[i] << endl;
new_file.close();
cout << "Wrote desired data to file: " << "positions_64.dat" << endl;

}

The code obviously fails because the values do not have a constant length. Does anyone know of another method to achieve this? Perhaps something that, instead of taking substrings of a fixed number of characters, grabs each value up to the next whitespace? Any help would be appreciated. Thank you!

I noticed you are already reading the file using ifstream and getline. Why did you fall back to cutting the line into chunks of N characters and atof'ing them? iostreams can read and write integers, doubles, etc. directly, as is best seen with cin and cout.

There's an istringstream class which would easily help you:

#include <sstream>  // std::istringstream

std::istringstream input(line); // line is the std::string filled by getline()
double x, y, z;
if (input >> x >> y >> z) {
    // all three values were parsed; do something with x, y, z
} else {
    // the line did not start with three numbers; handle the error
}

It should just work, because you are already reading line by line, and because the data is separated by whitespace, which the >> operator skips by default.

FYI: istringstream (declared in the <sstream> header)
