Read space delimited file with variable number of columns in C

I have files formatted as follows:

2
4 8 4 10 6
9 6 74 

The first line is the number of rows that follow it. Each row has a different number of tokens, but all rows share the same format: one token followed by an unspecified number of pairs of tokens. I want to read the file line by line and do two things for each line:

1) Know how many tokens are in this line.

2) Assign each token to a variable, using structures similar to:

typedef struct {
  unsigned start; //start node of a graph 
  unsigned end;   // end node of a graph
  double weight;  //weight of the edge going from start to end
} edge ;

typedef struct {
  unsigned id;   // id of the node
  unsigned ne;   // number of edges adjacent to node
  edge *edges;   // array of edge to store adjacent edges of this node
} node;

Some code:

FILE *fin;
unsigned nn;
unsigned i, j;
node *nodes;

fin = fopen ("input.txt", "r");
fscanf(fin,"%u\n", &nn);

nodes = malloc(nn*sizeof(node));

for(i=0; i < nn; i++) { //loop through all the rows
/*grab the row and split in parts, let's say they are part[0], part[1]... */
/*and there are N tokens in the row*/
  nodes[i].id=part[0];
  nodes[i].ne=(N-1)/2; //number of pairs excluding first element
  nodes[i].edges=malloc( ((N-1)/2)*sizeof(edge) );
  for(j=0; j< (N-1)/2; j++){
    nodes[i].edges[j].start=part[0];
    nodes[i].edges[j].end=part[2*j+1];
    nodes[i].edges[j].weight=part[2*j+2];
  }
}

I need to figure out how to do the part commented inside the first for loop: get the number of tokens in the row, and get each of them as a single token to assign. Any ideas?

EDIT: to make things clear, each line will have one integer first, and then a variable number of pairs. I want to store the data as follows:

If the file reads

2
4 8 4 10 6 //(2 pairs)
9 6 74 //(1 pair)   

then

nn=2;

nodes[0].id=4;
nodes[0].ne=2; //(2 pairs)
//nodes[0].edges should point to an array of ne=2 elements of type edge

nodes[0].edges[0].start=4; //same as nodes[0].id
nodes[0].edges[0].end=8;
nodes[0].edges[0].weight=4;

nodes[0].edges[1].start=4; //same as nodes[0].id
nodes[0].edges[1].end=10;
nodes[0].edges[1].weight=6;

nodes[1].id=9;
nodes[1].ne=1; //(1 pair)
//nodes[1].edges should point to an array of ne=1 elements of type edge

nodes[1].edges[0].start=9; //same as nodes[1].id
nodes[1].edges[0].end=6;
nodes[1].edges[0].weight=74;

This code produces the results you described. It initializes the nested struct member edges and uses strtok(). The delimiter string passed to strtok() is " \n" (a space plus a newline), so the newline that fgets() leaves at the end of each line does not give us trouble.

Note: you have to free the memory where I have indicated, but before you do, save any results you still need from the structs, or they will be lost.

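#include <ctype.h>   //isalpha, ispunct, isspace
#include <stdio.h>   //fopen, fgets, fclose
#include <stdlib.h>  //malloc, atoi, atof
#include <string.h>  //strtok, strcpy, strlen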

typedef struct {
  unsigned start;
  unsigned end;
  double weight;
} edge ;

typedef struct {
  unsigned id;
  unsigned ne;
  edge *edges;
} node;

int GetNumPairs(char *buf);

int main(void)
{
    FILE *fp;
    char *tok;
    char lineBuf[260];
    int i=0, j=0;
    int nn; //number of nodes
    char countPairsBuf[260];

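    fp = fopen("C:\\dev\\play\\numbers.txt", "r");
    if(fp == NULL) { perror("fopen"); return 1; }
    //get first line of file for nn:
    if(fgets (lineBuf, sizeof(lineBuf), fp) == NULL) { fclose(fp); return 1; }
    nn = atoi(lineBuf);
    if(nn <= 0) { fclose(fp); return 1; } //nothing to read, or bad header line
    //create array of node with [nn] elements (a C99 variable-length array)
    node n[nn], *pN;
    pN = &n[0];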

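    //read rest of lines, (2 through end), but never more than nn of them
    i = -1;
    while(i + 1 < nn && fgets (lineBuf, sizeof(lineBuf), fp))
    {
        i++;
        //get number of items in a line
        strcpy(countPairsBuf, lineBuf);
        //keep the result in an int: ne is unsigned, so storing -1 in it
        //would wrap around to a huge positive value
        int numPairs = GetNumPairs(countPairsBuf); //number of edges (pairs)
        pN[i].ne = (numPairs > 0) ? (unsigned)numPairs : 0;
        pN[i].edges = NULL;
        if(numPairs > 0)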
        {   //allocate *edges struct element
            pN[i].edges = malloc((pN[i].ne)*sizeof(edge));
            //get first item in new line as "line token" and "start"
            tok = strtok(lineBuf, " \n");
            while(tok)
            {
                pN[i].id = atoi(tok);
                //now get rest of pairs
                for(j=0;j<pN[i].ne;j++)
                {
                    pN[i].edges[j].start = pN[i].id;
                    tok = strtok(NULL, " \n");
                    pN[i].edges[j].end = atoi(tok);
                    tok = strtok(NULL, " \n");
                    pN[i].edges[j].weight = atof(tok); //weight is a double
                }
                tok = strtok(NULL, " \n"); //should be NULL if file formatted right
            }
        }
        else  //GetNumPairs() returned -1 (or 0)
        {
            //error, file line did not contain odd number of elements   
        }

    }
    //you have to free memory here
    //but I will leave that to you
    fclose(fp);

}


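//GetNumPairs: returns the number of (end, weight) pairs on a line,
//or -1 if the line is not 1 token followed by whole pairs.
//Note: buf is modified by strtok(), so pass a copy of the line.
int GetNumPairs(char *buf)
{
    int numWords = 0;
    char *tok;

    //count whitespace-separated tokens; repeated or trailing
    //delimiters do not produce empty tokens
    for(tok = strtok(buf, " \t\r\n"); tok != NULL; tok = strtok(NULL, " \t\r\n"))
    {
        numWords++;
    }
    if(numWords == 0) return -1; //blank line
    //if odd number of "words", return number of pairs, else error
    return (((numWords-1)%2) == 0) ? ((numWords-1)/2) : (-1);
}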
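
As a rough sketch of the same idea with heap allocation and cleanup, the per-line work could be collected into one helper that splits the line, counts the tokens, and fills a node. The names parse_line and free_nodes and the 128-token limit are assumptions made for illustration; they are not part of the code above. strtoul() and strtod() are used in place of atoi() so that weights with a fractional part are kept.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
  unsigned start;
  unsigned end;
  double weight;
} edge;

typedef struct {
  unsigned id;
  unsigned ne;
  edge *edges;
} node;

/* Hypothetical helper: parse one line "id end1 w1 end2 w2 ..." into *out.
   Returns 0 on success, -1 on a malformed line or allocation failure.
   The line is modified by strtok(). */
int parse_line(char *line, node *out)
{
    char *tokens[128];   /* assumed upper bound on tokens per line */
    size_t count = 0;
    char *tok;

    for (tok = strtok(line, " \t\r\n"); tok != NULL; tok = strtok(NULL, " \t\r\n")) {
        if (count == sizeof tokens / sizeof tokens[0])
            return -1;   /* more tokens than this sketch supports */
        tokens[count++] = tok;
    }
    if (count == 0 || (count - 1) % 2 != 0)
        return -1;       /* need 1 id followed by whole pairs */

    out->id = (unsigned)strtoul(tokens[0], NULL, 10);
    out->ne = (unsigned)((count - 1) / 2);
    out->edges = NULL;
    if (out->ne > 0) {
        out->edges = malloc(out->ne * sizeof *out->edges);
        if (out->edges == NULL)
            return -1;
    }
    for (unsigned j = 0; j < out->ne; j++) {
        out->edges[j].start = out->id;
        out->edges[j].end = (unsigned)strtoul(tokens[2 * j + 1], NULL, 10);
        out->edges[j].weight = strtod(tokens[2 * j + 2], NULL);
    }
    return 0;
}

/* Hypothetical helper: release every edge array, then the node array,
   assuming the node array itself came from malloc(). */
void free_nodes(node *nodes, unsigned nn)
{
    if (nodes == NULL)
        return;
    for (unsigned i = 0; i < nn; i++)
        free(nodes[i].edges);
    free(nodes);
}
```

A caller would read each row with fgets() and pass the buffer to parse_line(lineBuf, &nodes[i]). Once the results are no longer needed, free_nodes(nodes, nn) releases everything, provided nodes was allocated with malloc() and every edges pointer is either NULL or a malloc() result.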
