How would I go about reading and separating this text file's information into arrays?

Suppose I have a text file such as:


Adam: Tall Handsome Kind Athlete
  He enjoys playing basketball

Sabrina: Short Pretty Funny Adorable Gymnast
  She loves gymnastics

Sinclair: Blonde
  He is blonde

Assume the file has several more people. Each entry has a name, then zero or more characteristics, then a new line that starts with a tab followed by a sentence. For example:

Adam: would be the name.

Tall Handsome Kind Athlete would be 4 individual characteristics.

He enjoys playing basketball would be the sentence.

I want to store this information in a structure like so:

#include <stdio.h>
#include <stdlib.h>

typedef struct People {
  char *name;
  char **characteristics;
  char *sentence;
} People;

typedef struct List {
  People **list;
  int total_ppl;
} List;

int main (void) {
  List *ppl_list = malloc(sizeof(List));
  ppl_list->list = malloc(sizeof(People *)); /* room for one People pointer */
  int i = 0;
  FILE *pf = fopen("input.txt", "r");
  if (pf == NULL) {
    printf("Unable to open the file\n");
  } else {

/*    I'm not sure how to go about reading the information from here. I was thinking about using
      fscanf but I don't know how to separate and store the Name, each Characteristic, and 
      the sentence separately. I know I will need to realloc ppl_list as more people are read from the 
      file. If I should change my structs to organize it better, please let me know.
*/

    fclose(pf);
  }
}

There is a function called strtok(): https://www.tutorialspoint.com/c_standard_library/c_function_strtok.htm
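As a rough illustration of how strtok() could split one of the name lines from the question, here is a minimal sketch. The helper name split_person_line and its parameters are made up for this example; it assumes the line has the form "Name: trait trait ..." and sits in a writable buffer, since strtok() overwrites the delimiters it finds.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original post): splits a line of the
 * form "Name: trait trait ..." in place. Stores the name in *name and up
 * to max_traits pointers into traits[]. Returns the number of traits
 * stored, or -1 if no name was found. The pointers refer to the inside
 * of line, so line must stay alive while they are used. */
int split_person_line(char *line, char **name, char *traits[], int max_traits)
{
    int count = 0;

    *name = strtok(line, ":");            /* everything before the colon */
    if (*name == NULL)
        return -1;

    /* Remaining tokens are separated by spaces, tabs or a newline. */
    char *tok = strtok(NULL, " \t\n");
    while (tok != NULL && count < max_traits) {
        traits[count++] = tok;
        tok = strtok(NULL, " \t\n");
    }
    return count;
}
```

Calling it on a buffer holding "Adam: Tall Handsome Kind Athlete" should give the name "Adam" and four traits. Because the tokens point into the original buffer, they need to be copied if the buffer is reused for the next line.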

I've never used it myself, though. The one time I had to do this, I implemented a function that split a string into a dynamically allocated array of strings (a char **). It looked for commas and produced a new string each time it found one. My code wouldn't work in your case because it was written specifically to ignore white space, but if you use spaces as delimiters instead of ignoring them, the code gets a lot simpler.

Edit: as for getting the data out of the file, I would create a generously sized buffer, say 32768 bytes, and call fgets(buffer, 32768, pf). You may want to add a check for lines that don't fit even in 32K and handle that case, though I imagine it won't be necessary.
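The truncation check mentioned above could look something like the sketch below. The helper name read_line and its return codes are assumptions made for this example: fgets() leaves no '\n' in the buffer when the line did not fit, so a missing newline before end of file is treated as "too long" and the rest of that line is discarded.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original post): reads one line into
 * buf (capacity size) and strips the trailing newline.
 * Returns  1 if a complete line was read,
 *          0 at end of file,
 *         -1 if the line was longer than the buffer; buf then holds the
 *            first size-1 characters and the rest of the line is skipped. */
int read_line(FILE *fp, char *buf, size_t size)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return 0;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 1;
    }
    if (feof(fp))
        return 1;              /* last line of the file, no newline */

    /* The buffer filled up before a newline was seen: skip the rest. */
    int c, discarded = 0;
    while ((c = fgetc(fp)) != EOF && c != '\n')
        discarded = 1;
    return discarded ? -1 : 1;
}
```

With a 32768-byte buffer the -1 case should be rare for a file like the one in the question, but checking for it avoids silently treating the tail of a long line as a new entry.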

These were the prototypes of the functions I implemented, to give you a better idea of what you'd have to code:

char        **separete_string   (char *string, char delim);
void        free_string_list    (char **list);
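The original implementation is not shown, so the following is only one possible way to fill in those two prototypes. It assumes the returned array is terminated by a NULL pointer and that empty fields (consecutive delimiters) are skipped; both are choices made for this sketch, not something stated above.

```c
#include <stdlib.h>
#include <string.h>

/* Frees a NULL-terminated list produced by separete_string(). */
void free_string_list(char **list)
{
    if (list == NULL)
        return;
    for (char **p = list; *p != NULL; p++)
        free(*p);
    free(list);
}

/* Splits string on delim and returns a NULL-terminated array of newly
 * allocated copies (assumed layout). The input string is not modified.
 * Returns NULL if an allocation fails. */
char **separete_string(char *string, char delim)
{
    size_t count = 0, cap = 4;
    char **list = malloc(cap * sizeof *list);
    if (list == NULL)
        return NULL;

    const char *p = string;
    while (*p != '\0') {
        while (*p == delim)               /* skip runs of delimiters */
            p++;
        if (*p == '\0')
            break;

        const char *start = p;
        while (*p != '\0' && *p != delim)
            p++;
        size_t len = (size_t)(p - start);

        if (count + 1 >= cap) {           /* keep one slot for the NULL */
            char **tmp = realloc(list, 2 * cap * sizeof *list);
            if (tmp == NULL) {
                list[count] = NULL;
                free_string_list(list);
                return NULL;
            }
            list = tmp;
            cap *= 2;
        }

        list[count] = malloc(len + 1);
        if (list[count] == NULL) {        /* slot is NULL, list is terminated */
            free_string_list(list);
            return NULL;
        }
        memcpy(list[count], start, len);
        list[count][len] = '\0';
        count++;
    }
    list[count] = NULL;
    return list;
}
```

For the characteristics in the question, separete_string("Tall Handsome Kind Athlete", ' ') would give four strings followed by NULL, which could be stored directly in the characteristics field of the struct.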

This answer may not be complete, but it should at least help you process the lines in the text file.

Assuming file.txt is the input file, with the following format:

Adam: Tall Handsome Kind Athlete
  He enjoys playing basketball

Sabrina: Short Pretty Funny Adorable Gymnast
  She loves gymnastics

Sinclair: Blonde
  He is blonde

We can process this file as follows:

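#define _POSIX_C_SOURCE 200809L /* for getline() and strtok_r() */
#include <string.h>
#include <strings.h>   /* POSIX string helpers such as strncasecmp() */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h> /* ssize_t */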


/*process_info_line: process the line with the name and attributes*/
int process_info_line(char *line)
{
    char *next = NULL;
    char *part = strtok_r(line, ":", &next);
    if (part)
        printf("NAME: %s\n", part);
    else
        return 0;

    while (part != NULL) {
        part = strtok_r(NULL, " ", &next);
        if (part)
            printf("ATTRIBUTE: %s\n", part);
    }

    return 0;
}

/*process_sentence: return the sentence text, skipping the leading
 * indentation (tab or spaces)*/
char *process_sentence(char *line)
{
    while (*line == ' ' || *line == '\t')
        line++;
    return line;
}

/*is_sentence: checks if the line is a sentence, given your definition:
 * it begins with a tab (or spaces) followed by some text*/
int is_sentence(char *line)
{
    char *ptr = line;

    while (*ptr == ' ' || *ptr == '\t')
        ptr++;

    /* a sentence is indented and not blank */
    return ptr != line && *ptr != '\0';
}

/*scan_file: read each of the lines of the file and use
 * the previous functions to process it.*/
int scan_file(char *filename)
{
    char *line_buf = NULL;
    size_t line_buf_size = 0;
    ssize_t line_size;
    int line_count = 0;

    FILE *fp = fopen(filename, "r");
    if (!fp) {
        fprintf(stderr, "Error opening file '%s'\n", filename);
        return 1;
    } 
    /*Get the first line of the file*/
    line_size = getline(&line_buf, &line_buf_size, fp);
    while (line_size >= 0)
    {
        line_count++;
        if (line_size > 0 && line_buf[line_size-1] == '\n')
            line_buf[line_size-1] = '\0'; /*removing '\n' */
        if (is_sentence(line_buf)) {
            printf("SENTENCE: %s\n", process_sentence(line_buf));
        } else {
            process_info_line(line_buf);
        }
        
        line_size = getline(&line_buf, &line_buf_size,fp);
    }

    // don't forget to free the line_buf used by getline
    free(line_buf);
    line_buf = NULL;
    fclose(fp);

    return 0;
}


int main(void)
{
    scan_file("file.txt");
    return 0;
}

This will generate the following output:

NAME: Adam
ATTRIBUTE: Tall
ATTRIBUTE: Handsome
ATTRIBUTE: Kind
ATTRIBUTE: Athlete
SENTENCE: He enjoys playing basketball
NAME: Sabrina
ATTRIBUTE: Short
ATTRIBUTE: Pretty
ATTRIBUTE: Funny
ATTRIBUTE: Adorable
ATTRIBUTE: Gymnast
SENTENCE: She loves gymnastics
NAME: Sinclair
ATTRIBUTE: Blonde
SENTENCE: He is blonde
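To go from printing to actually filling the structs from the question, something along these lines might work. It is only a sketch under a few assumptions: it uses standard fgets() and strtok() rather than getline() and strtok_r(), lines are assumed to fit in 1024 bytes, the characteristics array is assumed to be NULL-terminated, and the helper names (must, copy_string, new_person, read_people, free_people) are invented for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct People {
    char *name;
    char **characteristics;   /* assumption: NULL-terminated array */
    char *sentence;
} People;

typedef struct List {
    People **list;
    int total_ppl;
} List;

/* Exits on allocation failure to keep the sketch short. */
static void *must(void *p)
{
    if (p == NULL)
        exit(EXIT_FAILURE);
    return p;
}

/* Returns a heap copy of s (portable stand-in for strdup). */
static char *copy_string(const char *s)
{
    char *copy = must(malloc(strlen(s) + 1));
    strcpy(copy, s);
    return copy;
}

/* Builds a People from a "Name: trait trait ..." line (modified in place). */
static People *new_person(char *line)
{
    char *name = strtok(line, ":");
    if (name == NULL)
        return NULL;

    People *p = must(malloc(sizeof *p));
    p->name = copy_string(name);
    p->sentence = NULL;
    p->characteristics = must(malloc(sizeof *p->characteristics));
    p->characteristics[0] = NULL;

    size_t n = 0;
    for (char *t = strtok(NULL, " \t"); t != NULL; t = strtok(NULL, " \t")) {
        p->characteristics = must(realloc(p->characteristics,
                                          (n + 2) * sizeof *p->characteristics));
        p->characteristics[n++] = copy_string(t);
        p->characteristics[n] = NULL;
    }
    return p;
}

/* Reads every person from fp; indented lines become the sentence of the
 * most recently read person. */
List *read_people(FILE *fp)
{
    char line[1024];           /* assumption: no line is longer than this */
    List *ppl = must(malloc(sizeof *ppl));
    ppl->list = NULL;
    ppl->total_ppl = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';        /* strip the line ending */
        char *text = line + strspn(line, " \t");   /* skip indentation */
        if (*text == '\0')
            continue;                              /* blank line */
        if (text != line) {                        /* indented: a sentence */
            if (ppl->total_ppl > 0 && ppl->list[ppl->total_ppl - 1]->sentence == NULL)
                ppl->list[ppl->total_ppl - 1]->sentence = copy_string(text);
            continue;
        }
        People *p = new_person(line);
        if (p == NULL)
            continue;
        ppl->list = must(realloc(ppl->list,
                                 (size_t)(ppl->total_ppl + 1) * sizeof *ppl->list));
        ppl->list[ppl->total_ppl++] = p;
    }
    return ppl;
}

/* Releases everything allocated by read_people(). */
void free_people(List *ppl)
{
    for (int i = 0; i < ppl->total_ppl; i++) {
        People *p = ppl->list[i];
        for (char **c = p->characteristics; *c != NULL; c++)
            free(*c);
        free(p->characteristics);
        free(p->name);
        free(p->sentence);
        free(p);
    }
    free(ppl->list);
    free(ppl);
}
```

A caller would open the file, pass the FILE pointer to read_people(), walk ppl->list up to total_ppl, and finish with free_people(). Growing the list by one slot per person keeps the code simple; doubling the capacity instead would reduce the number of realloc calls for large files.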
