Creating a function in C that reads a file but it infinitely loops

I basically want to print out all of the lines in a file I made, but it just loops over and over back to the start, because in the function I call fseek(fp, 0, SEEK_SET);. I don't know where else I would place it to get through all the other lines; I'm going back to the start every time.

#include<stdio.h>
#include <stdlib.h>

char *freadline(FILE *fp);

int main(){

    FILE *fp = fopen("story.txt", "r");

    if(fp == NULL){
        printf("Error!");
    }else{
        char *pInput;
        while(!feof(fp)){
            pInput = freadline(fp);
            printf("%s\n", pInput); // outpu 
        }   
    }
    
    return 0;
}

char *freadline(FILE *fp){
    int i;
    for(i = 0; !feof(fp); i++){
        getc(fp);
    }
    fseek(fp, 0, SEEK_SET); // resets to origin (?)
    char *pBuffer = (char*)malloc(sizeof(char)*i);
    pBuffer = fgets(pBuffer, i, fp);

    return pBuffer;
}

This is my work so far.

Continuing from my comments, you are thinking along the correct lines; you just haven't put the pieces together in the right order. In main() you loop calling your function, allocating for a single line, outputting that line of text, and doing it over and over again (without freeing any of the memory you allocate, which leaks the memory for every line read).

If you are allocating storage to hold the lines, you will generally want to read the entire file in your function in a single pass through the file, allocating for and storing all lines, and returning a pointer to your collection of lines for printing in main() (or whatever the calling function is).

You do that by adding one additional level of indirection and having your function return char **. This is a simple two-step process where you:

  1. allocate a block of memory containing pointers (one pointer for each line). Since you will not know how many lines there are beforehand, you simply allocate some number of pointers and then realloc() more pointers when you run out;

  2. for each line you read, you allocate length + 1 characters of storage and copy the current line to that block of memory, assigning the address for the line to your next available pointer.

(You either keep track of the number of pointers and lines allocated, or provide an additional pointer set to NULL as a sentinel after the last pointer assigned a line. It is up to you, but simply keeping track with a counter is likely conceptually easier.)
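
The sentinel alternative mentioned above could look roughly like the sketch below. The helper names (make_sentinel_list, count_until_sentinel, free_sentinel_list) are hypothetical and not part of the answer's code; the only point is that one extra pointer set to NULL lets the caller find the end of the collection without a separate counter.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: builds a NULL-terminated copy of 'n' strings. */
char **make_sentinel_list (const char *const *src, size_t n)
{
    char **list = malloc ((n + 1) * sizeof *list);  /* +1 for the sentinel */
    if (!list)
        return NULL;

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen (src[i]);
        list[i] = malloc (len + 1);                 /* length + 1 for '\0' */
        if (!list[i]) {             /* on failure, free what was allocated */
            while (i--)
                free (list[i]);
            free (list);
            return NULL;
        }
        memcpy (list[i], src[i], len + 1);
    }
    list[n] = NULL;                 /* sentinel marks the end of the lines */

    return list;
}

/* Walks the list until the sentinel; no separate counter is needed. */
size_t count_until_sentinel (char **list)
{
    size_t n = 0;
    while (list[n] != NULL)
        n++;
    return n;
}

/* Frees every line, then the block of pointers itself. */
void free_sentinel_list (char **list)
{
    for (char **p = list; *p != NULL; p++)
        free (*p);
    free (list);
}
```

The trade-off is that finding the count means walking the whole list, which is why a counter is usually the simpler choice.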

After reading your last line, you simply return the pointer to the collection of pointers, which is assigned for use back in the caller. (You can also pass the address of a char ** as a parameter to your function, resulting in the type being char ***, but being a Three-Star Programmer isn't always a compliment.) There is nothing wrong with doing it that way, and in some cases it will be required, but if you have an alternative, that is generally the preferred route.

So how would this work in practice?

Simply change your function return type to char ** and pass an additional pointer to a counting variable, so you can update the value at that address with the number of lines read before you return from your function. E.g., you could do:

char **readfile (FILE *fp, size_t *n);

This takes your file pointer to an open file stream and reads each line from it, allocating storage for the line and assigning the address of that allocation to one of your pointers. Within the function, you use a sufficiently sized character array to hold each line you read with fgets(). Trim the '\n' from the end, get the length, and then allocate length + 1 bytes to hold the line. Assign the address of the newly allocated block to a pointer and copy from your array to the newly allocated block.

(strdup() can both allocate and copy, but it is not part of the standard library, it's POSIX, though most compilers support it as an extension if you provide the proper options.)

The readfile() function below puts it all together, starting with a single pointer and reallocating twice the current number when you run out. That provides a reasonable trade-off between the number of allocations needed and the number of pointers (after just 20 calls to realloc(), you would have 1M pointers). You can choose any reallocation and growth scheme you like, but you want to avoid calling realloc() for every line, because realloc() is still a relatively expensive call.
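
If strdup() is not available, the allocate-and-copy step is only a few lines in standard C. The sketch below uses a hypothetical helper name, dupstr, to avoid clashing with a library-provided strdup(); it is an illustration, not part of the answer's code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for POSIX strdup(): allocates strlen(s) + 1 bytes
 * and copies the string, including its terminator. Returns NULL if the
 * allocation fails. The caller is responsible for freeing the result.
 */
char *dupstr (const char *s)
{
    size_t len = strlen (s);
    char *copy = malloc (len + 1);

    if (copy)                           /* validate the allocation */
        memcpy (copy, s, len + 1);

    return copy;
}
```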

#define MAXC 1024       /* if you need a constant, #define one (or more) */
#define NPTR    1       /* initial no. of pointers to allocate */

/* readfile reads all lines from fp, updating the value at the address
 * provided by 'n'. On success returns pointer to allocated block of pointers
 * with each of *n pointers holding the address of an allocated block of
 * memory containing a line from the file. If the initial allocation fails,
 * NULL is returned. On a later allocation failure, the lines successfully
 * read prior to the failure are returned and *n holds their count. Caller
 * is responsible for freeing all memory when done with it.
 */
char **readfile (FILE *fp, size_t *n)
{
    char buffer[MAXC], **lines;                 /* buffer to hold each line, pointer */
    size_t allocated = NPTR, used = 0;          /* allocated and used pointers */
    
    lines = malloc (allocated * sizeof *lines); /* allocate initial pointer(s) */
    
    if (lines == NULL) {                        /* validate EVERY allocation */
        perror ("malloc-lines");
        return NULL;
    }
    
    while (fgets (buffer, MAXC, fp)) {          /* read each line from file */
        size_t len;                             /* variable to hold line-length */
        if (used == allocated) {                /* is pointer reallocation needed */
            /* always realloc to a temporary pointer to avoid memory leak if
             * realloc fails returning NULL.
             */
            void *tmp = realloc (lines, 2 * allocated * sizeof *lines);
            if (!tmp) {                         /* validate EVERY reallocation */
                perror ("realloc-lines");
                break;                          /* lines before failure still good */
            }
            lines = tmp;                        /* assign reallocated block to lines */
            allocated *= 2;                     /* update no. of allocated pointers */
        }
        buffer[(len = strcspn (buffer, "\n"))] = 0;     /* trim \n, save length */
        
        lines[used] = malloc (len + 1);         /* allocate storage for line */
        if (!lines[used]) {                     /* validate EVERY allocation */
            perror ("malloc-lines[used]");
            break;
        }
        
        memcpy (lines[used], buffer, len + 1);  /* copy buffer to lines[used] */
        used++;                                 /* increment used no. of pointers */
    }
    *n = used;              /* update value at address provided by n */
    
    /* can do final realloc() here to resize exactly to used no. of pointers */
    
    return lines;           /* return pointer to allocated block of pointers */
}
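
The "final realloc()" mentioned in the comment near the end of readfile() could be sketched as follows. The helper name shrink_lines is hypothetical; the idea is just to trim the block of pointers down to the number actually used while keeping the original block if realloc() fails.

```c
#include <stdlib.h>

/* Hypothetical helper: shrink a block of pointers so that it holds exactly
 * 'used' pointers. Returns the (possibly moved) block. If realloc() fails,
 * the original block is still valid and is returned unchanged.
 */
char **shrink_lines (char **lines, size_t used)
{
    if (used == 0)                  /* nothing to shrink to, keep as-is */
        return lines;

    void *tmp = realloc (lines, used * sizeof *lines);

    return tmp ? tmp : lines;
}
```

Inside readfile() this would amount to something like lines = shrink_lines (lines, used); just before the return.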

In main(), you simply pass your file pointer and the address of a size_t variable and check the return before iterating through the pointers, making whatever use of the lines you need (they are simply printed below), e.g.

int main (int argc, char **argv) {
    
    char **lines;       /* pointer to allocated block of pointers and lines */
    size_t n;           /* number of lines read */
    /* use filename provided as 1st argument (stdin by default) */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;

    if (!fp) {  /* validate file open for reading */
        perror ("file open failed");
        return 1;
    }
    
    lines = readfile (fp, &n);
    
    if (fp != stdin)    /* close file if not stdin */
        fclose (fp);

    if (!lines) {       /* validate readfile() return */
        fputs ("error: no lines read from file.\n", stderr);
        return 1;
    }
    
    for (size_t i = 0; i < n; i++) {            /* loop outputting all lines read */
        puts (lines[i]);
        free (lines[i]);                        /* don't forget to free lines */
    }
    free (lines);                               /* and free pointers */
    
    return 0;
}

(Note: don't forget to free the memory you allocated when you are done. That becomes critical when you are calling functions that allocate within other functions. In main(), the memory will be automatically released on exit, but build good habits.)

Example Use/Output

$ ./bin/readfile_allocate dat/captnjack.txt
This is a tale
Of Captain Jack Sparrow
A Pirate So Brave
On the Seven Seas.

The program will read any file, no matter if it has 4 lines or 400,000 lines, up to the physical limit of your system memory (adjust MAXC if your lines are longer than 1023 characters).
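
If you would rather not pick a MAXC at all, one option is to keep appending fgets() chunks to an allocated buffer until the newline shows up. The sketch below is an assumption-laden illustration, not part of the answer's code: read_long_line and CHUNK are hypothetical names, and the chunk size is deliberately small to make the growth visible.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK 64    /* assumed chunk size, small on purpose for illustration */

/* Hypothetical helper: reads one line of any length from fp into allocated
 * storage, without the trailing '\n'. Returns NULL at end of file (when
 * nothing was read) or on allocation failure. Caller frees the result.
 */
char *read_long_line (FILE *fp)
{
    char chunk[CHUNK], *line = NULL;
    size_t len = 0;

    while (fgets (chunk, sizeof chunk, fp)) {
        size_t n = strcspn (chunk, "\n");       /* chars before any newline */
        void *tmp = realloc (line, len + n + 1);

        if (!tmp) {                             /* validate EVERY reallocation */
            free (line);
            return NULL;
        }
        line = tmp;
        memcpy (line + len, chunk, n);          /* append this chunk */
        len += n;
        line[len] = 0;                          /* keep the string terminated */

        if (chunk[n] == '\n')                   /* newline found, line complete */
            break;
    }

    return line;
}
```

This calls realloc() once per chunk, so a larger CHUNK (or a doubling scheme like the one used for the pointers) would be the natural refinement for real use.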

Memory Use/Error Check

In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.

It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.

For Linux, valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use; just run your program through it.

$ valgrind ./bin/readfile_allocate dat/captnjack.txt
==4801== Memcheck, a memory error detector
==4801== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==4801== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==4801== Command: ./bin/readfile_allocate dat/captnjack.txt
==4801==
This is a tale
Of Captain Jack Sparrow
A Pirate So Brave
On the Seven Seas.
==4801==
==4801== HEAP SUMMARY:
==4801==     in use at exit: 0 bytes in 0 blocks
==4801==   total heap usage: 10 allocs, 10 frees, 5,804 bytes allocated
==4801==
==4801== All heap blocks were freed -- no leaks are possible
==4801==
==4801== For counts of detected and suppressed errors, rerun with: -v
==4801== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Always confirm that you have freed all memory you have allocated and that there are no memory errors.

The full code used for the example only adds the headers, but it is included below for completeness:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXC 1024       /* if you need a constant, #define one (or more) */
#define NPTR    1       /* initial no. of pointers to allocate */

/* readfile reads all lines from fp, updating the value at the address
 * provided by 'n'. On success returns pointer to allocated block of pointers
 * with each of *n pointers holding the address of an allocated block of
 * memory containing a line from the file. If the initial allocation fails,
 * NULL is returned. On a later allocation failure, the lines successfully
 * read prior to the failure are returned and *n holds their count. Caller
 * is responsible for freeing all memory when done with it.
 */
char **readfile (FILE *fp, size_t *n)
{
    char buffer[MAXC], **lines;                 /* buffer to hold each line, pointer */
    size_t allocated = NPTR, used = 0;          /* allocated and used pointers */
    
    lines = malloc (allocated * sizeof *lines); /* allocate initial pointer(s) */
    
    if (lines == NULL) {                        /* validate EVERY allocation */
        perror ("malloc-lines");
        return NULL;
    }
    
    while (fgets (buffer, MAXC, fp)) {          /* read each line from file */
        size_t len;                             /* variable to hold line-length */
        if (used == allocated) {                /* is pointer reallocation needed */
            /* always realloc to a temporary pointer to avoid memory leak if
             * realloc fails returning NULL.
             */
            void *tmp = realloc (lines, 2 * allocated * sizeof *lines);
            if (!tmp) {                         /* validate EVERY reallocation */
                perror ("realloc-lines");
                break;                          /* lines before failure still good */
            }
            lines = tmp;                        /* assign reallocated block to lines */
            allocated *= 2;                     /* update no. of allocated pointers */
        }
        buffer[(len = strcspn (buffer, "\n"))] = 0;     /* trim \n, save length */
        
        lines[used] = malloc (len + 1);         /* allocate storage for line */
        if (!lines[used]) {                     /* validate EVERY allocation */
            perror ("malloc-lines[used]");
            break;
        }
        
        memcpy (lines[used], buffer, len + 1);  /* copy buffer to lines[used] */
        used++;                                 /* increment used no. of pointers */
    }
    *n = used;              /* update value at address provided by n */
    
    /* can do final realloc() here to resize exactly to used no. of pointers */
    
    return lines;           /* return pointer to allocated block of pointers */
}

int main (int argc, char **argv) {
    
    char **lines;       /* pointer to allocated block of pointers and lines */
    size_t n;           /* number of lines read */
    /* use filename provided as 1st argument (stdin by default) */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;

    if (!fp) {  /* validate file open for reading */
        perror ("file open failed");
        return 1;
    }
    
    lines = readfile (fp, &n);
    
    if (fp != stdin)    /* close file if not stdin */
        fclose (fp);

    if (!lines) {       /* validate readfile() return */
        fputs ("error: no lines read from file.\n", stderr);
        return 1;
    }
    
    for (size_t i = 0; i < n; i++) {            /* loop outputting all lines read */
        puts (lines[i]);
        free (lines[i]);                        /* don't forget to free lines */
    }
    free (lines);                               /* and free pointers */
    
    return 0;
}

Let me know if you have further questions.

If you want to read a file until the last line, you can simply use getc(). It returns EOF when the end of the file is reached or when a read fails. So, after getc() returns EOF, use feof() to confirm that the end of the file was actually reached: it returns a non-zero value if so, else 0.

Example:

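#include <stdio.h>

int main(void)
{
    FILE *fp = fopen("story.txt", "r");
    if (fp == NULL) {           /* validate the file is open for reading */
        perror("story.txt");
        return 1;
    }

    int ch = getc(fp);
    while (ch != EOF)
    {
        /* display contents of file on screen */
        putchar(ch);

        ch = getc(fp);
    }

    /* getc() returned EOF: find out whether that was end of file or an error */
    if (feof(fp))
        printf("\n End of file reached.");
    else
        printf("\nError while reading!");
    fclose(fp);

    getchar();
    return 0;
}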

You can also use fgets(); an example follows:

#include <stdio.h>
#include <string.h>

#define MAX_LEN 256

int main(void)
{
    FILE *fp = fopen("story.txt", "r");
    if (fp == NULL) {
        perror("Failed: "); // prints a descriptive error message to stderr.
        return 1;
    }

    char buffer[MAX_LEN];
    // fgets reads at most MAX_LEN - 1 characters and always adds the
    // null terminator, so the full buffer size can be passed
    while (fgets(buffer, MAX_LEN, fp))
    {
        // Remove trailing newline
        buffer[strcspn(buffer, "\n")] = 0;
        printf("%s\n", buffer);
    }

    fclose(fp);
    return 0;
}

Alternatively:

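#include <stdio.h>

#define MAX_LEN 255

int main(void) {

    char buffer[MAX_LEN];

    FILE *fp = fopen("story.txt", "r");
    if (fp == NULL) {           /* validate the file is open for reading */
        perror("story.txt");
        return 1;
    }

    while (fgets(buffer, MAX_LEN, fp)) {
        printf("%s", buffer);   /* buffer already ends with the newline */
    }

    fclose(fp);
    return 0;
}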

For more details, refer to "C read file line by line".

Your approach is wrong and it's very likely that it will generate an endless loop. I'll explain why, using the original code and inline comments:

char *freadline(FILE *fp){
    int i;

    // This part attempts to count the number of characters
    // in the whole file by reading char-by-char until EOF is set
    for(i = 0; !feof(fp); i++){
        getc(fp);
    }

    // Here EOF is set

    // This returns to the start and clears EOF
    fseek(fp, 0, SEEK_SET);

    // Here EOF is cleared

    char *pBuffer = (char*)malloc(sizeof(char)*i);

    // Here you read a line, i.e. you read characters until (and including) the
    // first newline in the file.
    pBuffer = fgets(pBuffer, i, fp);

    // Here EOF is still cleared as you only read the first line of the file
    
    return pBuffer;
}

So in main when you do

while(!feof(fp)){
    ...
}

you have an endless loop, because feof(fp) is always false at that point. Your program will print the same line again and again, and you have memory leaks because you never call free(pInput).

So you need to redesign your code. Read what fgets does, e.g. here: https://man7.org/linux/man-pages/man3/fgets.3p.html
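
The key step in that reasoning, that fseek() clears the end-of-file indicator, can be checked with a small sketch. The function name fseek_clears_eof is hypothetical and exists only for this demonstration.

```c
#include <stdio.h>

/* Hypothetical demo: returns 1 if the end-of-file indicator was set after
 * reading the whole stream and was cleared again by fseek(), 0 otherwise.
 */
int fseek_clears_eof (FILE *fp)
{
    while (getc (fp) != EOF)            /* consume the whole stream */
        ;

    int set_before = feof (fp) != 0;    /* indicator is set at this point */

    if (fseek (fp, 0, SEEK_SET) != 0)   /* back to the start of the file */
        return 0;

    int set_after = feof (fp) != 0;     /* indicator has been cleared */

    return set_before && !set_after;
}
```

Because every call to the original freadline() ends in this state, the while(!feof(fp)) test in main never sees the indicator set.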

A number of issues to address:

  • Using fgets does not guarantee that you have read a whole line when the function returns. So if you really want to check whether you've read a complete line, check the number of characters in the returned string, and also check for the presence of a new-line character at the end of the string.

  • Your use of fseek is interesting here, because what it does is tell the stream to go back to the start of the file and start reading from there. This means that after the first time the freadline function is called, you will keep reading from the first byte of the file each time.

  • Lastly, your program is hoarding memory like a greedy baby! You never free any of the allocations you made!
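
The complete-line check from the first point can be sketched as a small helper. The name is_complete_line is hypothetical; it only inspects a buffer that fgets has already filled.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if buf (as filled by fgets) holds a
 * complete line, i.e. it ends with '\n'. Returns 0 if the line was cut off
 * because the buffer was too small, or if the file ended without a final
 * newline.
 */
int is_complete_line (const char *buf)
{
    size_t len = strlen (buf);

    return len > 0 && buf[len - 1] == '\n';
}
```

When it returns 0 and the end of the file has not been reached, the rest of the line is still waiting in the stream and the next fgets call will pick up where the previous one stopped.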

With that being said, here is an improved freadline implementation:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *freadline(FILE *fp) {
    /* initializations */
    char    buf[BUFSIZ + 1];
    char   *pBuffer;
    size_t  size = 0, tmp_size;

    /* fgets returns NULL when it reaches EOF, so our loop is conditional
     * on that
     */
    while (fgets (buf, BUFSIZ + 1, fp) != NULL) {
        tmp_size = strlen (buf);
        size += tmp_size;
        /* stop once the buffer was not filled or the chunk ends the line;
         * buf[BUFSIZ] is the terminator, so the last character read is at
         * buf[BUFSIZ - 1]
         */
        if (tmp_size != BUFSIZ || buf[BUFSIZ - 1] == '\n')
            break;
    }

    /* after breaking from loop, check that size is not zero.
     * this should only happen if we reach EOF, so return NULL
     */
    if (size == 0)
        return NULL;

    /* Allocate memory for the line plus one extra for the null byte */
    pBuffer = malloc (size + 1);
    if (pBuffer == NULL)
        return NULL;

    /* reads the contents of the file into pBuffer */
    if (size <= BUFSIZ) {
        /* Optimization: use memcpy rather than reading 
         * from disk if the line is small enough
         */
        memcpy (pBuffer, buf, size);
    } else {
        /* line spans several chunks: seek back and read it in one go
         * (assumes a seekable stream where byte offsets are valid)
         */
        if (fseek (fp, ftell(fp) - (long)size, SEEK_SET) != 0 ||
            fread (pBuffer, 1, size, fp) != size) {
            free (pBuffer);
            return NULL;
        }
    }
    pBuffer[size] = '\0'; /* set the null terminator byte */
    return pBuffer; /* remember to free () this when you are done! */
}

This way there is no need for the additional call to feof (which is often hit and miss); instead it relies on what fgets returns to determine whether we have reached the end of the file.

With these changes, it should be enough to change main to:

int main() {

    FILE *fp = fopen("story.txt", "r");

    if(fp == NULL){
        printf("Error!");
    }else{
        char *pInput;
        /* here we just keep reading until a NULL string is returned */
        for (pInput = freadline(fp); pInput != NULL; pInput = freadline(fp)) {
            printf("%s", pInput); // output
            free (pInput);
        }
        fclose(fp);
    }
    return 0;
}
