How can I read a matrix text file into a linked list in C using pointers?

I have to read a text file of letters into a matrix of linked nodes, where each letter must have 8 pointers to the letters around it.

Here is what I have to do: [image not included]

The text file is this:

JDCPCPXOAA
ZXVOVXFRVV
NDLEIRBIEA
YTRQOMOIIO
FZZAPXERTQ
XAUEOEOOTO
PORTUOAZLZ
CZNOQUPUOP

In my code I can only read the letters in the first line.

Can someone help me?

typedef struct letter           /// structure for each letter of the soup
{
    char *lname;
    struct letter *pnext;
}LETTER;

typedef struct soup         /// structure for the letter soup
{
    int lin;
    int col;
    LETTER *pfirst;
}SOUP;

void read_soup_txt(SOUP *pcs,char *fn,int lin,int col)
{
  FILE *fp;
  fp=fopen(fn,"r");
  char c;
  if(fp!=NULL)
  {
    pcs->lin=lin;
    pcs->col=col;
    LETTER *current=malloc(sizeof(LETTER)),*previous;
    pcs->pfirst=current;

    for(int i=0;i<pcs->lin;i++)     /// rows
    {
      for(int j=0;j<pcs->col;j++)     /// columns
      {
        fscanf(fp,"%c",&c);                     /// read the char
        current->lname=malloc(sizeof(char));       /// allocate space for the char
        strcpy(current->lname,&c);              /// copy the char into the structure
        previous=current;

        if(i==pcs->lin-1)
        {
            current=NULL;
        }
        else
            current=malloc(sizeof(LETTER));
        previous->pnext=current;
      }
    }
  }
  else
    printf("Erro ao abrir arquivo!");
  fclose(fp);
}

each letter must have 8 pointers around it.

That means your letter structure should be something like

struct letter {
    struct letter  *up_left;
    struct letter  *up;
    struct letter  *up_right;
    struct letter  *left;
    struct letter  *right;
    struct letter  *down_left;
    struct letter  *down;
    struct letter  *down_right;
    int             letter;
};

You don't need the letter soup either. Because you read the characters in order, you can read them directly into the graph. The trick is that you'll want to keep one struct letter pointer to the top left letter in the graph; one struct letter pointer to the first letter on each row; and one struct letter pointer for each new letter you add.

Here is the logic in pseudocode:

Function ReadGraph(input):

    Let  topleft  = NULL     # Top left letter in the graph
    Let  leftmost = NULL     # Leftmost letter in current line
    Let  previous = NULL     # Previous letter in current line
    Let  current  = NULL     # Current letter
    Let  letter = ''

    Do:
        Read next letter from input
    While (letter is not a letter nor EOF)
    If letter is EOF:
        # No letters at all in the input, so no graph either.
        Return NULL
    End If

    topleft = new struct letter (letter)
    leftmost = topleft
    current = topleft

    # Row loop. First letter is already in current.
    Loop:

        # Loop over letters in the current line
        Loop:
            Read new letter from input
            If letter is EOF, or newline:
                Break
            End If

            previous = current
            current = new struct letter (letter)

            current->left = previous
            previous->right = current

            If current->left->up is not NULL:
                current->up_left = current->left->up
                current->up_left->down_right = current

                If current->up_left->right is not NULL:
                    current->up = current->up_left->right
                    current->up->down = current

                    If current->up->right is not NULL:
                        current->up_right = current->up->right
                        current->up_right->down_left = current
                    End If
                End If
            End If

        End Loop

        If letter is not EOF:
            While (letter is not EOF) and (letter is not a letter):
                Read new letter from input
            End While
        End If
        If letter is EOF:
            Break
        End If

        # We have a first letter on a new line.
        current = new struct letter (letter)

        current->up = leftmost
        leftmost->down = current

        If current->up->right is not NULL:
            current->up_right = current->up->right
            current->up_right->down_left = current
        End If

        leftmost = current

    End Loop

    Return topleft
End Function
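As a rough illustration, the pseudocode above might translate to C along the following lines. This is only a sketch under some assumptions: the names read_graph() and new_letter() are chosen here for illustration, input is read byte by byte with getc(), and '\r' is skipped so that files with CRLF line endings do not produce extra nodes (the pseudocode does not mention that case).

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

struct letter {
    struct letter *up_left, *up, *up_right;
    struct letter *left, *right;
    struct letter *down_left, *down, *down_right;
    int            letter;
};

/* Allocate one letter with all eight links set to NULL. */
static struct letter *new_letter(int c)
{
    struct letter *one = malloc(sizeof *one);
    if (!one) {
        fprintf(stderr, "new_letter(): Out of memory.\n");
        exit(EXIT_FAILURE);
    }
    *one = (struct letter){ .letter = c };  /* unnamed members become NULL */
    return one;
}

/* Read a grid of letters from 'in'; return the top left letter, or NULL. */
struct letter *read_graph(FILE *in)
{
    struct letter *topleft, *leftmost, *current, *previous;
    int c;

    /* Skip everything before the first letter. */
    do {
        c = getc(in);
    } while (c != EOF && !isalpha(c));
    if (c == EOF)
        return NULL;                /* no letters, so no graph */

    topleft = leftmost = current = new_letter(c);

    for (;;) {
        /* Rest of the current line; its first letter is in 'current'. */
        while ((c = getc(in)) != EOF && c != '\n') {
            if (c == '\r')
                continue;           /* assumption: tolerate CRLF input */
            previous = current;
            current = new_letter(c);
            current->left = previous;
            previous->right = current;

            if (previous->up) {
                current->up_left = previous->up;
                current->up_left->down_right = current;
                if (current->up_left->right) {
                    current->up = current->up_left->right;
                    current->up->down = current;
                    if (current->up->right) {
                        current->up_right = current->up->right;
                        current->up_right->down_left = current;
                    }
                }
            }
        }

        /* Find the first letter of the next line, if there is one. */
        while (c != EOF && !isalpha(c))
            c = getc(in);
        if (c == EOF)
            break;

        current = new_letter(c);
        current->up = leftmost;
        leftmost->down = current;
        if (leftmost->right) {
            current->up_right = leftmost->right;
            current->up_right->down_left = current;
        }
        leftmost = current;
    }

    return topleft;
}
```

A caller would open the file with fopen(), pass the stream to read_graph(), and keep the returned pointer as the handle to the whole graph.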

Note how the first character in the input stream is handled differently (at the very beginning), and how the first character on each subsequent line is handled differently (near the end of the function). This may feel logically or structurally odd, but doing it this way keeps the code simple.

Also note how the bidirectional links are constructed. Because we read from top to bottom, left to right, we establish the link left first, then up-left, then up, then up-right; with the backwards link immediately after the forward link.

This requires a bit of thought, to understand why it works. Consider:

  up_left │  up  │   up_right
──────────┼──────┼───────────
     left │ curr │      right
──────────┼──────┼───────────
down_left │ down │ down_right

When we are constructing curr, we know whether left exists or not, because we handle the first letter on each line separately.

If curr->left is non-NULL, and curr->left->up is non-NULL, we know there was a previous line, and we can set curr->up_left to curr->left->up. Its ->down_right should point back to curr, of course, for the links to be consistent.

If curr->up_left is non-NULL, and curr->up_left->right is non-NULL, we know the previous line had a letter in the same column. We can set curr->up to point to it, and its ->down to point back to curr.

If curr->up is non-NULL, and curr->up->right is non-NULL, we know the previous line had a letter in the next column. We can set curr->up_right to point to it, and its ->down_left to point back to curr.

Now, because we read each line from left to right, all columns on each line are filled up to the rightmost column. If you proceed using the above logic, you'll find that the second line fills in the rest of the links from the first line's letters to the second line's letters, and so on.
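One way to convince yourself that the finished links are usable is to walk the graph with only two of the eight pointers. The sketch below prints the grid back out, following ->down from the top left letter to reach each row and ->right to walk along it. The function name print_grid() is made up for this example.

```c
#include <stdio.h>

struct letter {
    struct letter *up_left, *up, *up_right;
    struct letter *left, *right;
    struct letter *down_left, *down, *down_right;
    int            letter;
};

/* Print the grid row by row, starting from the top left letter.
   Only ->down (to reach each row) and ->right (to walk along it) are used.
   Returns the number of letters printed. */
size_t print_grid(const struct letter *topleft, FILE *out)
{
    size_t count = 0;

    for (const struct letter *row = topleft; row != NULL; row = row->down) {
        for (const struct letter *one = row; one != NULL; one = one->right) {
            fputc(one->letter, out);
            count++;
        }
        fputc('\n', out);
    }

    return count;
}
```

If the output matches the input file, at least the ->right and ->down links are in place; the diagonal links need a separate check.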

It also means that if the input file contained a special character, say '*' for a non-letter node, you should create those nodes while constructing the graph, just as if they were ordinary letters, to ensure the above linking logic works.

After the entire graph is read, you can then remove those non-letter nodes from the graph, one by one. To remove a node, you first set the back links to it (from its neighboring letters) to NULL, then free() it.
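A minimal sketch of that removal step could look like this. The name remove_letter() is hypothetical, and the function assumes the links are consistent, so that each neighbor's link back to the node is simply the opposite direction.

```c
#include <stdlib.h>

struct letter {
    struct letter *up_left, *up, *up_right;
    struct letter *left, *right;
    struct letter *down_left, *down, *down_right;
    int            letter;
};

/* Detach 'one' from all of its neighbors, then free it.
   The caller must not use 'one' afterwards, and must update its own
   pointer first if 'one' happens to be the top left letter. */
void remove_letter(struct letter *one)
{
    if (!one)
        return;

    /* Clear each neighbor's link back to 'one' (the opposite direction). */
    if (one->up_left)    one->up_left->down_right = NULL;
    if (one->up)         one->up->down            = NULL;
    if (one->up_right)   one->up_right->down_left = NULL;
    if (one->left)       one->left->right         = NULL;
    if (one->right)      one->right->left         = NULL;
    if (one->down_left)  one->down_left->up_right = NULL;
    if (one->down)       one->down->up            = NULL;
    if (one->down_right) one->down_right->up_left = NULL;

    free(one);
}
```

Note that after a removal, a walk along ->right or ->down stops at the gap, so code that traverses the graph has to expect NULL links in the middle of a row or column.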

I personally "poison" the structure before free()ing it, setting letter to a known impossible value (EOF, or WEOF for wide input), and all links to NULL, so that if some other code uses the structure after it was freed (which would be a use-after-free bug), for example because it cached the pointer somehow, it is easier to detect.

(When you free() a pointer, the C library usually does not return it immediately to the operating system, or clear it; usually, the dynamically allocated region is just added to the internal free heap, so that a future allocation can just reuse that memory. Unfortunately, it means that if you do not "poison" freed structures, sometimes they can still be accessible afterwards. Such use-after-free bugs are very annoying, and it is definitely worth the "unnecessary work" of poisoning the structures just to help debugging those.)

To facilitate the poisoning, and also to make it easy to remove the poisoning if it turns out to be an unnecessary slowdown at some point, it is best to use helper functions for creating and destroying the structures:

static inline struct letter *new_letter(const int letter)
{
    struct letter *one;

    one = malloc(sizeof *one);
    if (!one) {
        fprintf(stderr, "new_letter(): Out of memory.\n");
        exit(EXIT_FAILURE);
    }

    one->up_left    = NULL;
    one->up         = NULL;
    one->up_right   = NULL;
    one->left       = NULL;
    one->right      = NULL;
    one->down_left  = NULL;
    one->down       = NULL;
    one->down_right = NULL;

    one->letter = letter;

    return one;
}

static inline void free_letter(struct letter *one)
{
    if (one) {
        one->up_left    = NULL;
        one->up         = NULL;
        one->up_right   = NULL;
        one->left       = NULL;
        one->right      = NULL;
        one->down_left  = NULL;
        one->down       = NULL;
        one->down_right = NULL;
        one->letter     = EOF;
        free(one);
    }
}

I usually include these functions in the header file that defines struct letter; because they are then tiny macro-like functions, I mark them static inline, telling the C compiler that they only need to be accessible in the same compilation unit, and that it does not need to generate the functions and calls to those functions, but can inline the code wherever they are called.


Personally, I wrote and verified the above pseudocode using

#include <stdlib.h>
#include <locale.h>
#include <wchar.h>
#include <stdio.h>

struct letter {
    struct letter  *chain;  /* Internal chain of all known letters */

    struct letter  *up_left;
    struct letter  *up;
    struct letter  *up_right;
    struct letter  *left;
    struct letter  *right;
    struct letter  *down_left;
    struct letter  *down;
    struct letter  *down_right;

    wint_t          letter;
};

static struct letter *all_letters = NULL;

struct letter *new_letter(wint_t letter)
{
    struct letter *one;

    one = malloc(sizeof *one);
    if (!one) {
        fprintf(stderr, "new_letter(): Out of memory.\n");
        exit(EXIT_FAILURE);
    }

    one->letter = letter;

    one->chain = all_letters;
    all_letters = one;

    one->up_left    = NULL;
    one->up         = NULL;
    one->up_right   = NULL;
    one->left       = NULL;
    one->right      = NULL;
    one->down_left  = NULL;
    one->down       = NULL;
    one->down_right = NULL;

    return one;
}

I prefer to use wide input, because in conforming operating systems you can use any glyphs your locale treats as letters, not just ASCII A-Z. All you need to do is have

    if (!setlocale(LC_ALL, ""))
        fprintf(stderr, "Warning: Current locale is not supported by your C library.\n");
    if (fwide(stdin, 1) < 1)
        fprintf(stderr, "Warning: Wide standard input is not supported by your C library for current locale.\n");
    if (fwide(stdout, 1) < 1)
        fprintf(stderr, "Warning: Wide standard output is not supported by your C library for current locale.\n");

at the start of your main(), and use the wide I/O functions (fwprintf(), fgetwc(), and so on), assuming you have a standard C environment. (Apparently, some Windows users have issues with UTF-8 support in Windows. Complain to Microsoft; the above behaviour is per the C standard.)
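With wide input, the "skip until a letter or end of input" step from the pseudocode might be sketched as below. The helper name next_wide_letter() is invented for this example; it assumes the locale has already been set as shown above, and uses iswalpha() from <wctype.h> to decide what counts as a letter.

```c
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>

/* Return the next wide character that the current locale classifies as a
   letter, or WEOF at end of input. Everything else is skipped. */
wint_t next_wide_letter(FILE *in)
{
    wint_t wc;

    do {
        wc = fgetwc(in);
    } while (wc != WEOF && !iswalpha(wc));

    return wc;
}
```

The return type is wint_t rather than wchar_t so that WEOF can be told apart from every valid character, the same way getc() returns an int to make room for EOF.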

The chain member is used to link all created letters into a single linked list, so that we can use a function (below) to draw the entire graph in Graphviz Dot language. (Graphviz is available for all operating systems, and in my opinion, is an excellent tool when developing or debugging code that uses linked lists or graphs.) The circo utility seems to be quite good at drawing such graphs, too.

int letter_graph(FILE *out)
{
    struct letter  *one;

    /* Sanity check. */
    if (!out || ferror(out))
        return -1;

    /* Wide output. */
    if (fwide(out, 1) < 1)
        return -1;

    fwprintf(out, L"digraph {\n");
    for (one = all_letters; one != NULL; one = one->chain) {
        fwprintf(out, L"    \"%p\" [ label=\"%lc\" ];\n",
                      (void *)one, one->letter);
        if (one->up_left)
            fwprintf(out, L"    \"%p\" -> \"%p\" [ label=\"↖\" ];\n",
                          (void *)one, (void *)(one->up_left));
        if (one->up)
            fwprintf(out, L"    \"%p\" -> \"%p\" [ label=\"↑\" ];\n",
                          (void *)one, (void *)(one->up));
        if (one->up_right)
            fwprintf(out, L"    \"%p\" -> \"%p\" [ label=\"↗\" ];\n",
                          (void *)one, (void *)(one->up_right));
        if (one->left)
            fwprintf(out, L"    \"%p\" -> \"%p\" [ label=\"←\" ];\n",
                          (void *)one, (void *)(one->left));
        if (one->right)
            fwprintf(out, L"    \"%p\" -> \"%p\" [ label=\"→\" ];\n",
                          (void *)one, (void *)(one->right));
        if (one->down_left)
            fwprintf(out, L"    \"%p\" -> \"%p\" [ label=\"↙\" ];\n",
                          (void *)one, (void *)(one->down_left));
        if (one->down)
            fwprintf(out, L"    \"%p\" -> \"%p\" [ label=\"↓\" ];\n",
                          (void *)one, (void *)(one->down));
        if (one->down_right)
            fwprintf(out, L"    \"%p\" -> \"%p\" [ label=\"↘\" ];\n",
                         (void *)one, (void *)(one->down_right));
    }
    fwprintf(out, L"}\n");

    return 0;
}

If the input file is

ABC
DEF
GHI

the Dot description of the graph is

digraph {
    "0x1c542f0" [ label="I" ];
    "0x1c542f0" -> "0x1c54170" [ label="↖" ];
    "0x1c542f0" -> "0x1c541d0" [ label="↑" ];
    "0x1c542f0" -> "0x1c54290" [ label="←" ];
    "0x1c54290" [ label="H" ];
    "0x1c54290" -> "0x1c54110" [ label="↖" ];
    "0x1c54290" -> "0x1c54170" [ label="↑" ];
    "0x1c54290" -> "0x1c541d0" [ label="↗" ];
    "0x1c54290" -> "0x1c54230" [ label="←" ];
    "0x1c54290" -> "0x1c542f0" [ label="→" ];
    "0x1c54230" [ label="G" ];
    "0x1c54230" -> "0x1c54110" [ label="↑" ];
    "0x1c54230" -> "0x1c54170" [ label="↗" ];
    "0x1c54230" -> "0x1c54290" [ label="→" ];
    "0x1c541d0" [ label="F" ];
    "0x1c541d0" -> "0x1c54050" [ label="↖" ];
    "0x1c541d0" -> "0x1c540b0" [ label="↑" ];
    "0x1c541d0" -> "0x1c54170" [ label="←" ];
    "0x1c541d0" -> "0x1c54290" [ label="↙" ];
    "0x1c541d0" -> "0x1c542f0" [ label="↓" ];
    "0x1c54170" [ label="E" ];
    "0x1c54170" -> "0x1c53ff0" [ label="↖" ];
    "0x1c54170" -> "0x1c54050" [ label="↑" ];
    "0x1c54170" -> "0x1c540b0" [ label="↗" ];
    "0x1c54170" -> "0x1c54110" [ label="←" ];
    "0x1c54170" -> "0x1c541d0" [ label="→" ];
    "0x1c54170" -> "0x1c54230" [ label="↙" ];
    "0x1c54170" -> "0x1c54290" [ label="↓" ];
    "0x1c54170" -> "0x1c542f0" [ label="↘" ];
    "0x1c54110" [ label="D" ];
    "0x1c54110" -> "0x1c53ff0" [ label="↑" ];
    "0x1c54110" -> "0x1c54050" [ label="↗" ];
    "0x1c54110" -> "0x1c54170" [ label="→" ];
    "0x1c54110" -> "0x1c54230" [ label="↓" ];
    "0x1c54110" -> "0x1c54290" [ label="↘" ];
    "0x1c540b0" [ label="C" ];
    "0x1c540b0" -> "0x1c54050" [ label="←" ];
    "0x1c540b0" -> "0x1c54170" [ label="↙" ];
    "0x1c540b0" -> "0x1c541d0" [ label="↓" ];
    "0x1c54050" [ label="B" ];
    "0x1c54050" -> "0x1c53ff0" [ label="←" ];
    "0x1c54050" -> "0x1c540b0" [ label="→" ];
    "0x1c54050" -> "0x1c54110" [ label="↙" ];
    "0x1c54050" -> "0x1c54170" [ label="↓" ];
    "0x1c54050" -> "0x1c541d0" [ label="↘" ];
    "0x1c53ff0" [ label="A" ];
    "0x1c53ff0" -> "0x1c54050" [ label="→" ];
    "0x1c53ff0" -> "0x1c54110" [ label="↓" ];
    "0x1c53ff0" -> "0x1c54170" [ label="↘" ];
}

(It is in reverse order because I insert each new letter at the beginning of the linked list.) circo draws the following graph from that:

[Image: 3×3 letter grid with 8-way links]

During development, I also check if the linkage is consistent:

    for (one = all_letters; one != NULL; one = one->chain) {

        if (one->up_left && one->up_left->down_right != one)
            fprintf(stderr, "'%c'->up_left is broken!\n", one->letter);
        if (one->up && one->up->down != one)
            fprintf(stderr, "'%c'->up is broken!\n", one->letter);
        if (one->up_right && one->up_right->down_left != one)
            fprintf(stderr, "'%c'->up_right is broken!\n", one->letter);
        if (one->left && one->left->right != one)
            fprintf(stderr, "'%c'->left is broken!\n", one->letter);
        if (one->right && one->right->left != one)
            fprintf(stderr, "'%c'->right is broken!\n", one->letter);
        if (one->down_left && one->down_left->up_right != one)
            fprintf(stderr, "'%c'->down_left is broken!\n", one->letter);
        if (one->down && one->down->up != one)
            fprintf(stderr, "'%c'->down is broken!\n", one->letter);
        if (one->down_right && one->down_right->up_left != one)
            fprintf(stderr, "'%c'->down_right is broken!\n", one->letter);
    }

By consistent linkage, I mean that if a->left == b, then b->right == a. Of course, the check cannot tell if a->left or b->right is wrong; it can only detect if they are consistent or not.
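If you want the same check without the internal chain, it can be packaged per node. The following is only a sketch, with the hypothetical name broken_links(); it counts how many of a node's links are not mirrored by the neighbor they point to.

```c
#include <stddef.h>

struct letter {
    struct letter *up_left, *up, *up_right;
    struct letter *left, *right;
    struct letter *down_left, *down, *down_right;
    int            letter;
};

/* Count the links of 'one' whose target does not point back to 'one'.
   Returns 0 when every existing link is mirrored by its neighbor. */
int broken_links(const struct letter *one)
{
    int broken = 0;

    if (one->up_left    && one->up_left->down_right != one) broken++;
    if (one->up         && one->up->down            != one) broken++;
    if (one->up_right   && one->up_right->down_left != one) broken++;
    if (one->left       && one->left->right         != one) broken++;
    if (one->right      && one->right->left         != one) broken++;
    if (one->down_left  && one->down_left->up_right != one) broken++;
    if (one->down       && one->down->up            != one) broken++;
    if (one->down_right && one->down_right->up_left != one) broken++;

    return broken;
}
```

A nonzero result only says that the two ends of some link disagree; deciding which end is wrong still takes a look at how the graph was built.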
