
How can I read a matrix text file into a linked list in C using pointers?

I have to read a text file of letters using matrix linked lists, where each letter must have 8 pointers around it.

Here is what I have to do: [image]

The text file is this:

JDCPCPXOAA
ZXVOVXFRVV
NDLEIRBIEA
YTRQOMOIIO
FZZAPXERTQ
XAUEOEOOTO
PORTUOAZLZ
CZNOQUPUOP

In my code I can only read the letters in the first line.

Can someone help me?

typedef struct letter           ///structure for each letter of the soup
{
    char *lname;
    struct letter *pnext;
}LETTER;

typedef struct soup         ///structure for the letter soup
{
    int lin;
    int col;
    LETTER *pfirst;
}SOUP;

void read_soup_txt(SOUP *pcs,char *fn,int lin,int col)
{
  FILE *fp;
  fp=fopen(fn,"r");
  char c;
  if(fp!=NULL)
  {
    pcs->lin=lin;
    pcs->col=col;
    LETTER *current=malloc(sizeof(LETTER)),*previous;
    pcs->pfirst=current;

    for(int i=0;i<pcs->lin;i++)     ///rows
    {
      for(int j=0;j<pcs->col;j++)     ///columns
      {
        fscanf(fp,"%c",&c);                     ///read the char
        current->lname=malloc(sizeof(char));       ///allocate space for the char
        strcpy(current->lname,&c);              ///copy the char into the structure
        previous=current;

        if(i==pcs->lin-1)
        {
            current=NULL;
        }
        else
            current=malloc(sizeof(LETTER));
        previous->pnext=current;
      }
    }
  }
  else
    printf("Erro ao abrir arquivo!");
  fclose(fp);
}

each letter must have 8 pointers around it.

That means your letter structure should be something like

struct letter {
    struct letter  *up_left;
    struct letter  *up;
    struct letter  *up_right;
    struct letter  *left;
    struct letter  *right;
    struct letter  *down_left;
    struct letter  *down;
    struct letter  *down_right;
    int             letter;
};

You don't need the letter soup either. Because you read the characters in order, you can read them directly into the graph. The trick is to keep one struct letter pointer to the top-left letter in the graph, one struct letter pointer to the first letter of the current row, and one struct letter pointer for each new letter you add.

Here is the logic in pseudocode:

Function ReadGraph(input):

    Let  topleft  = NULL     # Top left letter in the graph
    Let  leftmost = NULL     # Leftmost letter in current line
    Let  previous = NULL     # Previous letter in current line
    Let  current  = NULL     # Current letter
    Let  letter = ''

    Do:
        Read next letter from input
    While (letter is neither a letter nor EOF)
    If letter is EOF:
        # No letters at all in the input, so no graph either.
        Return NULL
    End If

    topleft = new struct letter (letter)
    leftmost = topleft
    current = topleft

    # Row loop. First letter is already in current.
    Loop:

        # Loop over letters in the current line
        Loop:
            Read new letter from input
            If letter is EOF, or newline:
                Break
            End If

            previous = current
            current = new struct letter (letter)

            current->left = previous
            previous->right = current

            If current->left->up is not NULL:
                current->up_left = current->left->up
                current->up_left->down_right = current

                If current->up_left->right is not NULL:
                    current->up = current->up_left->right
                    current->up->down = current

                    If current->up->right is not NULL:
                        current->up_right = current->up->right
                        current->up_right->down_left = current
                    End If
                End If
            End If

        End Loop

        If letter is not EOF:
            While (letter is not EOF) and (letter is not a letter):
                Read new letter from input
            End While
        End If
        If letter is EOF:
            Break
        End If

        # We have a first letter on a new line.
        current = new struct letter (letter)

        current->up = leftmost
        leftmost->down = current

        If current->up->right is not NULL:
            current->up_right = current->up->right
            current->up_right->down_left = current
        End If

        leftmost = current

    End Loop

    Return topleft
End Function
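
For reference, the pseudocode above might translate to C roughly as follows. This is only a sketch under a few assumptions: it uses plain (narrow) characters with fgetc() and isalpha(), the names read_graph() and new_node() are made up for this illustration, and every character on a line other than the newline becomes a node, as in the pseudocode.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

struct letter {
    struct letter *up_left, *up, *up_right;
    struct letter *left, *right;
    struct letter *down_left, *down, *down_right;
    int            letter;
};

/* Hypothetical helper: allocate a node with all eight links set to NULL. */
static struct letter *new_node(int c)
{
    struct letter *one = malloc(sizeof *one);
    if (!one) {
        fprintf(stderr, "new_node(): Out of memory.\n");
        exit(EXIT_FAILURE);
    }
    *one = (struct letter){ .letter = c };  /* unnamed members become NULL */
    return one;
}

/* Read a grid of letters from 'in'. Returns the top-left node,
   or NULL if the input contains no letters at all. */
struct letter *read_graph(FILE *in)
{
    struct letter *topleft, *leftmost, *previous, *current;
    int c;

    /* Skip everything before the first letter. */
    do {
        c = fgetc(in);
    } while (c != EOF && !isalpha(c));
    if (c == EOF)
        return NULL;

    topleft = leftmost = current = new_node(c);

    for (;;) {
        /* Rest of the current line; its first letter is already in 'current'. */
        for (;;) {
            c = fgetc(in);
            if (c == EOF || c == '\n')
                break;

            previous = current;
            current = new_node(c);
            current->left = previous;
            previous->right = current;

            if (current->left->up) {
                current->up_left = current->left->up;
                current->up_left->down_right = current;
                if (current->up_left->right) {
                    current->up = current->up_left->right;
                    current->up->down = current;
                    if (current->up->right) {
                        current->up_right = current->up->right;
                        current->up_right->down_left = current;
                    }
                }
            }
        }

        /* Find the first letter of the next line, if there is one. */
        while (c != EOF && !isalpha(c))
            c = fgetc(in);
        if (c == EOF)
            break;

        current = new_node(c);
        current->up = leftmost;
        leftmost->down = current;
        if (current->up->right) {
            current->up_right = current->up->right;
            current->up_right->down_left = current;
        }
        leftmost = current;
    }

    return topleft;
}
```

Calling read_graph(stdin) on the 3×3 example input ABC / DEF / GHI should return the node for A, with A->down_right being the E node that has all eight links set.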

Note how the first character in the input stream is handled differently (at the very beginning), and how the first character on each subsequent line is handled differently (near the end of the function). This may feel logically or structurally odd, but doing it this way keeps the code simple.

Also note how the bidirectional links are constructed. Because we read from top to bottom, left to right, we establish the left link first, then up-left, then up, then up-right, each forward link followed immediately by its backward link.

It takes a bit of thought to understand why this works. Consider:

  up_left │  up  │   up_right
──────────┼──────┼───────────
     left │ curr │      right
──────────┼──────┼───────────
down_left │ down │ down_right

When we are constructing curr, we know whether left exists, because we handle the first letter on each line separately.

If curr->left is non-NULL, and curr->left->up is non-NULL, we know there was a previous line, and we can set curr->up_left to point to that letter. Its ->down_right should point back to curr, of course, for the links to be consistent.

If curr->up_left is non-NULL, and curr->up_left->right is non-NULL, we know the previous line had a letter in the same column. We can set curr->up to point to it, and its ->down to point back to curr.

If curr->up is non-NULL, and curr->up->right is non-NULL, we know the previous line had a letter in the next column. We can set curr->up_right to point to it, and its ->down_left to point back to curr.

Now, because we read each line from left to right, all columns on each line are filled up to the rightmost column. If you proceed using the above logic, you'll find that the second line fills in the rest of the links from the first line's letters to the second line's letters, and so on.

It also means that if the input file contains a special character, say '*', for a non-letter node, you should create nodes for those while constructing the graph, just as if they were ordinary letters, to ensure the linking logic above works.

After the entire graph is read, you can then remove those non-letter nodes from the graph, one by one. To remove a node, you first set the back links to it (from its neighboring letters) to NULL, then free() it.
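
The unlinking step could look something like the sketch below. The name unlink_letter() is made up for this illustration, and the structure is assumed to be the eight-pointer struct letter shown earlier. The function only detaches the node; the caller still has to free() it.

```c
#include <stddef.h>

struct letter {
    struct letter *up_left, *up, *up_right;
    struct letter *left, *right;
    struct letter *down_left, *down, *down_right;
    int            letter;
};

/* Detach 'one' from the graph: clear each neighbour's link back to it,
   then clear its own links. The caller still owns the node. */
void unlink_letter(struct letter *one)
{
    if (!one)
        return;

    if (one->up_left)    one->up_left->down_right = NULL;
    if (one->up)         one->up->down            = NULL;
    if (one->up_right)   one->up_right->down_left = NULL;
    if (one->left)       one->left->right         = NULL;
    if (one->right)      one->right->left         = NULL;
    if (one->down_left)  one->down_left->up_right = NULL;
    if (one->down)       one->down->up            = NULL;
    if (one->down_right) one->down_right->up_left = NULL;

    one->up_left   = one->up   = one->up_right   = NULL;
    one->left      = one->right                  = NULL;
    one->down_left = one->down = one->down_right = NULL;
}
```

Typical use would be unlink_letter(node); followed by free(node);. If the node being removed is the top-left one, the caller has to pick a new entry point into the graph before unlinking it.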

I personally "poison" the structure before free()ing it, setting letter to a known impossible value (WEOF, for wide end-of-input) and all links to NULL, so that if some other code uses the structure after it was freed (which would be a use-after-free bug), for example because it cached the pointer somehow, it is easier to detect.

(When you free() a pointer, the C library usually does not return it immediately to the operating system, or clear it; usually, the dynamically allocated region is just added to the internal free heap, so that a future allocation can just reuse that memory. Unfortunately, it means that if you do not "poison" freed structures, sometimes they can still be accessible afterwards. Such use-after-free bugs are very annoying, and it is definitely worth the "unnecessary work" of poisoning the structures just to help debug them.)

To facilitate the poisoning, and also to make it easy to remove the poisoning if it turns out to be an unnecessary slowdown at some point, it is best to use helper functions for creating and destroying the structures:

static inline struct letter *new_letter(const int letter)
{
    struct letter *one;

    one = malloc(sizeof *one);
    if (!one) {
        fprintf(stderr, "new_letter(): Out of memory.\n");
        exit(EXIT_FAILURE);
    }

    one->up_left    = NULL;
    one->up         = NULL;
    one->up_right   = NULL;
    one->left       = NULL;
    one->right      = NULL;
    one->down_left  = NULL;
    one->down       = NULL;
    one->down_right = NULL;

    one->letter = letter;

    return one;
}

static inline void free_letter(struct letter *one)
{
    if (one) {
        one->up_left    = NULL;
        one->up         = NULL;
        one->up_right   = NULL;
        one->left       = NULL;
        one->right      = NULL;
        one->down_left  = NULL;
        one->down       = NULL;
        one->down_right = NULL;
        one->letter     = EOF;
        free(one);
    }
}

I usually include these functions in the header file that defines struct letter. Because they are then tiny macro-like functions, I mark them static inline, telling the C compiler that they only need to be accessible in the same compilation unit, and that it does not need to generate the functions and calls to them, but can inline the code wherever they are called.


Personally, I wrote and verified the above pseudocode using

#include <stdlib.h>
#include <locale.h>
#include <wchar.h>
#include <stdio.h>

struct letter {
    struct letter  *chain;  /* Internal chain of all known letters */

    struct letter  *up_left;
    struct letter  *up;
    struct letter  *up_right;
    struct letter  *left;
    struct letter  *right;
    struct letter  *down_left;
    struct letter  *down;
    struct letter  *down_right;

    wint_t          letter;
};

static struct letter *all_letters = NULL;

struct letter *new_letter(wint_t letter)
{
    struct letter *one;

    one = malloc(sizeof *one);
    if (!one) {
        fprintf(stderr, "new_letter(): Out of memory.\n");
        exit(EXIT_FAILURE);
    }

    one->letter = letter;

    one->chain = all_letters;
    all_letters = one;

    one->up_left    = NULL;
    one->up         = NULL;
    one->up_right   = NULL;
    one->left       = NULL;
    one->right      = NULL;
    one->down_left  = NULL;
    one->down       = NULL;
    one->down_right = NULL;

    return one;
}

I prefer to use wide input, because on conforming operating systems you can use any glyphs your locale treats as letters, not just ASCII A-Z. All you need to do is have

    if (!setlocale(LC_ALL, ""))
        fprintf(stderr, "Warning: Current locale is not supported by your C library.\n");
    if (fwide(stdin, 1) < 1)
        fprintf(stderr, "Warning: Wide standard input is not supported by your C library for current locale.\n");
    if (fwide(stdout, 1) < 1)
        fprintf(stderr, "Warning: Wide standard output is not supported by your C library for current locale.\n");

at the start of your main(), and use the wide I/O functions (fwprintf(), fgetwc(), and so on), assuming you have a standard C environment. (Apparently, some Windows users have issues with UTF-8 support in Windows. Complain to Microsoft; the above behaviour is per the C standard.)
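
As a small illustration of what reading with the wide functions looks like, here is a sketch that counts the characters the current locale classifies as letters. The helper name count_wide_letters() is invented for this example; it assumes the stream is either not yet oriented or already wide-oriented.

```c
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>

/* Count the characters in 'in' that the current locale treats as letters.
   fgetwc() returns WEOF (not EOF) at end of input or on error. */
size_t count_wide_letters(FILE *in)
{
    size_t  count = 0;
    wint_t  wc;

    while ((wc = fgetwc(in)) != WEOF)
        if (iswalpha(wc))
            count++;

    return count;
}
```

The graph-reading loop has the same shape: fgetwc() in place of fgetc(), WEOF in place of EOF, and iswalpha() in place of isalpha().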

The chain member is used to link all created letters into a single linked list, so that we can use a function (below) to draw the entire graph in Graphviz Dot language. (Graphviz is available for all operating systems, and in my opinion, is an excellent tool when developing or debugging code that uses linked lists or graphs.) The circo utility seems to be quite good at drawing such graphs, too.

int letter_graph(FILE *out)
{
    struct letter  *one;

    /* Sanity check. */
    if (!out || ferror(out))
        return -1;

    /* Wide output. */
    if (fwide(out, 1) < 1)
        return -1;

    fwprintf(out, L"digraph {\n");
    for (one = all_letters; one != NULL; one = one->chain) {
        fwprintf(out, L"    \"%p\" [ label=\"%lc\" ];\n",
                      (void *)one, one->letter);
        if (one->up_left)
            fwprintf(out, L"    \"%p\" -> \"%p\" [ label=\"↖\" ];\n",
                          (void *)one, (void *)(one->up_left));
        if (one->up)
            fwprintf(out, L"    \"%p\" -> \"%p\" [ label=\"↑\" ];\n",
                          (void *)one, (void *)(one->up));
        if (one->up_right)
            fwprintf(out, L"    \"%p\" -> \"%p\" [ label=\"↗\" ];\n",
                          (void *)one, (void *)(one->up_right));
        if (one->left)
            fwprintf(out, L"    \"%p\" -> \"%p\" [ label=\"←\" ];\n",
                          (void *)one, (void *)(one->left));
        if (one->right)
            fwprintf(out, L"    \"%p\" -> \"%p\" [ label=\"→\" ];\n",
                          (void *)one, (void *)(one->right));
        if (one->down_left)
            fwprintf(out, L"    \"%p\" -> \"%p\" [ label=\"↙\" ];\n",
                          (void *)one, (void *)(one->down_left));
        if (one->down)
            fwprintf(out, L"    \"%p\" -> \"%p\" [ label=\"↓\" ];\n",
                          (void *)one, (void *)(one->down));
        if (one->down_right)
            fwprintf(out, L"    \"%p\" -> \"%p\" [ label=\"↘\" ];\n",
                         (void *)one, (void *)(one->down_right));
    }
    fwprintf(out, L"}\n");

    return 0;
}

If the input file is

ABC
DEF
GHI

the Dot description of the graph is

digraph {
    "0x1c542f0" [ label="I" ];
    "0x1c542f0" -> "0x1c54170" [ label="↖" ];
    "0x1c542f0" -> "0x1c541d0" [ label="↑" ];
    "0x1c542f0" -> "0x1c54290" [ label="←" ];
    "0x1c54290" [ label="H" ];
    "0x1c54290" -> "0x1c54110" [ label="↖" ];
    "0x1c54290" -> "0x1c54170" [ label="↑" ];
    "0x1c54290" -> "0x1c541d0" [ label="↗" ];
    "0x1c54290" -> "0x1c54230" [ label="←" ];
    "0x1c54290" -> "0x1c542f0" [ label="→" ];
    "0x1c54230" [ label="G" ];
    "0x1c54230" -> "0x1c54110" [ label="↑" ];
    "0x1c54230" -> "0x1c54170" [ label="↗" ];
    "0x1c54230" -> "0x1c54290" [ label="→" ];
    "0x1c541d0" [ label="F" ];
    "0x1c541d0" -> "0x1c54050" [ label="↖" ];
    "0x1c541d0" -> "0x1c540b0" [ label="↑" ];
    "0x1c541d0" -> "0x1c54170" [ label="←" ];
    "0x1c541d0" -> "0x1c54290" [ label="↙" ];
    "0x1c541d0" -> "0x1c542f0" [ label="↓" ];
    "0x1c54170" [ label="E" ];
    "0x1c54170" -> "0x1c53ff0" [ label="↖" ];
    "0x1c54170" -> "0x1c54050" [ label="↑" ];
    "0x1c54170" -> "0x1c540b0" [ label="↗" ];
    "0x1c54170" -> "0x1c54110" [ label="←" ];
    "0x1c54170" -> "0x1c541d0" [ label="→" ];
    "0x1c54170" -> "0x1c54230" [ label="↙" ];
    "0x1c54170" -> "0x1c54290" [ label="↓" ];
    "0x1c54170" -> "0x1c542f0" [ label="↘" ];
    "0x1c54110" [ label="D" ];
    "0x1c54110" -> "0x1c53ff0" [ label="↑" ];
    "0x1c54110" -> "0x1c54050" [ label="↗" ];
    "0x1c54110" -> "0x1c54170" [ label="→" ];
    "0x1c54110" -> "0x1c54230" [ label="↓" ];
    "0x1c54110" -> "0x1c54290" [ label="↘" ];
    "0x1c540b0" [ label="C" ];
    "0x1c540b0" -> "0x1c54050" [ label="←" ];
    "0x1c540b0" -> "0x1c54170" [ label="↙" ];
    "0x1c540b0" -> "0x1c541d0" [ label="↓" ];
    "0x1c54050" [ label="B" ];
    "0x1c54050" -> "0x1c53ff0" [ label="←" ];
    "0x1c54050" -> "0x1c540b0" [ label="→" ];
    "0x1c54050" -> "0x1c54110" [ label="↙" ];
    "0x1c54050" -> "0x1c54170" [ label="↓" ];
    "0x1c54050" -> "0x1c541d0" [ label="↘" ];
    "0x1c53ff0" [ label="A" ];
    "0x1c53ff0" -> "0x1c54050" [ label="→" ];
    "0x1c53ff0" -> "0x1c54110" [ label="↓" ];
    "0x1c53ff0" -> "0x1c54170" [ label="↘" ];
}

(It is in reverse order because I insert each new letter at the beginning of the linked list.) circo draws the following graph from that:

[image: 3×3 grid of letters with 8-way links]

During development, I also check if the linkage is consistent:

    for (one = all_letters; one != NULL; one = one->chain) {

        /* letter is a wint_t here, so it is printed with %lc */
        if (one->up_left && one->up_left->down_right != one)
            fprintf(stderr, "'%lc'->up_left is broken!\n", one->letter);
        if (one->up && one->up->down != one)
            fprintf(stderr, "'%lc'->up is broken!\n", one->letter);
        if (one->up_right && one->up_right->down_left != one)
            fprintf(stderr, "'%lc'->up_right is broken!\n", one->letter);
        if (one->left && one->left->right != one)
            fprintf(stderr, "'%lc'->left is broken!\n", one->letter);
        if (one->right && one->right->left != one)
            fprintf(stderr, "'%lc'->right is broken!\n", one->letter);
        if (one->down_left && one->down_left->up_right != one)
            fprintf(stderr, "'%lc'->down_left is broken!\n", one->letter);
        if (one->down && one->down->up != one)
            fprintf(stderr, "'%lc'->down is broken!\n", one->letter);
        if (one->down_right && one->down_right->up_left != one)
            fprintf(stderr, "'%lc'->down_right is broken!\n", one->letter);
    }

By consistent linkage, I mean that if a->left == b, then b->right == a. Of course, the check cannot tell if a->left or b->right is wrong; it can only detect if they are consistent or not.
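
If the structure has no chain member, a similar check can be run by walking the grid from the top-left letter. The sketch below assumes the plain eight-pointer struct letter, and the function names broken_links() and grid_broken_links() are made up for this illustration. Note that it only visits nodes reachable through ->down and ->right, so a broken link of that kind can hide part of the grid from the check.

```c
#include <stddef.h>

struct letter {
    struct letter *up_left, *up, *up_right;
    struct letter *left, *right;
    struct letter *down_left, *down, *down_right;
    int            letter;
};

/* Number of links leaving 'one' whose reverse link does not point back. */
int broken_links(const struct letter *one)
{
    int n = 0;

    if (one->up_left    && one->up_left->down_right != one) n++;
    if (one->up         && one->up->down            != one) n++;
    if (one->up_right   && one->up_right->down_left != one) n++;
    if (one->left       && one->left->right         != one) n++;
    if (one->right      && one->right->left         != one) n++;
    if (one->down_left  && one->down_left->up_right != one) n++;
    if (one->down       && one->down->up            != one) n++;
    if (one->down_right && one->down_right->up_left != one) n++;

    return n;
}

/* Total over every node reachable from 'topleft' by going down the
   first column and right along each row. */
int grid_broken_links(const struct letter *topleft)
{
    int n = 0;

    for (const struct letter *row = topleft; row; row = row->down)
        for (const struct letter *one = row; one; one = one->right)
            n += broken_links(one);

    return n;
}
```

A result of zero means every link found during the walk has a matching reverse link; it does not prove the links point at the right letters.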

