I have to read a text file of letters using matrix linked lists, where each letter must have 8 pointers around it.
Here is what I have to do: (image from the original post, not reproduced here)
The text file is this:
JDCPCPXOAA
ZXVOVXFRVV
NDLEIRBIEA
YTRQOMOIIO
FZZAPXERTQ
XAUEOEOOTO
PORTUOAZLZ
CZNOQUPUOP
With my code, I can only read the letters in the first line.
Can someone help me?
typedef struct letter /// structure for each letter in the soup
{
    char *lname;
    struct letter *pnext;
}LETTER;
typedef struct soup /// structure for the letter soup
{
    int lin;
    int col;
    LETTER *pfirst;
}SOUP;
void read_soup_txt(SOUP *pcs,char *fn,int lin,int col)
{
    FILE *fp;
    fp=fopen(fn,"r");
    char c;
    if(fp!=NULL)
    {
        pcs->lin=lin;
        pcs->col=col;
        LETTER *current=malloc(sizeof(LETTER)),*previous;
        pcs->pfirst=current;
        for(int i=0;i<pcs->lin;i++) /// rows
        {
            for(int j=0;j<pcs->col;j++) /// columns
            {
                fscanf(fp,"%c",&c); /// reads the char
                current->lname=malloc(sizeof(char)); /// allocates space for the char
                strcpy(current->lname,&c); /// copies the char into the structure
                previous=current;
                if(i==pcs->lin-1)
                {
                    current=NULL;
                }
                else
                    current=malloc(sizeof(LETTER));
                previous->pnext=current;
            }
        }
    }
    else
        printf("Error opening file!");
    fclose(fp);
}
"each letter must have 8 pointers around it."
That means your letter structure should be something like
struct letter {
    struct letter *up_left;
    struct letter *up;
    struct letter *up_right;
    struct letter *left;
    struct letter *right;
    struct letter *down_left;
    struct letter *down;
    struct letter *down_right;
    int letter;
};
You don't need the letter soup either. Because you read the characters in order, you can read them directly into the graph. The trick is that you'll want to keep one struct letter pointer to the top left letter in the graph; one struct letter pointer to the first letter on each row; and one struct letter pointer for each new letter you add.
Here is the logic in pseudocode :
Function ReadGraph(input):
    Let topleft  = NULL   # Top left letter in the graph
    Let leftmost = NULL   # Leftmost letter in current line
    Let previous = NULL   # Previous letter in current line
    Let current  = NULL   # Current letter
    Let letter   = ''

    # Skip everything before the first letter.
    Do:
        Read next letter from input
    While (letter is not a letter) and (letter is not EOF)

    If letter is EOF:
        # No letters at all in the input, so no graph either.
        Return NULL
    End If

    topleft  = new struct letter (letter)
    leftmost = topleft
    current  = topleft

    # Row loop. First letter is already in current.
    Loop:
        # Loop over letters in the current line
        Loop:
            Read new letter from input
            If letter is EOF, or newline:
                Break
            End If

            previous = current
            current = new struct letter (letter)
            current->left = previous
            previous->right = current

            If current->left->up is not NULL:
                current->up_left = current->left->up
                current->up_left->down_right = current
                If current->up_left->right is not NULL:
                    current->up = current->up_left->right
                    current->up->down = current
                    If current->up->right is not NULL:
                        current->up_right = current->up->right
                        current->up_right->down_left = current
                    End If
                End If
            End If
        End Loop

        # Skip to the first letter on the next line, if there is one.
        If letter is not EOF:
            While (letter is not EOF) and (letter is not a letter):
                Read new letter from input
            End While
        End If
        If letter is EOF:
            Break
        End If

        # We have a first letter on a new line.
        current = new struct letter (letter)
        current->up = leftmost
        leftmost->down = current
        If current->up->right is not NULL:
            current->up_right = current->up->right
            current->up_right->down_left = current
        End If
        leftmost = current
    End Loop

    Return topleft
End Function
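The pseudocode above could be written in C roughly as follows. This is a minimal sketch, not the verified program mentioned later: it assumes the int-based struct letter shown earlier, byte-oriented input read with getc(), and isalpha() as the test for "is a letter". The names read_graph() and next_letter() are chosen here for illustration only.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

struct letter {
    struct letter *up_left, *up, *up_right;
    struct letter *left, *right;
    struct letter *down_left, *down, *down_right;
    int letter;
};

/* Allocate a node with all eight links set to NULL. */
static struct letter *new_letter(int c)
{
    struct letter *one = malloc(sizeof *one);
    if (!one) {
        fprintf(stderr, "new_letter(): Out of memory.\n");
        exit(EXIT_FAILURE);
    }
    *one = (struct letter){ .letter = c };
    return one;
}

/* Skip everything that is not a letter; return the letter, or EOF. */
static int next_letter(FILE *in)
{
    int c;
    do {
        c = getc(in);
    } while (c != EOF && !isalpha(c));
    return c;
}

/* Hypothetical name: returns the top left letter, or NULL if no letters. */
struct letter *read_graph(FILE *in)
{
    struct letter *topleft, *leftmost, *previous, *current;
    int c;

    assert(in != NULL);
    c = next_letter(in);
    if (c == EOF)
        return NULL;
    topleft = leftmost = current = new_letter(c);

    for (;;) {
        /* Rest of the current row; every character up to the newline
           becomes a node, exactly as in the pseudocode. */
        while ((c = getc(in)) != EOF && c != '\n') {
            previous = current;
            current = new_letter(c);
            current->left = previous;
            previous->right = current;
            if (previous->up) {
                current->up_left = previous->up;
                current->up_left->down_right = current;
                if (current->up_left->right) {
                    current->up = current->up_left->right;
                    current->up->down = current;
                    if (current->up->right) {
                        current->up_right = current->up->right;
                        current->up_right->down_left = current;
                    }
                }
            }
        }

        /* Find the first letter of the next row, if there is one. */
        if (c != EOF)
            c = next_letter(in);
        if (c == EOF)
            break;

        current = new_letter(c);
        current->up = leftmost;
        leftmost->down = current;
        if (leftmost->right) {
            current->up_right = leftmost->right;
            current->up_right->down_left = current;
        }
        leftmost = current;
    }
    return topleft;
}
```

Calling read_graph(stdin), or passing a FILE pointer obtained from fopen(), should return the top left node; every other letter is then reachable by following the eight links.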
Note how the first character in the input stream is handled differently (at the very beginning), and how the first character on each subsequent line is handled differently (near the end of the function). This may feel logically or structurally odd, but doing it this way keeps the code simple.
Also note how the bidirectional links are constructed. Because we read from top to bottom, left to right, we establish the link left first, then up-left, then up, then up-right; with the backwards link immediately after the forward link.
This requires a bit of thought, to understand why it works. Consider:
up_left │ up │ up_right
──────────┼──────┼───────────
left │ curr │ right
──────────┼──────┼───────────
down_left │ down │ down_right
When we are constructing curr, we know if left exists or not, because we handle the first letter on each line separately.
If curr->left is non-NULL, and curr->left->up is non-NULL, we know there was a previous line, and we can set curr->up_left to point to that letter. Its ->down_right should point back to curr, of course, for the links to be consistent.
If curr->up_left is non-NULL, and curr->up_left->right is non-NULL, we know the previous line had a letter in the same column. We can set curr->up to point to it, and its ->down to point back to curr.
If curr->up is non-NULL, and curr->up->right is non-NULL, we know the previous line had a letter in the next column. We can set curr->up_right to point to it, and its ->down_left to point back to curr.
Now, because we read each line from left to right, all columns on each line are filled up to the rightmost column. If you proceed using the above logic, you'll find that the second line fills in the rest of the links from the first line's letters to the second line's letters, and so on.
It also means that if the input file contained a special character, say '*' for a non-letter node, you should create those while constructing the graph, just like they were ordinary letters, to ensure the above logic of linking works.
After the entire graph is read, you can then remove those non-letter nodes from the graph, one by one. To remove a node, you first set the back links to it (from its neighboring letters) to NULL, then free() it.
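As a rough illustration of that removal step, the sketch below clears each neighbor's back link and then frees the node. The names remove_letter() and clear_back_link() are hypothetical, and the assert() assumes the graph is consistently linked before the call.

```c
#include <assert.h>
#include <stdlib.h>

struct letter {
    struct letter *up_left, *up, *up_right;
    struct letter *left, *right;
    struct letter *down_left, *down, *down_right;
    int letter;
};

/* Clear one neighbor's link that points back to 'one'. */
static void clear_back_link(struct letter **back, const struct letter *one)
{
    assert(*back == one);   /* The graph must be consistently linked. */
    *back = NULL;
}

/* Detach 'one' from all of its neighbors, then free it. */
void remove_letter(struct letter *one)
{
    if (!one)
        return;
    if (one->up_left)    clear_back_link(&one->up_left->down_right, one);
    if (one->up)         clear_back_link(&one->up->down, one);
    if (one->up_right)   clear_back_link(&one->up_right->down_left, one);
    if (one->left)       clear_back_link(&one->left->right, one);
    if (one->right)      clear_back_link(&one->right->left, one);
    if (one->down_left)  clear_back_link(&one->down_left->up_right, one);
    if (one->down)       clear_back_link(&one->down->up, one);
    if (one->down_right) clear_back_link(&one->down_right->up_left, one);
    free(one);
}
```

If the removed node happens to be the top left letter, the caller still has to pick a new entry point into the graph; this sketch does not handle that.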
I personally "poison" the structure before free()ing it, setting letter to a known impossible value (WEOF, for wide end-of-input), and all links to NULL, so that if some other code uses the structure after it was freed (which would be a use-after-free bug), for example because it cached the pointer somehow, it is easier to detect.
(When you free() a pointer, the C library usually does not return the memory immediately to the operating system, or clear it; usually, the dynamically allocated region is just added to the internal free heap, so that a future allocation can just reuse that memory. Unfortunately, it means that if you do not "poison" freed structures, sometimes they can still be accessible afterwards. Such use-after-free bugs are very annoying, and it is definitely worth the "unnecessary work" of poisoning the structures just to help debugging those.)
To facilitate the poisoning, and also to make it easy to remove the poisoning if it turns out to be an unnecessary slowdown at some point, it is best to use helper functions for creating and destroying the structures:
#include <stdio.h>
#include <stdlib.h>

/* Allocate a new letter with all links cleared. */
static inline struct letter *new_letter(const int letter)
{
    struct letter *one;
    one = malloc(sizeof *one);
    if (!one) {
        fprintf(stderr, "new_letter(): Out of memory.\n");
        exit(EXIT_FAILURE);
    }
    one->up_left = NULL;
    one->up = NULL;
    one->up_right = NULL;
    one->left = NULL;
    one->right = NULL;
    one->down_left = NULL;
    one->down = NULL;
    one->down_right = NULL;
    one->letter = letter;
    return one;
}

/* Poison the structure, then free it. */
static inline void free_letter(struct letter *one)
{
    if (one) {
        one->up_left = NULL;
        one->up = NULL;
        one->up_right = NULL;
        one->left = NULL;
        one->right = NULL;
        one->down_left = NULL;
        one->down = NULL;
        one->down_right = NULL;
        one->letter = EOF;
        free(one);
    }
}
I usually include these functions in the header file that defines struct letter; because they are then tiny macro-like functions, I mark them static inline, telling the C compiler that they only need to be accessible in the same compilation unit, and that it does not need to generate the functions and calls to those functions, but can inline the code wherever they are called.
Personally, I wrote and verified the above pseudocode using
#include <stdlib.h>
#include <locale.h>
#include <wchar.h>
#include <stdio.h>

struct letter {
    struct letter *chain; /* Internal chain of all known letters */
    struct letter *up_left;
    struct letter *up;
    struct letter *up_right;
    struct letter *left;
    struct letter *right;
    struct letter *down_left;
    struct letter *down;
    struct letter *down_right;
    wint_t letter;
};

static struct letter *all_letters = NULL;

struct letter *new_letter(wint_t letter)
{
    struct letter *one;
    one = malloc(sizeof *one);
    if (!one) {
        fprintf(stderr, "new_letter(): Out of memory.\n");
        exit(EXIT_FAILURE);
    }
    one->letter = letter;
    one->chain = all_letters;
    all_letters = one;
    one->up_left = NULL;
    one->up = NULL;
    one->up_right = NULL;
    one->left = NULL;
    one->right = NULL;
    one->down_left = NULL;
    one->down = NULL;
    one->down_right = NULL;
    return one;
}
I prefer to use wide input, because in conforming operating systems you can use any glyphs your locale treats as letters, not just ASCII A-Z. All you need to do is have
if (!setlocale(LC_ALL, ""))
    fprintf(stderr, "Warning: Current locale is not supported by your C library.\n");
if (fwide(stdin, 1) < 1)
    fprintf(stderr, "Warning: Wide standard input is not supported by your C library for current locale.\n");
if (fwide(stdout, 1) < 1)
    fprintf(stderr, "Warning: Wide standard output is not supported by your C library for current locale.\n");
at the start of your main(), and use the wide I/O functions (fwprintf(), fgetwc(), and so on), assuming you have a standard C environment. (Apparently, some Windows users have issues with UTF-8 support in Windows. Complain to Microsoft; the above behaviour is per the C standard.)
The chain member is used to link all created letters into a single linked list, so that we can use a function (below) to draw the entire graph in Graphviz Dot language. (Graphviz is available for all operating systems, and in my opinion, is an excellent tool when developing or debugging code that uses linked lists or graphs.) The circo utility seems to be quite good at drawing such graphs, too.
int letter_graph(FILE *out)
{
    struct letter *one;

    /* Sanity check. */
    if (!out || ferror(out))
        return -1;

    /* Wide output. fwide() takes the stream and the desired orientation. */
    if (fwide(out, 1) < 1)
        return -1;

    fwprintf(out, L"digraph {\n");
    for (one = all_letters; one != NULL; one = one->chain) {
        fwprintf(out, L" \"%p\" [ label=\"%lc\" ];\n",
                 (void *)one, one->letter);
        if (one->up_left)
            fwprintf(out, L" \"%p\" -> \"%p\" [ label=\"↖\" ];\n",
                     (void *)one, (void *)(one->up_left));
        if (one->up)
            fwprintf(out, L" \"%p\" -> \"%p\" [ label=\"↑\" ];\n",
                     (void *)one, (void *)(one->up));
        if (one->up_right)
            fwprintf(out, L" \"%p\" -> \"%p\" [ label=\"↗\" ];\n",
                     (void *)one, (void *)(one->up_right));
        if (one->left)
            fwprintf(out, L" \"%p\" -> \"%p\" [ label=\"←\" ];\n",
                     (void *)one, (void *)(one->left));
        if (one->right)
            fwprintf(out, L" \"%p\" -> \"%p\" [ label=\"→\" ];\n",
                     (void *)one, (void *)(one->right));
        if (one->down_left)
            fwprintf(out, L" \"%p\" -> \"%p\" [ label=\"↙\" ];\n",
                     (void *)one, (void *)(one->down_left));
        if (one->down)
            fwprintf(out, L" \"%p\" -> \"%p\" [ label=\"↓\" ];\n",
                     (void *)one, (void *)(one->down));
        if (one->down_right)
            fwprintf(out, L" \"%p\" -> \"%p\" [ label=\"↘\" ];\n",
                     (void *)one, (void *)(one->down_right));
    }
    fwprintf(out, L"}\n");
    return 0;
}
If the input file is
ABC
DEF
GHI
the Dot description of the graph is
digraph {
"0x1c542f0" [ label="I" ];
"0x1c542f0" -> "0x1c54170" [ label="↖" ];
"0x1c542f0" -> "0x1c541d0" [ label="↑" ];
"0x1c542f0" -> "0x1c54290" [ label="←" ];
"0x1c54290" [ label="H" ];
"0x1c54290" -> "0x1c54110" [ label="↖" ];
"0x1c54290" -> "0x1c54170" [ label="↑" ];
"0x1c54290" -> "0x1c541d0" [ label="↗" ];
"0x1c54290" -> "0x1c54230" [ label="←" ];
"0x1c54290" -> "0x1c542f0" [ label="→" ];
"0x1c54230" [ label="G" ];
"0x1c54230" -> "0x1c54110" [ label="↑" ];
"0x1c54230" -> "0x1c54170" [ label="↗" ];
"0x1c54230" -> "0x1c54290" [ label="→" ];
"0x1c541d0" [ label="F" ];
"0x1c541d0" -> "0x1c54050" [ label="↖" ];
"0x1c541d0" -> "0x1c540b0" [ label="↑" ];
"0x1c541d0" -> "0x1c54170" [ label="←" ];
"0x1c541d0" -> "0x1c54290" [ label="↙" ];
"0x1c541d0" -> "0x1c542f0" [ label="↓" ];
"0x1c54170" [ label="E" ];
"0x1c54170" -> "0x1c53ff0" [ label="↖" ];
"0x1c54170" -> "0x1c54050" [ label="↑" ];
"0x1c54170" -> "0x1c540b0" [ label="↗" ];
"0x1c54170" -> "0x1c54110" [ label="←" ];
"0x1c54170" -> "0x1c541d0" [ label="→" ];
"0x1c54170" -> "0x1c54230" [ label="↙" ];
"0x1c54170" -> "0x1c54290" [ label="↓" ];
"0x1c54170" -> "0x1c542f0" [ label="↘" ];
"0x1c54110" [ label="D" ];
"0x1c54110" -> "0x1c53ff0" [ label="↑" ];
"0x1c54110" -> "0x1c54050" [ label="↗" ];
"0x1c54110" -> "0x1c54170" [ label="→" ];
"0x1c54110" -> "0x1c54230" [ label="↓" ];
"0x1c54110" -> "0x1c54290" [ label="↘" ];
"0x1c540b0" [ label="C" ];
"0x1c540b0" -> "0x1c54050" [ label="←" ];
"0x1c540b0" -> "0x1c54170" [ label="↙" ];
"0x1c540b0" -> "0x1c541d0" [ label="↓" ];
"0x1c54050" [ label="B" ];
"0x1c54050" -> "0x1c53ff0" [ label="←" ];
"0x1c54050" -> "0x1c540b0" [ label="→" ];
"0x1c54050" -> "0x1c54110" [ label="↙" ];
"0x1c54050" -> "0x1c54170" [ label="↓" ];
"0x1c54050" -> "0x1c541d0" [ label="↘" ];
"0x1c53ff0" [ label="A" ];
"0x1c53ff0" -> "0x1c54050" [ label="→" ];
"0x1c53ff0" -> "0x1c54110" [ label="↓" ];
"0x1c53ff0" -> "0x1c54170" [ label="↘" ];
}
(It is in reverse order because I insert each new letter at the beginning of the linked list.) circo draws a graph from that; the image is not reproduced here.
During development, I also check if the linkage is consistent:
for (one = all_letters; one != NULL; one = one->chain) {
    if (one->up_left && one->up_left->down_right != one)
        fprintf(stderr, "'%c'->up_left is broken!\n", one->letter);
    if (one->up && one->up->down != one)
        fprintf(stderr, "'%c'->up is broken!\n", one->letter);
    if (one->up_right && one->up_right->down_left != one)
        fprintf(stderr, "'%c'->up_right is broken!\n", one->letter);
    if (one->left && one->left->right != one)
        fprintf(stderr, "'%c'->left is broken!\n", one->letter);
    if (one->right && one->right->left != one)
        fprintf(stderr, "'%c'->right is broken!\n", one->letter);
    if (one->down_left && one->down_left->up_right != one)
        fprintf(stderr, "'%c'->down_left is broken!\n", one->letter);
    if (one->down && one->down->up != one)
        fprintf(stderr, "'%c'->down is broken!\n", one->letter);
    if (one->down_right && one->down_right->up_left != one)
        fprintf(stderr, "'%c'->down_right is broken!\n", one->letter);
}
By consistent linkage, I mean that if a->left == b, then b->right == a. Of course, the check cannot tell which of a->left or b->right is wrong; it can only detect whether they are consistent or not.