
Using strstr() to get the position of a word

There are two text files: text A holds the content, and text B lists words, one per line. The program should report the position in the content of each word from text B.

This is my program:

#include<stdio.h>
#include<string.h>

#define WORDMAXLENGTH 30
#define MAXLENGTH 200

int main(){
    typedef struct{
        char stack[MAXLENGTH][WORDMAXLENGTH];
        int top;
    }stack;

    stack query;
    query.top = 0;
    int i = 0, j = 0,q = 0;
    char myArr[MAXLENGTH];
    char *PosStr = NULL; 
    FILE *inFile = fopen("query.txt","r");
    FILE *inFile2 = fopen("hello.txt","r");

    while(fgets(query.stack[query.top],WORDMAXLENGTH,inFile) != NULL){
        query.top++;
    }

    fgets(myArr,MAXLENGTH,inFile2);

    for(i = 0; i < query.top; i++){
        PosStr = strstr(myArr,query.stack[i]);//get the position of s2 (Q1)
        printf("%d\n", PosStr -  myArr + 1);
    }

    fclose(inFile);
    fclose(inFile2);
    return 0;
}

Q1. Is this expression right? If it is wrong, how can I get the position? If it is right, why can't I get the position correctly? In addition, some of the results for PosStr are 0.

I presumed that the program is intended to check each word in the list from the first file for an occurrence in the single line of text from the second file. With a few tweaks, it works.

I added some error checking and removed the trailing newline from the file inputs. I checked the result of strstr() against NULL before printing values based on it. I also added another #define to distinguish the size of the stack from the length of the test string, and I check that the stack does not overflow.
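The NULL check described above can be sketched as a small helper. This is only an illustration: the name match_position is hypothetical and does not appear in the original program, and returning 0 for "not found" is an assumption made here so that 1-based positions stay unambiguous.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the original program): returns the
 * 1-based position of needle inside haystack, or 0 when there is no match. */
size_t match_position(const char *haystack, const char *needle)
{
    const char *hit = strstr(haystack, needle);

    if (hit == NULL)
        return 0;                        /* no match: never subtract from NULL */

    return (size_t)(hit - haystack) + 1; /* pointer difference is the 0-based offset */
}
```

For the hello.txt line shown further down, match_position(line, "fox") gives 15, while match_position(line, "fox\n") gives 0. The second case is what happens when the newline left by fgets() is still attached to the search word.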

UPDATE: the code is revised to match whole words, case-insensitively.

#include<stdio.h>
#include<string.h>

#define WORDMAXLENGTH 30
#define MAXENTRY 200
#define MAXLENGTH 200

typedef struct{
    char stack[MAXENTRY][WORDMAXLENGTH];
    int top;
} stack;

int main(){
    FILE *inFile;
    FILE *inFile2;
    int i, w;
    char myArr[MAXLENGTH];
    char *sptr; 
    stack query;
    query.top = 0;
    inFile = fopen("query.txt","r");
    inFile2 = fopen("hello.txt","r");
    if (inFile == NULL || inFile2 == NULL) {
        printf("Cannot open both files\n");
        return 1;
    }
    while(fgets(query.stack[query.top], WORDMAXLENGTH, inFile) != NULL){
        i = strcspn(query.stack[query.top], "\r\n");
        query.stack[query.top][i] = 0;      // remove trailing newline etc
        if (++query.top >= MAXENTRY)        // check stack full
            break;
    }

    fgets(myArr,MAXLENGTH,inFile2);
    //myArr [ strcspn(myArr, "\r\n") ] = 0; // remove trailing newline etc
    w = 1;                                  // word count
    sptr = strtok(myArr, " \t\r\n");        // removes trailing stuff anyway
    while (sptr) {                          // each word in test string
        for(i=0; i<query.top; i++) {        // each word in library list
            // compare without case; stricmp() is non-standard (POSIX has strcasecmp())
            if (stricmp(sptr, query.stack[i]) == 0)
                printf("%-4d %s\n", w, query.stack[i]);
        }
        w++;
        sptr = strtok(NULL, " \t\r\n");
    }

    fclose(inFile);
    fclose(inFile2);
    return 0;
}

File query.txt :

cat
dog
fox
rabbit

File hello.txt :

A quick brown fox jumps over the lazy dog

Program output:

4    fox
9    dog
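The listing above relies on stricmp(), which is not part of standard C; it is a compiler extension, and POSIX systems offer strcasecmp() in <strings.h> instead. Where neither is available, a comparison along these lines might do. The function name equal_ignore_case is hypothetical, and the sketch only handles case as tolower() sees it in the current locale.

```c
#include <ctype.h>

/* Hypothetical stand-in for stricmp() == 0: returns 1 when a and b are
 * equal ignoring letter case, 0 otherwise. */
int equal_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        /* cast to unsigned char: tolower() is undefined for negative values */
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;    /* equal only if both strings end at the same time */
}
```

In the word loop, the test stricmp(sptr, query.stack[i]) == 0 would then become equal_ignore_case(sptr, query.stack[i]).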

The only problem is that fgets() puts the '\n' in the buffer, so strstr() tries to match that character too. There are multiple techniques to remove that character; a simple one is

strtok(myArr, "\n");

right after the fgets(). It works because strtok() replaces the '\n' with a '\0'. Alternatively:

size_t length = strlen(myArr);
if (length > 0 && myArr[length - 1] == '\n')   /* only strip if a newline is really there */
    myArr[length - 1] = '\0';
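Both techniques above have edge cases: strtok() leaves a line that consists only of "\n" unchanged, because it skips leading delimiters, and indexing with length - 1 needs care when the buffer is empty or the line did not end in a newline. A third option, sketched here with a hypothetical helper name, uses strcspn(), which also copes with "\r\n" line endings.

```c
#include <string.h>

/* Hypothetical helper: cuts the string at the first '\r' or '\n'.
 * If neither is present, strcspn() returns the string length, so the
 * assignment just rewrites the existing terminator. */
void strip_newline(char *line)
{
    line[strcspn(line, "\r\n")] = '\0';
}
```

It would be called right after each successful fgets(), for the search words as well as for the line being searched.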

I'm not entirely sure I understand the question, but I'm guessing that "query.txt" (which gets read into the stack object) consists of lines of words (no longer than 30 chars per line), something like: some words some more words words on a line. Meanwhile "hello.txt" contains a single line, the word you are searching for: word. And you would like the program to produce the output 6 11 1 1 for the above input.

As mentioned in the comments, fgets() includes the terminating '\n' in the buffer it reads. Also, strstr() is declared as char *strstr(const char *haystack, const char *needle); that is, the first argument is the big string (the haystack) in which you search for a little string (the needle). It returns a pointer into the haystack at the place where the needle was found, or NULL if the needle is not there. Therefore, if I understood your question, the program should become:

#include<stdio.h>
#include<string.h>

#define WORDMAXLENGTH 30
#define MAXLENGTH 200
int
main()
{
    typedef struct {
        char stack[MAXLENGTH][WORDMAXLENGTH];
        int top;
    } stack;
    stack query;

    query.top = 0;
    int i = 0, j = 0, q = 0;

    char myArr[MAXLENGTH];
    char *PosStr = NULL;
    FILE *inFile = fopen("query.txt", "r");
    FILE *inFile2 = fopen("hello.txt", "r");

    while (fgets(query.stack[query.top], WORDMAXLENGTH, inFile) != NULL) {
        query.top++;
    }

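    fgets(myArr, MAXLENGTH, inFile2);
    myArr[strcspn(myArr, "\n")] = 0;    // strip the trailing newline, if any

    for (i = 0; i < query.top; i++) {
        PosStr = strstr(query.stack[i], myArr); // haystack first, then needle

        if (PosStr != NULL)             // strstr() returns NULL when there is no match
            printf("%d\n", (int)(PosStr - query.stack[i]) + 1);
    }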

    fclose(inFile);
    fclose(inFile2);
    return 0;
}

In particular, you were searching for a haystack in a needle, and the needle wasn't actually what you were looking for!

Note: comments beginning with '// --' are reasons for following code changes

#include<stdio.h>
#include<stdlib.h> // exit(), EXIT_FAILURE
#include<string.h>

// --wrap #define number in parens
// --vertical alignment make the code easier to read
// --vertical spacing makes the code easier to read
#define WORDMAXLENGTH (30)
#define MAXLENGTH     (200)

// --place data type definitions outside of any function
// --in modern C, for struct definitions just declare the struct
// --and don't clutter the code with typedef's for struct definitions
struct stack
{
    char stack[MAXLENGTH][WORDMAXLENGTH];
    int top;
};

// --place large data struct in file global memory, not on stack
// contains search keys and number of search keys
static struct stack query;

// --using Georgian formatting for braces makes the code harder to read
// --indent code blocks within braces for readability
int main()
{
    query.top = 0;
    // --while legal C, multiple variable declarations on same line
    // --leads to maintenance problems and reduces readability
    int i = 0;
    // -- eliminate unused variables
    //int j = 0;
    //int q = 0;
    char myArr[MAXLENGTH]; // line to search
    char *PosStr = NULL;   // ptr to where search key found

    // --always check the returned value from fopen to assure operation successful
    // --always place the literal on the left in comparisons
    // --    so compiler can catch errors like using '=' rather than '=='
    FILE *inFile = fopen("query.txt","r");
    if( NULL == inFile )
    { // then fopen failed
        perror( "fopen for query.txt for read failed" );
        exit( EXIT_FAILURE );
    }

    // implied else, fopen successful

    FILE *inFile2 = fopen("hello.txt","r");
    if( NULL == inFile2 )
    { // then, fopen failed
        perror( "fopen for hello.txt for read failed" );
        fclose(inFile);  // cleanup
        exit( EXIT_FAILURE );
    }

    // implied else, fopen successful

    // --without a bounds check, the following while loop could
    // --overflow the available space in the struct,
    // --leading to undefined behaviour and can/will lead to a seg fault event
    // --comment the code so reverse engineering is not needed
    // note: each search key in the struct field: stack[]  will be terminated with '\n'
    //       so eliminate them
    // read in complete file. line-by-line to struct
    // while tracking number of lines
    while( query.top < MAXLENGTH
        && fgets(query.stack[query.top],WORDMAXLENGTH,inFile) )
    {
        // replace newline (if any) with NUL char
        query.stack[query.top][strcspn(query.stack[query.top], "\n")] = '\0';
        query.top++;
    } // end while

    // --always check returned value from fgets
    // --to assure the operation was successful
    // read line to search
    if( NULL == fgets(myArr,MAXLENGTH,inFile2) )
    { // then fgets failed
        perror( "fgets for hello.txt file failed" );
        exit( EXIT_FAILURE );
    }

    // implied else, fgets successful

    for(i = 0; i < query.top; i++)
    {
        // --strstr will return NULL if search string not found
        // --always check returned value from strstr (!=NULL) to assure successful operation
        PosStr = strstr(myArr,query.stack[i]);//get the position of s2 (Q1)

        if( PosStr )
        { // then at least one instance of current search key found in line
            // --difference between two pointers is a 'ptrdiff_t', not an 'int'
            // --so print it with the C99 '%td' conversion
            // display offset into line
            printf("%td\n", PosStr -  myArr + 1);
        } // end if
    } // end for

    fclose(inFile);
    fclose(inFile2);
    return 0;
} // end function: main
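The listing above reports only the first place where each search key occurs. If every occurrence is wanted, strstr() can be called again starting just past the previous hit. The following is a minimal sketch under that assumption; the helper name all_positions and its out/max parameters are hypothetical and not taken from any of the answers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: stores up to max 1-based positions of needle in
 * haystack into out[] and returns how many were stored. */
size_t all_positions(const char *haystack, const char *needle,
                     size_t *out, size_t max)
{
    size_t count = 0;
    const char *p = haystack;

    if (*needle == '\0')
        return 0;                   /* an empty key would match at every offset */

    while (count < max && (p = strstr(p, needle)) != NULL) {
        out[count++] = (size_t)(p - haystack) + 1;
        p++;                        /* resume one char later, so overlapping hits count */
    }
    return count;
}
```

The stored positions are of type size_t, so they would be printed with the %zu conversion.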


 