
Passing a string argument read from a file

I am trying to find a regex pattern in a text; let's call it the original text. The following is the code of the patternFinder() function:

vector<pair<long, long>> CaddressParser::patternFinder(string pattern)
{
    string m_text1 = m_text;  // working copy, shortened after every match
    int begin = 0;
    int end = 0;
    smatch m;
    regex e(pattern);

    vector<pair<long, long>> indices;
    if (std::regex_search(m_text1, m, e))
    {
        // First match: positions are relative to the start of m_text.
        begin = m.position();
        end = m.position() + m.length() - 1;
        m_text1 = m.suffix().str();
        indices.push_back(make_pair(begin, end));
        while (end < m_length && std::regex_search(m_text1, m, e))
        {
            // Later matches are found in the remaining suffix, so their
            // offsets are computed from the end of the previous match.
            begin = end + m.prefix().length() + 1;
            end = end + m.prefix().length() + m.length();
            indices.push_back(make_pair(begin, end));
            m_text1 = m.suffix().str();
        }
        return indices;
    }

    else return indices;
}

I have the following regular expression:

"\\b[0-9]{3}\\b.*(Street).*[0-9]{5}"

and the original text mentioned at the beginning is:

  • way 10.01.2013 700 West Market Street OH 35611 asdh

and only the part "700 West Market Street OH 35611" is supposed to match the regex. The problem is that when the regex is passed as a string that has been read from a text file, patternFinder() does not recognize the pattern. When a direct string literal (identical to the one in the text file) is passed as the argument to patternFinder(), it works. Where could this problem be coming from?

The following is the code of my fileReader() function, although I don't think it is very relevant:

string CaddressParser::fileReader(string fileName)
{

    string text;
    FILE *fin;
    fin=fopen(fileName.c_str(),"rb" );
    int length=getLength(fileName);
    char *buffer= new char[length];
    fread(buffer,length,1,fin);
    buffer[length]='\0';
    text =string(buffer);
    fclose(fin);

    return text;

}  

Note that there is an apparent syntactic difference when writing the regex directly into C++ code and when reading it from a file.

In C++, the backslash character has escape semantics, so to put a literal backslash into a string literal, you must escape it with another backslash. So to get the two-character string \b in memory, you have to use the string literal "\\b". The two backslashes are interpreted by the C++ compiler as a single backslash character to be stored in the literal. In other words, strlen("\\b") is 2.

On the other hand, the contents of a text file are read by your program at run time and are never processed by the C++ compiler. So to get the two characters \ and b into a string read from a file, write just the two-character string \b into the file.
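The difference between the two escaping levels can be sketched as follows. This is a minimal illustration, not the asker's code: the helper names patternFromLiteral, patternCopiedWithEscapes and matches are hypothetical, and the raw string literal merely stands in for a file whose contents were copied from the C++ source with the doubled backslashes left in.

```cpp
#include <cstring>
#include <regex>
#include <string>

// The literal "\\b" is compiled into two characters: a backslash and 'b'.
// A file containing the two characters \b yields the same string when read.
inline std::string patternFromLiteral() {
    return "\\b[0-9]{3}\\b.*(Street).*[0-9]{5}";
}

// Hypothetical stand-in for a file whose contents were copied verbatim from
// the C++ source, doubled backslashes included. A raw string literal keeps
// every backslash as written, just as reading from a file would.
inline std::string patternCopiedWithEscapes() {
    return R"(\\b[0-9]{3}\\b.*(Street).*[0-9]{5})";
}

// Returns true if `pattern` matches somewhere in `text`.
inline bool matches(const std::string& text, const std::string& pattern) {
    return std::regex_search(text, std::regex(pattern));
}
```

With the sample text from the question, matches(text, patternFromLiteral()) succeeds, while matches(text, patternCopiedWithEscapes()) does not, because the regex engine then looks for a literal backslash followed by b instead of a word boundary.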

The problem is probably in the function reading the string from the file. Print the string read and make sure the regular expression is being read correctly.
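One way to print the string so that invisible characters become obvious is sketched below. The function name visible and the <XX> hex notation are assumptions made for this illustration; the idea is only to make a stray carriage return, newline or other non-printable byte show up in the output.

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Returns a copy of `s` in which every non-printable byte is replaced by
// its hexadecimal value in angle brackets, e.g. a newline becomes <0A>.
// Printable characters, including backslashes, are copied unchanged.
inline std::string visible(const std::string& s) {
    std::string out;
    for (unsigned char c : s) {
        if (std::isprint(c)) {
            out += static_cast<char>(c);
        } else {
            char buf[5];  // '<' + two hex digits + '>' + terminating NUL
            std::snprintf(buf, sizeof buf, "<%02X>", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}
```

Printing visible(pattern) for both the string read from the file and the literal in the source makes any difference between them easy to spot, whether it is a doubled backslash or a trailing line break.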

The problem is in these two lines:
buffer[length]='\0';
text =string(buffer);

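buffer[length] should have been buffer[length - 1]: the buffer is allocated with new char[length], so index length is one element past its end. Note that writing the terminator at length - 1 overwrites the last byte read from the file; allocating length + 1 bytes instead keeps every byte.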
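A reader that avoids the manual buffer and terminator altogether could look like the sketch below. The names readFile and trimTrailing are hypothetical, and the trimming step rests on the assumption that the pattern file may end with a line break, which the question does not confirm.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Reads the whole file into a std::string. The string manages its own
// storage, so no buffer size or '\0' terminator has to be handled by hand.
inline std::string readFile(const std::string& fileName) {
    std::ifstream in(fileName, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + fileName);
    }
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Removes trailing spaces, tabs, carriage returns and newlines. This is an
// assumption-driven convenience: a line break at the end of the file would
// otherwise become part of the regular expression.
inline std::string trimTrailing(std::string s) {
    const auto last = s.find_last_not_of(" \t\r\n");
    s.erase(last == std::string::npos ? 0 : last + 1);
    return s;
}
```

The pattern could then be obtained with trimTrailing(readFile(fileName)) and passed to patternFinder() as before.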
