The number of characters of comments in a file (C programming)

I can't seem to get this right, no matter what I try:

int commentChars() {
char str[256], fileName[256];
FILE *fp;
int i;


do{
    long commentCount=0;
    fflush(stdin);
    printf("%s\nEnter the name of the file in %s/", p, dir);
    gets(fileName);

    if(!(fp=fopen(fileName, "r"))) {
            printf("Error! File not found, try again");
                return 0;
    }

    while(!feof(fp)) {
            fgets(str,sizeof str,fp);
            for(int i=0;i<=sizeof str;i++) {
                if(str[i] == '/' && str[i+1] == '/') {
                        commentCount += (strlen(str)-2);
                }
            }
    }

    fclose(fp);

        printf("All the chars, contained in a comment: %ld\n", commentCount);
        puts(p);
        printf("Do you want to search for another file?<Y/N>: ");
        i=checker();


}while(i);}

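The result is "All the chars, contained in a comment: 0", even though the file does have comments. My second question: how would I do the same thing for comments delimited by /* */?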

This essentially trivial modification of your code deals with several problems in it.

  1. You shouldn't use feof() like that; while (!feof(file)) is always wrong.
  2. You shouldn't read data that isn't part of the string you just read.

I also refactored your code so that the function takes a file name, opens, counts and closes the file, and reports on what it found.

#include <stdio.h>
#include <string.h>

// Revised interface - process a given file name, reporting
static void commentChars(char const *file)
{
    char str[256];
    FILE *fp;
    long commentCount = 0;

    if (!(fp = fopen(file, "r")))
    {
        fprintf(stderr, "Error! File %s not found\n", file);
        return;
    }

    while (fgets(str, sizeof(str), fp) != 0)
    {
        size_t len = strlen(str);
        for (size_t i = 0; i + 1 < len; i++)
        {
            if (str[i] == '/' && str[i + 1] == '/')
            {
                /* Count only what follows the introducer, not any code before it */
                commentCount += (long)(len - i - 2);
                break;
            }
        }
    }

    fclose(fp);

    printf("%s: Number of characters contained in comments: %ld\n", file, commentCount);
}

int main(int argc, char **argv)
{
    if (argc == 1)
        commentChars("/dev/stdin");
    else
    {
        for (int i = 1; i < argc; i++)
            commentChars(argv[i]);
    }
    return 0;
}

When run on its own source code (ccc.c), it yields:

ccc.c: Number of characters contained in comments: 58

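The comment wasn't really finished (oops), but it serves to show what goes on. The count includes the newline that fgets() keeps as part of the comment, but not the // introducer.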

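Dealing with /* comments is harder. You need to spot a slash followed by a star, and then read up to the next star-slash character pair. That is probably easier with character-by-character input than with line-by-line input. At the least, you will need to be able to interleave character analysis with line input.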
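As a rough illustration of the character-by-character idea, here is a minimal sketch that scans a NUL-terminated buffer instead of a file. The helper name count_block_comment_chars is hypothetical, and the sketch assumes the text contains no string literals, character constants or // comments that could hide a comment delimiter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the characters that lie between block-comment
   delimiters in a NUL-terminated buffer. The delimiters are not counted.
   String literals, character constants and line comments are NOT handled. */
static size_t count_block_comment_chars(const char *s)
{
    size_t count = 0;
    int in_comment = 0;

    assert(s != NULL);
    for (size_t i = 0; s[i] != '\0'; i++)
    {
        if (!in_comment)
        {
            if (s[i] == '/' && s[i + 1] == '*')
            {
                in_comment = 1;
                i++;            /* skip the star of the opening delimiter */
            }
        }
        else if (s[i] == '*' && s[i + 1] == '/')
        {
            in_comment = 0;
            i++;                /* skip the slash of the closing delimiter */
        }
        else
            count++;            /* an ordinary character inside the comment */
    }
    return count;
}
```

Called on "int x; /* abc */ int y;" it returns 5 (the three letters plus the two surrounding spaces). An unterminated comment is counted up to the end of the buffer.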

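When you're ready, you can try this torture test on your program. It is what I use to check my comment stripper, SCC (which doesn't handle trigraphs, by conscious decision; if the source contains trigraphs, I have a trigraph remover that I run on the source first).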

/*
@(#)File:            $RCSfile: scc.test,v $
@(#)Version:         $Revision: 1.7 $
@(#)Last changed:    $Date: 2013/09/09 14:06:33 $
@(#)Purpose:         Test file for program SCC
@(#)Author:          J Leffler
*/

/*TABSTOP=4*/

// -- C++ comment

/*
Multiline C-style comment
#ifndef lint
static const char sccs[] = "@(#)$Id: scc.test,v 1.7 2013/09/09 14:06:33 jleffler Exp $";
#endif
*/

/*
Multi-line C-style comment
with embedded /* in line %C% which should generate a warning
if scc is run with the -w option
Two comment starts /* embedded /* in line %C% should generate one warning
*/

/* Comment */ Non-comment /* Comment Again */ Non-Comment Again /*
Comment again on the next line */

// A C++ comment with a C-style comment marker /* in the middle
This is plain text under C++ (C99) commenting - but comment body otherwise
// A C++ comment with a C-style comment end marker */ in the middle

The following C-style comment end marker should generate a warning
if scc is run with the -w option
*/
Two of these */ generate */ one warning

It is possible to have both warnings on a single line.
Eg:
*/ /* /* */ */

SCC has been trained to handle 'q' single quotes in most of
the aberrant forms that can be used.  '\0', '\\', '\'', '\\
n' (a valid variant on '\n'), because the backslash followed
by newline is elided by the token scanning code in CPP before
any other processing occurs.

This is a legitimate equivalent to '\n' too: '\
\n', again because the backslash/newline processing occurs early.

The non-portable 'ab', '/*', '*/', '//' forms are handled OK too.

The following quote should generate a warning from SCC; a
compiler would not accept it.  '
\n'

" */ /* SCC has been trained to know about strings /* */ */"!
"\"Double quotes embedded in strings, \\\" too\'!"
"And \
newlines in them"

"And escaped double quotes at the end of a string\""

aa '\\
n' OK
aa "\""
aa "\
\n"

This is followed by C++/C99 comment number 1.
// C++/C99 comment with \
continuation character \
on three source lines (this should not be seen with the -C flag)
The C++/C99 comment number 1 has finished.

This is followed by C++/C99 comment number 2.
/\
/\
C++/C99 comment (this should not be seen with the -C flag)
The C++/C99 comment number 2 has finished.

This is followed by regular C comment number 1.
/\
*\
Regular
comment
*\
/
The regular C comment number 1 has finished.

/\
\/ This is not a C++/C99 comment!

This is followed by C++/C99 comment number 3.
/\
\
\
/ But this is a C++/C99 comment!
The C++/C99 comment number 3 has finished.

/\
\* This is not a C or C++  comment!

This is followed by regular C comment number 2.
/\
*/ This is a regular C comment *\
but this is just a routine continuation *\
and that was not the end either - but this is *\
\
/
The regular C comment number 2 has finished.

This is followed by regular C comment number 3.
/\
\
\
\
* C comment */
The regular C comment number 3 has finished.

Note that \u1234 and \U0010FFF0 are legitimate Unicode characters
(officially universal character names) that could appear in an
id\u0065ntifier, a '\u0065' character constant, or in a "char\u0061cter\
 string".  Since these are mapped long after comments are eliminated,
they cannot affect the interpretation of /* comments */.  In particular,
none of \u0002A.  \U0000002A, \u002F and \U0000002F ever constitute part
of a comment delimiter ('*' or '/').

More double quoted string stuff:

    if (logtable_out)
    {
    sprintf(logtable_out,
        "insert into %s (bld_id, err_operation, err_expected, err_sql_stmt, err_sql_state)" 
        " values (\"%s\", \"%s\", \"%s\", \"", str_logtable, blade, operation, expected);
    /* watch out for embedded double quotes. */
    }


/* Non-terminated C-style comment at the end of the file

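I think you'd be best off using regular expressions. They look scary, but for things like this they're really not that bad. You can always try playing some regex golf to practice ;-)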

I'd approach it as follows:

  • Build a regular expression that captures comments
  • Scan your file for it
  • Count the characters in the match

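Using some regex code and some tips on matching comments in C, I hacked this together. It should let you count all the bytes that are part of a block-style /* */ comment, delimiters included. I only tested it on OS X. I suppose you can handle the rest?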

#include <regex.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_ERROR_MSG 0x1000

int compile_regex(regex_t *r, char * regex_text)
{
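    /* REG_ENHANCED is an Apple extension; it enables the non-greedy *? used in
       the pattern, and other platforms may not define it */
    int status = regcomp (r, regex_text, REG_EXTENDED|REG_NEWLINE|REG_ENHANCED);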
    if (status != 0) {
        char error_message[MAX_ERROR_MSG];
        regerror (status, r, error_message, MAX_ERROR_MSG);
        printf ("Regex error compiling '%s': %s\n",
            regex_text, error_message);
        return 1;
    }
    return 0;
}
int match_regex(regex_t *r, const char * to_match, long long *nbytes)
{
    /* Pointer to end of previous match */
    const char *p = to_match;
    /* Maximum number of matches */
    size_t n_matches = 10;
    /* Array of matches */
    regmatch_t m[n_matches];

    while(1) {
        int i = 0;
        int nomatch = regexec (r, p, n_matches, m, 0);
        if(nomatch) {
            printf("No more matches.\n");
            return nomatch;
        }
        //Just handle first match (the entire match), don't care
        //about groups
        int start;
        int finish;
        start = m[0].rm_so + (p - to_match);
        finish = m[0].rm_eo + (p - to_match);
        *nbytes += m[0].rm_eo - m[0].rm_so;

        printf("match length(bytes) : %lld\n", (long long)(m[0].rm_eo - m[0].rm_so));
        printf("Match: %.*s\n\n", finish - start, to_match + start);
        p += m[0].rm_eo;
    }
    return 0;
}

int main(int argc, char *argv[])
{
    regex_t r;
    char regex_text[128] = "/\\*(.|[\r\n])*?\\*/";
    long long comment_bytes = 0;

    char *file_contents;
    size_t input_file_size;
    FILE *input_file;
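    if(argc != 2) {
        printf("Usage : %s <filename>\n", argv[0]);
        return 0;
    }
    input_file = fopen(argv[1], "rb");
    if(input_file == NULL) {
        perror(argv[1]);
        return 1;
    }
    fseek(input_file, 0, SEEK_END);
    input_file_size = ftell(input_file);
    rewind(input_file);
    /* One extra byte: regexec() needs a NUL-terminated string */
    file_contents = malloc(input_file_size + 1);
    if(file_contents == NULL) {
        fclose(input_file);
        return 1;
    }
    input_file_size = fread(file_contents, sizeof(char), input_file_size, input_file);
    file_contents[input_file_size] = '\0';
    fclose(input_file);

    if(compile_regex(&r, regex_text) != 0) {
        free(file_contents);
        return 1;
    }
    match_regex(&r, file_contents, &comment_bytes);
    regfree(&r);
    free(file_contents);
    printf("Found %lld bytes in comments\n", comment_bytes);

    return 0;
}
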
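
Because REG_ENHANCED is an Apple extension, the program above may not build on other systems. As a rough sketch, assuming only POSIX <regex.h>, the non-greedy match can be replaced by a pattern that cannot run past the first closing delimiter. The helper name count_block_comment_bytes is hypothetical, and like the program above it knows nothing about string literals or // comments.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: total bytes in block comments, delimiters included.
   Returns -1 if the pattern fails to compile. */
static long count_block_comment_bytes(const char *text)
{
    /* Body: any non-star, or a run of stars followed by something that is
       neither a star nor a slash; then the closing run of stars and a slash. */
    static const char pattern[] = "/\\*([^*]|\\*+[^*/])*\\*+/";
    regex_t re;
    regmatch_t m[1];
    long total = 0;

    assert(text != NULL);
    /* No REG_NEWLINE here, so the bracket expressions may match a newline. */
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    while (regexec(&re, text, 1, m, 0) == 0)
    {
        total += (long)(m[0].rm_eo - m[0].rm_so);
        text += m[0].rm_eo;     /* resume the search after this comment */
    }
    regfree(&re);
    return total;
}
```

For the input "int a; /* one */ int b;" it returns 9, the length of the comment with both delimiters.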
#include <stdio.h>

size_t counter(FILE *fp){
    int ch, chn;
    size_t count = 0;
    enum { none, in_line_comment, in_range_comment, in_string, in_char_constant } status;
#if 0
    in_range_comment : /* this */
    in_line_comment  : //this
    in_string : "this"
    in_char_constant : ' '
#endif

    status = none;
    while(EOF!=(ch=fgetc(fp))){
        switch(status){
        case in_line_comment :
            if(ch == '\n'){
                status = none;
            }
            ++count;
            continue;
        case in_range_comment :
            if(ch == '*'){
                chn = fgetc(fp);
                if(chn == '/'){
                    status  = none;
                    continue;
                }
                ungetc(chn, fp);
            }
            ++count;
            continue;
        case in_string :
            if(ch == '\\'){
                /* skip the escaped character, whatever it is (covers \" and \\) */
                fgetc(fp);
            } else if(ch == '"'){
                status = none;
            }
            continue;
        case in_char_constant :
            if(ch == '\\'){
                /* skip the escaped character, whatever it is (covers \' and \\) */
                fgetc(fp);
            } else if(ch == '\''){
                status = none;
            }
            continue;
        case none :
            switch(ch){
            case '/':
                if('/' == (chn = fgetc(fp))){
                    status = in_line_comment;
                    continue;
                } else if('*' == chn){
                    status = in_range_comment;
                    continue;
                } else
                    ungetc(chn, fp);
                break;
            case '"':
                status = in_string;
                break;
            case '\'':
                status = in_char_constant;
                break;
            }
        }
    }
    return count;
}

int main(void){
    FILE *fp = stdin;
    size_t c = counter(fp);
    printf("%zu\n", c);

    return 0;
}
